Medical imaging systems (e.g., ultrasound imaging systems) are commonly used during medical examinations to detect and diagnose abnormalities associated with anatomical structures (e.g., organs such as the heart). Often, the images are evaluated by a medical expert (e.g., a physician or medical technician) who is trained to recognize characteristics in the images that could indicate either an abnormality associated with the anatomical structure or a healthy anatomical structure.
Because of the advancements in computer technology, most computers can easily process large amounts of data and perform extensive computations that can enhance the quality of the obtained images. Furthermore, image processing can be used as a tool to assist in the analysis of the images. Efficient detection of anatomical structures or objects of interest in an image is an important tool in the further analysis of that structure. Many times abnormalities in the shape of an anatomical structure or changes of such a shape through time (e.g., a beating heart or a breathing lung) indicate a tumor or various diseases (e.g., dilation or ischemia of the heart muscle).
This type of image processing can be used for other applications such as the detection of human faces in an image. Because of the variables associated with different facial features (e.g., hair color and length, eye color, facial shape, etc.), facial detection is not a trivial task. Face detection can be used in a variety of applications such as user recognition, surveillance or security applications.
Various types of approaches have been used to detect objects of interest (e.g., anatomical structures or faces). Component-based object detectors (e.g., eye detectors and mouth detectors) can deal with large variations in pose and illumination, and are more robust under occlusions and heteroscedastic noise. For example, in echocardiogram analysis, the local appearance of the same anatomical structure (e.g., the septum) is similar across patients, while the configuration or shape of the heart can be dramatically different due to, for example, viewing angles or disease conditions. Likewise, in face detection, general spatial relationships between facial features are fairly consistent (e.g., the general location of the eyes relative to the nose and mouth), while the configuration and shape of the various facial features (e.g., shape of eyes, expression of mouth, and relative distances among them) can vary significantly.
For capturing local appearance variations, many solutions rely on a Gaussian assumption. Recently, this assumption has been relaxed through the use of nonlinear learning machines such as Support Vector Machines (SVM) or boosting. Some of the most successful real-time object detection methods are based on a boosted cascade of simple features. By combining the responses of a selected number of simple classifiers through boosting, the resulting strong classifier is able to achieve high detection rates and is capable of processing images in real time. However, the existing methods do not address the detection problem in the presence of occluding objects. The erroneous response of a simple or weak classifier due to occlusion will negatively influence the detection outcome.
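The effect of occlusion on a boosted detector can be illustrated with a minimal sketch. It assumes the strong classifier is a weighted vote of threshold-based weak classifiers; the helper names (`stump`, `strong_score`, `strong_classify`), the feature values, the thresholds and the weights are all hypothetical and chosen only for illustration, not taken from any existing detector.

```python
# Minimal sketch of a boosted strong classifier: a weighted vote of weak
# classifiers (decision stumps). All features, thresholds and weights are
# made-up illustrative values, not taken from any trained detector.

def stump(feature_index, threshold):
    """Weak classifier: +1 if the chosen feature exceeds the threshold, else -1."""
    def classify(features):
        return 1 if features[feature_index] > threshold else -1
    return classify

# (weak classifier, weight) pairs, standing in for what boosting would select.
WEAK_CLASSIFIERS = [
    (stump(0, 0.5), 0.9),
    (stump(1, 0.3), 0.6),
    (stump(2, 0.7), 0.4),
]

def strong_score(features):
    """Weighted sum of the weak classifier responses."""
    return sum(alpha * h(features) for h, alpha in WEAK_CLASSIFIERS)

def strong_classify(features):
    """Object (+1) if the weighted vote is non-negative, background (-1) otherwise."""
    return 1 if strong_score(features) >= 0 else -1

# Hypothetical feature responses for the same object, without and with occlusion.
clean = [0.8, 0.6, 0.9]
occluded = [0.1, 0.6, 0.2]  # features 0 and 2 are corrupted by an occluder

print(strong_classify(clean))     # all three weak classifiers vote +1
print(strong_classify(occluded))  # two corrupted votes outweigh the valid one
```

In this sketch the occluder flips two weak responses, and because their combined weight exceeds that of the remaining valid response, the object is rejected even though the unoccluded feature still supports detection.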
For most visual tracking applications, measurement data are uncertain and sometimes missing: images are taken with noise and distortion, while occlusions can render part of the object of interest unobservable. Uncertainty can be globally uniform, but in most real-world scenarios it is heteroscedastic in nature, i.e., both anisotropic and inhomogeneous. A good example is the echocardiogram (ultrasound heart data). Ultrasound is prone to reflection artifacts, e.g., from specular reflectors such as membranes. Because of the single “view direction”, the perpendicular surface of a specular structure produces strong echoes, but tilted or “off-axis” surfaces may produce weak echoes, or no echoes at all (acoustic “drop out”). For an echocardiogram, the drop-out can occur at the area of the heart where the tissue surface is parallel to the ultrasound beam.
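One way to picture heteroscedastic uncertainty is to attach a separate 2x2 covariance matrix to each measured point. The sketch below assumes this representation and uses the squared Mahalanobis distance to show how the same measurement error is weighted differently by location and by direction. The function name `mahalanobis_sq` and the covariance values are illustrative assumptions.

```python
# Sketch of heteroscedastic measurement uncertainty: each measured point has
# its own 2x2 covariance (inhomogeneous), and a covariance may differ by
# direction (anisotropic). All numbers are illustrative assumptions.

def mahalanobis_sq(residual, cov):
    """Squared Mahalanobis distance of a 2D residual under a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of a 2x2 matrix.
    inv = ((d / det, -b / det), (-c / det, a / det))
    rx, ry = residual
    return (rx * (inv[0][0] * rx + inv[0][1] * ry)
            + ry * (inv[1][0] * rx + inv[1][1] * ry))

# Hypothetical point with a clear echo: equal uncertainty in x and y.
COV_CLEAR = ((1.0, 0.0), (0.0, 1.0))
# Hypothetical point near a drop-out: well localized in x, poorly in y.
COV_DROPOUT = ((0.25, 0.0), (0.0, 4.0))

error_in_y = (0.0, 1.0)
error_in_x = (1.0, 0.0)

print(mahalanobis_sq(error_in_y, COV_CLEAR))    # 1.0
print(mahalanobis_sq(error_in_y, COV_DROPOUT))  # 0.25
print(mahalanobis_sq(error_in_x, COV_DROPOUT))  # 4.0
```

A unit error in y counts for less at the drop-out point than at the clear point (inhomogeneous), and at the drop-out point an error in x counts for more than the same error in y (anisotropic).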
Because of their availability, relatively low cost, and noninvasiveness, cardiac ultrasound images are widely used for assessing cardiac function. In particular, the analysis of ventricle motion is an efficient way to evaluate the degree of ischemia and infarction. Segmentation or detection of the endocardial wall is the first step towards quantification of the elasticity and contractility of the left ventricle. Examples of existing methods include pixel-based segmentation/clustering approaches (e.g., Color Kinesis), variants of optical flow, deformable templates and Markov random processes/fields, and active contours/snakes. Some methods are employed in 2-dimensional, 3-dimensional or 4-dimensional (3D+time) space.
However, most existing segmentation or detection methods do not attempt to recover accurate regional motions of the endocardial wall, and in most cases, motion components along the wall are ignored. This simplified treatment is also employed by contour trackers that search only along the normals of the current contour. This is not suitable for regional wall abnormality detection, because regional motion of an abnormal left ventricle is likely to be off the normal of the contour. In addition, global motion, such as translation or rotation (due to the sonographer's hand motion or respiratory motion of the patient), causes off-normal local motion on the contour as well. It is desirable to track the global shape of the endocardial wall as well as its local motion, for the detection of regional wall motion abnormalities. This information can be used for further diagnosis of ischemia and infarction. There is a need for a detection framework that matches anatomical structures using appearance and shape.
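The limitation of normal-only search can be made concrete by splitting the motion of a contour point into a component along the contour normal and a component along the wall (the tangent). The following is a minimal sketch under that assumption; the function name `decompose`, the normal and the motion vector are hypothetical values chosen for illustration.

```python
import math

def decompose(motion, normal):
    """Split a 2D motion vector into components along the contour normal
    and along the contour tangent (i.e., along the wall)."""
    nx, ny = normal
    length = math.hypot(nx, ny)
    if length == 0:
        raise ValueError("normal must be a non-zero vector")
    nx, ny = nx / length, ny / length
    tx, ty = -ny, nx  # unit tangent: the normal rotated by 90 degrees
    mx, my = motion
    along_normal = mx * nx + my * ny
    along_tangent = mx * tx + my * ty
    return along_normal, along_tangent

# Hypothetical contour point whose normal points along +x, and a true motion
# that has a larger component along the wall than across it.
normal = (1.0, 0.0)
true_motion = (3.0, 4.0)

normal_part, tangent_part = decompose(true_motion, normal)
print(normal_part)   # the part a normal-only search can observe
print(tangent_part)  # the part along the wall that such a search ignores
```

Here a tracker restricted to the normal direction would report a displacement of 3 units, while the point actually moved 5 units, most of it along the wall.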