Segmentation of anatomical structures has traditionally been formulated as a perceptual grouping task and solved through clustering and variational approaches. However, such strategies require a priori knowledge to be explicitly defined in the optimization criterion (e.g., “high-gradient border”, “smoothness” or “similar intensity or texture”). These approaches are limited by the validity of the underlying assumptions and cannot capture the complex appearance of such structures.
Accurate localization of complex structures is important in many computer vision applications ranging from facial feature detection to segmentation of anatomical structures in medical images or volumes. The availability of large databases with expert annotation of the structures of interest makes a learning approach more attractive than classical approaches that solve perceptual grouping tasks through clustering or variational formulations. This is especially important when the underlying image structure does not have a clear border definition, shows complex appearance with large amounts of noise, or when there is relatively large variation among the experts' own annotations.
The difficulty of the segmentation task is illustrated in FIG. 1, which shows ultrasound images of the heart in which the left ventricle border, or endocardium, is to be delineated. Automated segmentation of echocardiographic images has proved challenging due to large amounts of noise, signal drop-out and large variations in the appearance, configuration and shape of the left ventricle. As can also be seen from these images, the shape and appearance vary from image to image.
Segmentation is one of the most important low-level image processing methods and has traditionally been approached as a grouping task based on some homogeneity assumption. For example, clustering methods have been used to group regions based on color similarity, and graph partitioning methods have been used to infer global regions with coherent brightness, color and texture. Alternatively, the segmentation problem can be cast in an optimization framework as the minimization of some energy function. Concepts such as “high-gradient border”, “smoothness” or “similar intensity or texture” are encoded as region or boundary functionals in the energy function, which is then minimized through variational approaches.
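By way of illustration only, the clustering style of grouping described above can be sketched as a one-dimensional k-means over pixel intensities. This is a minimal sketch and not any particular prior-art method; the function name `kmeans_intensity_segmentation`, its parameters and the synthetic test image are all assumptions introduced for this example.

```python
import random


def kmeans_intensity_segmentation(image, k=2, n_iter=20):
    """Group pixels by intensity similarity using 1-D k-means (hypothetical helper).

    `image` is a list of rows of float intensities; `k` must be at least 2.
    Returns a label image of the same shape and the final cluster centers.
    """
    pixels = [v for row in image for v in row]
    lo, hi = min(pixels), max(pixels)
    # Deterministic initialization: centers spread evenly over the intensity range.
    centers = [lo + (hi - lo) * j / (k - 1) for j in range(k)]

    def nearest(v):
        # Index of the cluster center closest in intensity to v.
        return min(range(k), key=lambda j: abs(v - centers[j]))

    for _ in range(n_iter):
        # Assignment step: each pixel joins the cluster with the closest center.
        labels = [nearest(v) for v in pixels]
        # Update step: each center moves to the mean intensity of its members.
        for j in range(k):
            members = [v for v, lab in zip(pixels, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)

    # Final labeling with the converged centers, reshaped to the image layout.
    width = len(image[0])
    flat = [nearest(v) for v in pixels]
    label_image = [flat[r * width:(r + 1) * width] for r in range(len(image))]
    return label_image, centers


# Synthetic example (an assumption): a bright square on a dark, mildly noisy background.
random.seed(0)
image = [
    [(0.8 if 8 <= r < 24 and 8 <= c < 24 else 0.2) + random.gauss(0.0, 0.05)
     for c in range(32)]
    for r in range(32)
]
label_image, centers = kmeans_intensity_segmentation(image, k=2)
```

The sketch separates the two regions only because the synthetic image satisfies the homogeneity assumption, with each region having a nearly constant intensity. With heavy noise or signal drop-out, intensity alone would no longer separate the structure from its surroundings.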
However, as the complexity of the targeted segmentation increases, it becomes more difficult to encode prior knowledge into the grouping task. Learning has therefore become more important for segmentation, and there are methods that infer rules for the grouping process conditioned on user input.
In a different known approach, active appearance models use registration to infer the shape associated with the current image. However, the modeling assumes a Gaussian distribution of the joint shape-texture space and requires initialization close to the final solution. Alternatively, characteristic points can be detected in the input image by learning a classifier through boosting. There is a need for a method which directly exploits expert annotation of the structure of interest in large databases by formulating the segmentation as a learning problem.