Cardiac computed tomography (CT) is an important imaging modality for diagnosing cardiovascular disease: it provides detailed anatomic information about the cardiac chambers, great vessels, and coronary arteries. Multi-chamber heart segmentation is a prerequisite for global quantification of cardiac function. The complexity of cardiac anatomy, poor contrast, noise, and motion artifacts make this segmentation a challenging task. Most known approaches focus only on left ventricle segmentation. Complete segmentation of all four heart chambers can help to diagnose diseases in the other chambers, e.g., left atrial fibrillation or right ventricular overload, and to perform dyssynchrony analysis.
A non-rigid object segmentation problem comprises two tasks: object localization and boundary delineation. Most known approaches focus on boundary delineation, based on active shape models, active appearance models, and deformable models. These techniques have inherent limitations: 1) most are semiautomatic, requiring manual labeling of a rough position and pose of the heart chambers; 2) they are likely to get stuck on strong local image evidence (local minima). Other known techniques are straightforward extensions of two-dimensional (2D) image segmentation to three-dimensional (3D) image segmentation: segmentation is performed on each 2D slice and the results are combined to obtain the final 3D segmentation. However, such techniques cannot fully exploit the benefits of 3D imaging in a natural way.
Object localization is required for an automatic segmentation system, and discriminative learning approaches have proved efficient and robust for solving 2D problems. In these methods, shape detection or localization is formulated as a classification problem: does an image block contain the target shape or not? To build a robust system, the classifier need only tolerate limited variation in object pose. The object is found by scanning the classifier over an exhaustive range of possible locations, orientations, scales, and other parameters in an image. This searching strategy differs from other parameter estimation approaches, such as deformable models, in which an initial estimate is adjusted (e.g., using gradient descent) to optimize a predefined objective function.
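To make the scanning strategy concrete, the loop below sketches an exhaustive pose search in Python. The `classifier` callable and the hypothesis grids (`positions`, `orientations`, `scales`) are hypothetical placeholders for illustration, not the actual detector described here.

```python
import itertools

def exhaustive_search(volume, classifier, positions, orientations, scales):
    """Scan a classifier over every pose hypothesis and keep the best.

    `classifier(volume, pose)` is assumed to return a detection score
    for the image block at pose = (position, orientation, scale).
    """
    best_pose, best_score = None, float("-inf")
    # Enumerate the full Cartesian product of hypothesis values.
    for pose in itertools.product(positions, orientations, scales):
        score = classifier(volume, pose)
        if score > best_score:
            best_pose, best_score = pose, score
    return best_pose, best_score
```

Because every hypothesis is scored independently, the winning pose is a global maximum over the tested grid, which is why this strategy does not get trapped in local optima the way gradient-based refinement of a single initial estimate can.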
Exhaustive searching makes the system robust against local minima; however, there are two challenges in extending learning-based approaches to 3D. First, the number of hypotheses increases exponentially with the dimensionality of the parameter space. For example, there are nine degrees of freedom for the anisotropic similarity transformation: three translation parameters, three rotation angles, and three scales. If n discrete values are searched in each dimension, the number of tested hypotheses is n^9 (even for a very coarse estimation with n=5, n^9=1,953,125). Such computational demands are beyond the capabilities of current desktop computers. Due to this limitation, previous approaches often constrain the search to a lower-dimensional space; for example, only the position and an isotropic scale (four dimensions) are searched in the generalized Hough transform based approach. The second challenge is that efficient features are needed to search the orientation and scale spaces. Haar wavelet features can be computed efficiently for translation and scale transformations; however, when searching over rotations, one has either to rotate the feature templates or to rotate the volume, which is very time consuming. The efficiency of image feature computation becomes even more important when combined with a very large number of tested hypotheses. There is a need for a shape detection approach in high-dimensional images that is efficient and less computationally intensive.
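The exponential growth of the hypothesis count can be verified with a one-line computation; the function name below is illustrative only.

```python
def hypothesis_count(n, dims):
    """Number of pose hypotheses when n discrete values are searched
    in each of `dims` parameter dimensions."""
    return n ** dims

# Full 9-D anisotropic similarity search
# (3 translations + 3 rotations + 3 scales):
full = hypothesis_count(5, 9)     # 1,953,125 hypotheses even for a coarse grid
# Restricting the search to position plus an isotropic scale (4-D),
# as in the generalized Hough transform based approach:
reduced = hypothesis_count(5, 4)  # 625 hypotheses
```

The contrast (roughly a factor of 3,000 between the two grids at n=5) shows why earlier methods restricted the search space, at the cost of ignoring orientation and anisotropic scaling.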