The present invention is discussed in the following largely with reference to the medical industry, but the present invention is applicable to a variety of contexts and environments, including those that may utilize multidimensional data, for example, radar, sonar, lidar, X-ray, ultrasound, optical imaging, seismology data, ionospheric tomography, and many others.
Diagnostic imaging has influenced many aspects of modern medicine. The availability of volumetric images from X-ray computed tomography (CT), magnetic resonance (MR), 3-D ultrasound, positron emission tomography (PET), and many other imaging modalities has permitted a new level of understanding of biology, physiology, and anatomy to be reached, and has facilitated studies of complex disease processes. While the ability to acquire new and more sophisticated medical images has developed very rapidly in the past 20 years, the analysis of medical imagery is still performed visually and in a qualitative manner. A practicing physician, such as a radiologist, seeking to quantitatively analyze volumetric information (for example, to determine a tumor size, quantify its volumetric change from the previous imaging session, measure plaque volume, or objectively assess airway reactivity to allergens throughout the lungs) would largely be able only to manually outline regions of interest in a number of two-dimensional images. Although some single-purpose tools have been developed for the quantitative analysis of medical images, their usefulness is limited. The use of these tools is often tedious and time consuming and requires an excessive amount of user interaction. They frequently fail in the presence of disease. Therefore, they are typically unsuitable for routine employment in the clinical care setting. Since volumetric and/or dynamic medical images frequently cannot be reliably analyzed quantitatively using today's tools, a significant and possibly critical portion of the image information cannot be used for clinical diagnostic purposes.
3-D Medical Image Segmentation
It is standard practice to analyze 3-D medical images as sequences of the 2-D image slices that form the 3-D data. There are many essential problems associated with this approach. The most fundamental ones stem from the lack of contextual slice-to-slice information when analyzing sequences of adjacent 2-D images. Performing the segmentation directly in the 3-D space tends to bring more consistent segmentation results, yielding object surfaces instead of sets of individual contours. 3-D image segmentation techniques (for example, techniques known by the terms region growing, level sets, fuzzy connectivity, snakes, balloons, active shape and active appearance models) are known. None of them, however, offers a segmentation solution that achieves optimal results. Optimal segmentation of an organ or a region of pathology, for example, is critically important in medical image segmentation.
Reliable tools for automated image segmentation are a necessary prerequisite to quantitative medical image analyses. Medical image segmentation is frequently divided into two stages: localization of the object of interest (to distinguish the object from other objects in the image), and accurate delineation of the object's borders or surfaces. It is almost universally recognized that segmentation methods that perform well on the task of robust object localization usually do not perform well with respect to accurate border/surface delineation, and vice versa.
One embodiment of the present invention is directed to new optimal image segmentation methods that allow the identification of single surfaces as well as simultaneously allowing the identification of multiple interacting surfaces in 3-D and 4-D medical image datasets. Embodiments in accordance with the present invention using the graph-based approaches to segmentation provide the ability to determine the object boundaries in an optimal fashion, i.e., optimal with respect to a task-specific objective function. Consequently, robust object localization techniques may be used to identify the object of interest in the image data followed by graph construction and optimal graph-search segmentation. The proposed n-D graph search segmentation methods are preferable for the stage-2 task of such a two-stage process.
Segmentation Optimality in 3-D and 4-D
While many of the image segmentation methods may be able to provide a locally optimal solution, they may not be able to provide a globally optimal solution. Such methods either cannot address the globally optimal criterion at all, or compute only an approximate solution that could be arbitrarily far away from the global optimum.
Furthermore, most of the known 3-D segmentation techniques used today are region based—examples include region growing, fuzzy connectivity, and watershed techniques, as would be understood by those with skill in the art. These techniques are frequently iterative and their operation is based on a sequence of locally optimal steps, with no guarantee of achieving global optimality once they converge to a solution. Results of region-based methods are frequently locally incorrect, and the performance of these methods often suffers from the problem of “leaking” into surrounding regions.
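The "leaking" behavior described above can be illustrated with a minimal sketch of seeded region growing. The function name `region_grow`, the 4-connected neighborhood, and the fixed intensity tolerance `tol` are assumptions chosen for illustration; they do not correspond to any particular published method.

```python
from collections import deque

def region_grow(image, seed, tol):
    """Grow a 4-connected region from `seed`, accepting pixels whose
    intensity is within `tol` of the seed intensity (illustrative rule)."""
    rows, cols = len(image), len(image[0])
    seed_val = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region \
                    and abs(image[nr][nc] - seed_val) <= tol:
                # Each acceptance is a purely local decision.
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

# A dark wall (0) separates two bright areas (100).
closed = [[100, 100, 0, 100, 100],
          [100, 100, 0, 100, 100],
          [100, 100, 0, 100, 100]]
# The same image with a one-pixel gap in the wall.
leaky = [row[:] for row in closed]
leaky[1][2] = 100

print(len(region_grow(closed, (0, 0), 10)))
print(len(region_grow(leaky, (0, 0), 10)))
```

In this toy image a single missing wall pixel is enough for the region to spread into the neighboring area, because each growth step considers only local intensity similarity.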
The second family of 3-D image segmentation techniques consists of edge-based (boundary-based) methods. Examples include active shape models, snakes and their 3-D extensions, and level sets. All these methods converge to some local optimum, and a globally optimal solution cannot be guaranteed. As a result, the use of known segmentation methods largely cannot be consistently automated, in that they require substantial human supervision and interaction.
While combinations of edge-based and region-based approaches, such as Active Appearance Models (“AAM”), are known and are quite powerful, the optimization process, as with other approaches, frequently ends in a local optimum. Additionally, the AAM approach requires that point correspondence be established among the individual instances of the object samples used for training.
Objective Functions
In many segmentation methods, the segmentation behavior is controlled by the objective function that is employed. The goal of the segmentation process is typically to optimize (that is, to minimize) the objective function. The objective functions are almost always task specific. Incorporating a priori knowledge reflecting the segmentation goal is the norm. In many cases, the objective function is specified by the human designer. Methods for automated design of objective functions are beginning to appear. In the latter case, the form of the objective function is decided upfront and the objective function parameters are set via machine learning processes. While an infinite number of task-specific objective functions can be designed, there is a small number of objective (cost) function forms that are considered sufficiently general so that task-specific cost functions can be derived from them by parameter setup.
An objective function that follows what is known as the Gibbs model mainly reflects the object edge properties. It is frequently used in deformable models and graph searching methods. Its terms reflect image data properties such as gray level, local texture, and edge information (sometimes called external energy), as well as the resulting border/surface shape or smoothness requirements and hard constraints (internal energy, constraints). A region-based objective function has been proposed by Chan and Vese that is based on region statistics and can yield segmentation in cases when no edges are present on object boundaries. Their objective function is a piecewise constant generalization of the Mumford-Shah functional. A different approach was proposed by Yezzi et al. Their binary model is designed to segment images consisting of two distinct but constant intensity regions and thus attempts to maximize the distance between the average gray levels of the objects and the background. A binary variance model based on image variances has also been proposed, and the extension of these objective functions to 3-D has been presented.
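The two region-based objective functions described above can be sketched for a binary partition of a flat list of pixel intensities. This is a simplified illustration under stated assumptions: the boundary-length regularization term is omitted, the binary mean energy uses the commonly cited form -(1/2)(u - v)^2, and the function names are hypothetical.

```python
def region_means(pixels, mask):
    """Average gray level inside (mask == 1) and outside (mask == 0)."""
    inside = [p for p, m in zip(pixels, mask) if m]
    outside = [p for p, m in zip(pixels, mask) if not m]
    return sum(inside) / len(inside), sum(outside) / len(outside)

def piecewise_constant_energy(pixels, mask):
    """Data term of a piecewise constant model: squared deviation of each
    pixel from the mean of its own region (length term omitted)."""
    c_in, c_out = region_means(pixels, mask)
    return sum((p - (c_in if m else c_out)) ** 2
               for p, m in zip(pixels, mask))

def binary_mean_energy(pixels, mask):
    """Binary mean model (assumed form): minimizing this energy maximizes
    the distance between the object and background average gray levels."""
    c_in, c_out = region_means(pixels, mask)
    return -0.5 * (c_in - c_out) ** 2

pixels = [1, 1, 9, 9]
good = [0, 0, 1, 1]   # partition aligned with the intensity change
bad = [0, 1, 0, 1]    # partition mixing both intensities

print(piecewise_constant_energy(pixels, good), piecewise_constant_energy(pixels, bad))
print(binary_mean_energy(pixels, good), binary_mean_energy(pixels, bad))
```

Both energies are lower for the partition that follows the intensity change, even though neither uses any edge information.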
Previous Graph-Based Approaches to Image Segmentation
Graph-based approaches have played an important role in image segmentation in recent years. The common theme of these approaches is the formation of a weighted graph in which each vertex is associated with an image pixel and each graph edge has a weight reflecting the likelihood that the two pixels it connects belong to the same object. The resulting graph is partitioned into components in a way that optimizes some specified criteria of the segmentation.
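As a minimal sketch of the graph formation just described, the following builds a 4-connected pixel graph for a small 2-D image. The Gaussian similarity weight and the parameter `sigma` are assumptions chosen for illustration; actual methods define edge weights in many different ways.

```python
import math

def build_pixel_graph(image, sigma=10.0):
    """Return {(pixel_a, pixel_b): weight} for 4-connected neighbors.
    A weight near 1 means the two pixels likely belong to the same object."""
    rows, cols = len(image), len(image[0])
    edges = {}
    for r in range(rows):
        for c in range(cols):
            # Link each pixel to its right and lower neighbor only,
            # so every undirected edge is stored once.
            for nr, nc in ((r, c + 1), (r + 1, c)):
                if nr < rows and nc < cols:
                    diff = image[r][c] - image[nr][nc]
                    edges[((r, c), (nr, nc))] = math.exp(
                        -diff * diff / (2 * sigma * sigma))
    return edges

image = [[10, 10, 200],
         [10, 10, 200]]
graph = build_pixel_graph(image)
print(len(graph))                      # number of neighbor links
print(graph[((0, 0), (0, 1))])         # similar pixels
print(graph[((0, 1), (0, 2))])         # pixels across an intensity edge
```

A partitioning criterion would then operate on this edge dictionary, for example by removing low-weight edges or by merging vertices joined by high-weight edges.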
In a first class of graph-based methods, a Minimum Spanning Tree (“MST”) of the associated graph is used. Recently, Felzenszwalb and Huttenlocher developed an MST-based technique that adaptively adjusts the segmentation criterion based on the degree of variability in the neighboring regions of the image. Their method attains certain global properties, while making local decisions using the minimum weight edge between two regions in order to measure the difference between them. This approach may be made more robust to outliers by using a quantile rather than the minimum edge weight. This modification, however, makes the segmentation problem Non-deterministic Polynomial-time hard (NP-hard).
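A simplified sketch in the spirit of the MST-based criterion described above follows. Edges are processed in order of increasing weight, as in Kruskal's algorithm, and two regions are merged only when the connecting edge is small relative to the variability already present inside each region. The scale parameter `k`, the absolute-difference edge weight, and the function name are assumptions; the quantile-based variant is not shown.

```python
def segment_mst(image, k=5.0):
    """Label pixels of a small 2-D image by adaptive MST-style merging."""
    rows, cols = len(image), len(image[0])
    edges = []
    for r in range(rows):
        for c in range(cols):
            p = r * cols + c
            if c + 1 < cols:
                edges.append((abs(image[r][c] - image[r][c + 1]), p, p + 1))
            if r + 1 < rows:
                edges.append((abs(image[r][c] - image[r + 1][c]), p, p + cols))
    edges.sort()  # increasing weight

    parent = list(range(rows * cols))
    size = [1] * (rows * cols)
    internal = [0.0] * (rows * cols)  # largest MST edge inside each region

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for w, p, q in edges:
        a, b = find(p), find(q)
        if a == b:
            continue
        # Merge only if the linking edge is no larger than the internal
        # variability of both regions plus a size-dependent allowance.
        if w <= min(internal[a] + k / size[a], internal[b] + k / size[b]):
            parent[b] = a
            size[a] += size[b]
            internal[a] = w  # edges are sorted, so w is the new maximum
    return [[find(r * cols + c) for c in range(cols)] for r in range(rows)]

image = [[10, 11, 50, 51],
         [10, 12, 52, 50]]
labels = segment_mst(image)
print(labels)
```

On this toy image the small intensity fluctuations within each half are merged, while the large jump between the two halves is never accepted as a merge.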
Many 2-D medical image segmentation methods are based on graph searching or use dynamic programming to determine an optimal path through a 2-D graph. Attempts to extend these methods to 3-D and to make 3-D graph searching practical in medical imaging are known. An approach using standard graph searching principles has been applied to a transformed graph in which standard graph searching for a path was used to define a surface. While the method provided surface optimality, it was at the cost of enormous computational requirements. A heuristic sub-optimal approach to surface detection that was computationally feasible was also developed.
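The 2-D case mentioned above can be sketched with dynamic programming over a cost matrix in which each column must contribute exactly one border pixel. The smoothness limit `max_step` and the function name are assumptions made for illustration.

```python
def optimal_border(cost, max_step=1):
    """Find the minimum-cost left-to-right path through `cost`, one row
    index per column, with row changes limited to `max_step` per column."""
    rows, cols = len(cost), len(cost[0])
    total = [[float("inf")] * cols for _ in range(rows)]
    back = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        total[r][0] = cost[r][0]
    for c in range(1, cols):
        for r in range(rows):
            lo, hi = max(0, r - max_step), min(rows - 1, r + max_step)
            # Best predecessor in the previous column within the step limit.
            best = min(range(lo, hi + 1), key=lambda k: total[k][c - 1])
            total[r][c] = total[best][c - 1] + cost[r][c]
            back[r][c] = best
    # Backtrack from the cheapest end point in the last column.
    r = min(range(rows), key=lambda k: total[k][cols - 1])
    best_cost = total[r][cols - 1]
    path = [r]
    for c in range(cols - 1, 0, -1):
        r = back[r][c]
        path.append(r)
    path.reverse()
    return path, best_cost

cost = [[9, 9, 9, 1],
        [1, 9, 1, 9],
        [9, 1, 9, 9]]
print(optimal_border(cost))
```

The path returned is globally optimal for this cost function, which is what makes the approach attractive in 2-D. Carrying the same idea to surfaces in 3-D is where the computational difficulties noted above arise.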
A third class of graph-based segmentation methods is known. It employs minimum graph cut techniques, in which the cut criterion is designed to minimize the similarity between pixels that are to be partitioned. Wu and Leahy were the first to introduce such a cut criterion, but the approach was biased towards finding small components. The bias was addressed later by ratio regions, minimum ratio cycles, and ratio cuts. However, all these techniques are applicable only to 2-D settings. Shi and Malik developed a novel normalized cut criterion for image segmentation, which takes into account the self-similarity of the regions and captures non-local properties of the image. Recently, Weiss showed that Shi and Malik's eigenvector-based approximation is related to the more standard spectral partitioning methods on graphs. However, all such approaches are computationally too expensive for many practical applications. Ishikawa and Geiger formulated an image segmentation problem as a class of Markov Random Field (“MRF”) models. Yet, this method applies only if the pixel labels are one-dimensional and their energies are not discontinuity preserving.
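The bias of a plain minimum cut towards small components, and the way a normalized criterion counteracts it, can be illustrated on a toy graph. The sketch below uses the usual definition of the normalized cut, cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V); the graph, its weights, and the function names are assumptions chosen for illustration.

```python
def cut_value(weights, part):
    """Total weight of edges with exactly one endpoint in `part`."""
    return sum(w for (u, v), w in weights.items()
               if (u in part) != (v in part))

def volume(weights, part):
    """assoc(part, V): sum of weighted degrees of the nodes in `part`."""
    return sum(w * ((u in part) + (v in part))
               for (u, v), w in weights.items())

def normalized_cut(weights, part, nodes):
    rest = nodes - part
    c = cut_value(weights, part)
    return c / volume(weights, part) + c / volume(weights, rest)

# Two tightly connected triangles joined by one bridge, plus a weakly
# attached single node 6.
weights = {(0, 1): 5, (0, 2): 5, (1, 2): 5,
           (3, 4): 5, (3, 5): 5, (4, 5): 5,
           (2, 3): 3, (5, 6): 2}
nodes = set(range(7))

lone, half = {6}, {0, 1, 2}
print(cut_value(weights, lone), cut_value(weights, half))
print(normalized_cut(weights, lone, nodes), normalized_cut(weights, half, nodes))
```

The plain cut is cheapest when it isolates the single weakly attached node, whereas the normalized criterion prefers the split between the two triangles. Finding the best normalized cut over all partitions is the expensive part, which this sketch does not attempt.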
An energy minimization framework using minimum s-t cuts was established by Boykov et al. and Kolmogorov et al. They considered non-convex smooth priors and developed efficient heuristic algorithms for minimizing the energy functions. Several medical image segmentation techniques based on this framework were developed by Boykov et al. and Kim et al. The cost function employed in their work follows the “Gibbs model” given by: ε(ƒ) = ε_data(ƒ) + ε_smooth(ƒ). For certain forms of smooth priors, Kolmogorov et al. applied minimum s-t cuts to minimize ε(ƒ).
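For the simplest binary case, the relationship between a Gibbs-model energy and a minimum s-t cut can be sketched on a 1-D chain of pixels. The sketch assumes two labels, a data term given as a per-pixel cost table, and a smoothness term that charges a constant `lam` whenever adjacent labels differ. The graph construction is the standard one for this special case, and the small augmenting-path max-flow routine is a textbook one rather than any of the cited authors' algorithms.

```python
from collections import deque

def gibbs_energy(labels, data_cost, lam):
    """Data term plus smoothness term for a binary labeling of a chain."""
    data = sum(data_cost[p][l] for p, l in enumerate(labels))
    smooth = lam * sum(labels[p] != labels[p + 1]
                       for p in range(len(labels) - 1))
    return data + smooth

def min_cut_labels(data_cost, lam):
    """Minimize gibbs_energy exactly via max-flow / min-cut."""
    n = len(data_cost)
    s, t = n, n + 1
    cap = [[0] * (n + 2) for _ in range(n + 2)]
    for p in range(n):
        cap[s][p] = data_cost[p][1]  # cut when p gets label 1 (sink side)
        cap[p][t] = data_cost[p][0]  # cut when p gets label 0 (source side)
    for p in range(n - 1):
        cap[p][p + 1] = cap[p + 1][p] = lam  # cut when neighbors differ
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = [-1] * (n + 2)
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n + 2):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            break  # no path left: `parent` now marks the source side
        bottleneck, v = float("inf"), t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            cap[parent[v]][v] -= bottleneck
            cap[v][parent[v]] += bottleneck
            v = parent[v]
    return [0 if parent[p] != -1 else 1 for p in range(n)]

data_cost = [(1, 6), (2, 5), (4, 4), (6, 1), (5, 2)]  # (cost of 0, cost of 1)
labels = min_cut_labels(data_cost, lam=3)
print(labels, gibbs_energy(labels, data_cost, 3))
```

The value of the minimum cut equals the minimum of the energy, so the labeling read off the cut is globally optimal for this restricted form. With more labels or non-convex priors this exact correspondence no longer holds in general, which is why heuristic algorithms are used in those settings.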
Recently, Boykov developed an interactive segmentation algorithm for n-D images based on minimum s-t cuts, which was subsequently improved. The cost function used is general enough to include both the region and boundary properties of the objects. While the approach by Boykov is flexible and shares some similarities with the level set methods, it requires the selection of object and background seed points, which is difficult to achieve in many applications. Additionally, because the approach does not take advantage of prior shape knowledge of the objects to be segmented, the results are topology-unconstrained and may be sensitive to the initial seed point selections.
Segmentation of Mutually Interacting Surfaces
In medical imaging, many surfaces that need to be identified interact with one another. These surfaces are “coupled” in that their topology and relative positions are usually known in advance (at least in a general sense), and the distances between them are within some specific range. Incorporating these surface interrelations into the segmentation can further improve accuracy and robustness, especially when insufficient image-derived information is available for defining some object boundaries or surfaces. Such insufficiency can be remedied by using clues from other related boundaries or surfaces. Simultaneous optimal detection of multiple coupled surfaces thus yields superior results compared to the traditional single-surface detection approaches. Simultaneous segmentation of coupled surfaces in volumetric medical images is an underexplored topic, especially when more than two surfaces are involved.
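The benefit of coupling can be illustrated in a reduced 2-D setting, with two borders detected jointly under a separation constraint. The sketch below is a plain dynamic program over pairs of row positions. It is an assumption-laden toy and not the graph-search method of the present invention; the names `coupled_borders`, `d_min`, `d_max`, and `max_step` are hypothetical.

```python
def coupled_borders(cost_a, cost_b, d_min, d_max, max_step=1):
    """Jointly find two left-to-right borders, one row per column each,
    such that d_min <= row_b - row_a <= d_max in every column."""
    rows, cols = len(cost_a), len(cost_a[0])
    states = [(a, b) for a in range(rows) for b in range(rows)
              if d_min <= b - a <= d_max]
    total = {s: cost_a[s[0]][0] + cost_b[s[1]][0] for s in states}
    history = []
    for c in range(1, cols):
        new_total, back = {}, {}
        for a, b in states:
            # Predecessors must respect the smoothness limit on both borders.
            preds = [(pa, pb) for pa, pb in states
                     if abs(pa - a) <= max_step and abs(pb - b) <= max_step]
            best = min(preds, key=total.__getitem__)
            new_total[(a, b)] = total[best] + cost_a[a][c] + cost_b[b][c]
            back[(a, b)] = best
        history.append(back)
        total = new_total
    state = min(total, key=total.__getitem__)
    best_cost = total[state]
    path = [state]
    for back in reversed(history):
        state = back[state]
        path.append(state)
    path.reverse()
    return [a for a, _ in path], [b for _, b in path], best_cost

# Border A has a strong response at row 1. Border B has only a weak
# response at row 3 and is distracted by A's strong response at row 1.
cost_a = [[5] * 3, [0] * 3, [5] * 3, [5] * 3, [5] * 3]
cost_b = [[5] * 3, [0] * 3, [5] * 3, [1] * 3, [5] * 3]
print(coupled_borders(cost_a, cost_b, d_min=1, d_max=3))
```

Detected on its own, border B would follow the strong response at row 1. With the separation constraint, the joint optimum places B at its weak but correct response, which is the kind of clue sharing between related surfaces described above.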
Several methods for detecting coupled surfaces have been proposed in recent years. None of them, however, guarantees a globally optimal solution. The Active Shape Model (“ASM”) and Active Appearance Models (“AAM”) implicitly take into account the geometric relations between surfaces due to the statistical shape constraints. However, the iterative gradient descent methods frequently used with them may end at a local optimum. The method is essentially 2-D and needs a precise manual initialization. Other methods are based on coupled parametric deformable models with self-intersection avoidance, which requires a complex objective function and is computationally expensive. Still other methods utilize level-set formulations that can take advantage of efficient time-implicit numerical schemes. They are, unfortunately, not topology-preserving. Further, the local boundary-based formulation can be trapped in a local minimum that is arbitrarily far away from the global optimum. While the introduction of a weighted balloon-force term may alleviate this difficulty, it exposes the model to a “leaking” problem. Finally, the feasibility of extending these methods to handle more than two surfaces is unverified.