As is known in the art, digital image processing is becoming increasingly popular as digital imaging devices continue to become more powerful. For example, digital cameras can generate pictures having 10 million pixels, and Computed Tomography (CT) scanners may produce volume data having more than 100 million voxels. Processing these images places a large computational burden on the various devices that perform image processing.
One type of processing that is often performed on image data is segmentation, whereby a boundary is determined between different portions of the image. For example, in digital photography, it is often desirable to define a boundary between a main object (i.e., foreground) and background, in order to segment out the main object. After the main object is segmented, the main object and background may be processed separately. Similarly, in the medical imaging field, it is often desirable to segment out a particular object, or portion of an object, from a CT scan image. For example, in the case of a CT scan of a human heart, it may be desirable to segment out a portion of the heart (e.g., left atrium) in order to allow a physician to more easily analyze the image. One example of segmentation is illustrated in FIG. 1 which shows an image 100. Assume that the object of interest is a human heart 102 with the remaining portion of the image being considered background. A desirable segmentation is one which provides a boundary between the object of interest, here the heart 102, and the remaining background portion. Such a boundary is shown in FIG. 1 as dotted line 104. Thus, an appropriate segmentation process would generate boundary 104 between the object of interest (i.e., foreground) 102 and the background.
One well known technique for image segmentation is the use of graph cuts, as described in Y. Boykov and M. Jolly, Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D Images, Proceedings of International Conference on Computer Vision, Vol. 1, July 2001, Vancouver, Canada, pp 105-112. As will be described in further detail below, the graph cuts technique is an interactive segmentation technique that divides an image into two segments, an object and background. A user imposes constraints for the segmentation by indicating by seeds certain pixels that are part of the object and certain pixels that are part of the background. The image is then automatically segmented using graph cuts to find the globally optimal segmentation of the image. It is to be noted that the generic term “image” used herein is intended as pertaining to data volumes as well.
More particularly, the Graph Cuts method performs at interactive speeds for smaller images/volumes, but an unacceptable amount of computation time is required for the large images/volumes common in medical applications. The Graph Cuts algorithm inputs two user-defined “seeds” (or seed groups) indicating samples of the foreground object and the background. The algorithm then proceeds to find the max-flow/min-cut between these seeds, viewing the image as a graph where the pixels are associated with nodes and the edges are weighted to reflect image gradients. Although standard max-flow/min-cut routines may be used for this computation, faster specialty algorithms for the domain of image segmentation have been developed, see a paper by Y. Boykov and V. Kolmogorov, “An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision” IEEE PAMI 26(9) 2004. 1124-1137.
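As a rough illustration of the max-flow/min-cut formulation just described, the following Python sketch segments a single row of pixel intensities. It is not the specialized Boykov-Kolmogorov algorithm; for brevity it uses a plain breadth-first augmenting-path search (Edmonds-Karp). The function name `min_cut_segment`, the Gaussian edge weighting, and the `sigma` parameter are assumptions introduced here for illustration only.

```python
import math
from collections import deque


def min_cut_segment(pixels, obj_seeds, bkg_seeds, sigma=10.0):
    """Label a 1-D row of intensities as object (1) or background (0).

    Hypothetical helper: pixels are graph nodes, neighboring pixels are
    joined by edges that are weak across strong intensity gradients, and
    seeds are tied to a source (object) or a sink (background).
    """
    n = len(pixels)
    src, sink = n, n + 1
    cap = [dict() for _ in range(n + 2)]  # residual capacities

    def add_edge(u, v, w):
        cap[u][v] = cap[u].get(v, 0.0) + w
        cap[v].setdefault(u, 0.0)  # reverse edge for the residual graph

    # Neighbor links: assumed Gaussian weighting of the intensity difference.
    for i in range(n - 1):
        w = math.exp(-((pixels[i] - pixels[i + 1]) ** 2) / (2.0 * sigma ** 2))
        add_edge(i, i + 1, w)
        add_edge(i + 1, i, w)

    # Seed links: capacity larger than the sum of all neighbor links
    # (each is at most 1), so a minimum cut never separates a seed.
    big = float(n)
    for i in obj_seeds:
        add_edge(src, i, big)
    for i in bkg_seeds:
        add_edge(i, sink, big)

    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {src: None}
        queue = deque([src])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break  # no path left: parent holds the source side of the cut
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= flow
            cap[v][u] += flow

    return [1 if i in parent else 0 for i in range(n)]


labels = min_cut_segment([10, 12, 11, 50, 52, 51], obj_seeds=[0], bkg_seeds=[5])
```

In this toy row the only weak edge lies between the third and fourth pixels, so the cut falls there. A practical implementation would operate on 2-D or 3-D neighborhoods and use a faster max-flow routine.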
Part of the popularity of the Graph Cuts algorithm also stems from the statistical interpretation of the approach as the minimization of the energy of a certain Markov Random Field, see papers by: D. Greig, B. Porteous and A. Seheult, “Exact maximum a posteriori estimation for binary images”, Journal of the Royal Statistical Society, Series B 51(2) 1989 271-279; and Boykov, Y., Veksler, O., Zabih, R., “A new algorithm for energy minimization with discontinuities”, in Pelillo, M., Hancock, E. R., eds., “Energy Minimization Methods in Computer Vision and Pattern Recognition”, Second International Workshop, EMMCVPR'99, York, UK, Jul. 25-29, 1999 (1999), 205-270. Additionally, the segmentation results are straightforward to predict and intuitive to use since the algorithm always finds the segmentation boundary at the location of the minimum cut (or surface).
The max-flow/min-cut algorithm described in the above referenced paper by Y. Boykov and V. Kolmogorov, “An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision” IEEE PAMI 26(9) 2004. 1124-1137 makes a great computational advance over the conventional methods for computing max-flow/min-cut. However, for high-resolution images or medical volumes, the computational burden is still too large for the algorithm to operate at an interactive speed. For example, it has been reported that over six minutes were required to segment a medical volume of size 256×256×185. This large computational burden for the Graph Cuts algorithm prompted the introduction of the Banded Graph Cuts (BGC) algorithm described below.
More particularly, a heuristic referred to as Banded Graph Cuts (BGC) has recently been introduced for producing fast, low-memory approximations to graph cuts, see a paper by H. Lombaert, Y. Sun, L. Grady and C. Xu, entitled “A multilevel banded graph cuts method for fast image segmentation”, published in Proceedings of ICCV 2005, volume I, pages 259-265, Beijing, China, 2005, IEEE 1.23 and published U.S. Patent Application Pub. No. 2006/015934 A1, published Jul. 20, 2006, application Ser. No. 11/313,102 filed Dec. 20, 2005, entitled “Multilevel Image Segmentation”, inventors Yiyong Sun, Herve Lombaert, Leo Grady and Chenyang Xu, assigned to the same assignee as the present patent application, the entire subject matter thereof being incorporated herein by reference. In the BGC method, a hierarchical image pyramid is constructed by repetitively downsampling (i.e., reducing the resolution of, or coarsening) the original, highest-resolution image a specified number of times. Foreground (i.e., object of interest) and background seeds on the original high-resolution image are also downsampled. In the smallest image (the one with the least resolution and hence fewest pixels), a full graph cut algorithm is run, and the resulting segmentation is iteratively propagated toward the fine-level resolution using any number of standard, well-known methods. In general, each pixel of the coarse-level image is identified with several pixels on the fine-level image, which are all given the same segmentation label as the associated coarse-level pixel. Since the full graph cut is performed on the image with the fewest pixels, the computation time is reduced compared to segmentation with the original highest-resolution image.
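The pyramid construction described above might be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the cited paper: the factor-of-two block averaging, the rule that a coarse pixel inherits a seed only when its four children do not disagree, and all names (`downsample_image`, `downsample_seeds`, `build_pyramid`, and the label constants) are hypothetical. Image dimensions are assumed to be divisible by two at every level.

```python
OBJ, BKG, UNLABELED = 1, 0, -1  # hypothetical label encoding


def downsample_image(img):
    """Halve the resolution by averaging each 2x2 block of intensities."""
    h, w = len(img), len(img[0])
    return [[(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]


def downsample_seeds(seeds):
    """Coarsen the partial labeling along with the image.

    Assumed rule: a coarse pixel takes a seed label if exactly one kind of
    seed occurs among its four children; otherwise it stays unlabeled.
    """
    h, w = len(seeds), len(seeds[0])
    coarse = []
    for r in range(0, h, 2):
        row = []
        for c in range(0, w, 2):
            found = {seeds[r][c], seeds[r][c + 1],
                     seeds[r + 1][c], seeds[r + 1][c + 1]}
            found.discard(UNLABELED)
            row.append(found.pop() if len(found) == 1 else UNLABELED)
        coarse.append(row)
    return coarse


def build_pyramid(img, seeds, levels):
    """Return [(image, seeds)] from finest (index 0) to coarsest."""
    pyramid = [(img, seeds)]
    for _ in range(levels):
        img, seeds = downsample_image(img), downsample_seeds(seeds)
        pyramid.append((img, seeds))
    return pyramid


image = [[0, 0, 8, 8],
         [0, 0, 8, 8],
         [4, 4, 4, 4],
         [4, 4, 4, 4]]
seeds = [[OBJ, UNLABELED, UNLABELED, UNLABELED],
         [UNLABELED, UNLABELED, UNLABELED, UNLABELED],
         [UNLABELED, UNLABELED, UNLABELED, UNLABELED],
         [UNLABELED, UNLABELED, UNLABELED, BKG]]
pyramid = build_pyramid(image, seeds, levels=2)
```

With two levels the coarsest image here is a single pixel in which object and background seeds collide, which illustrates why the maximum pyramid level has to be chosen with the seed placement in mind.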
For each level of downsampling, the boundary of the segmentation is projected onto the next, fine-resolution, level while being dilated (i.e., the original boundary is converted to a “band”) by a specified amount d to create a band around the object. The graph cut algorithm is then run for this sparse (i.e., downsampled) graph. Hence, a tradeoff between the memory/time requirements and segmentation accuracy (i.e., inaccuracies resulting from the processing of downscaled data) is introduced through specification of the maximum pyramid level (i.e., the number of downscalings) and dilation size. Although this heuristic greatly reduces the memory and time consumption, it can miss thin structures (i.e., objects) such as vessels in medical imaging data, which it is desirable to detect in some applications.
Thus, the goal of the BGC algorithm was to provide the same cut (segmentation) as the native Graph Cuts algorithm by introducing a multilevel scheme for the computation. Although the BGC algorithm is not guaranteed to provide the minimum graph cut, it was convincingly shown that the (near) minimal graph cut was returned for practical problems in which the segmentation target was “blob-like”. Despite this limitation to “blob-like” segmentation targets, the BGC algorithm produces computational speeds over an order of magnitude faster than those obtained when employing conventional Graph Cuts (i.e., using the algorithm described in the above referenced paper by Y. Boykov and V. Kolmogorov, “An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision” IEEE PAMI 26(9) 2004. 1124-1137) on the full graph.
The BGC algorithm is not the only approach to increasing the computational efficiency of Graph Cuts. The Lazy Snapping algorithm described in a paper by Li, Y. et al., Proceedings of ACM SIGGRAPH 2004 (2004) 303-308, performs a pre-segmentation grouping of the pixels using a watershed transform described by Roerdink et al., “The watershed transformation: Definitions, algorithms, and parallelization strategies”, Fund. Informaticae 41 (2002) 187-228, and treats each watershed basin as a supernode. Graph Cuts is then run on the graph of supernodes. By drastically reducing the number of nodes in the graph, this approach greatly increases the speed of Graph Cuts in practice. However, the computational savings are more difficult to predict with this approach since the number of supernodes is highly image dependent. Additionally, the authors of this approach offer no characterization of when to expect this coarsening approach to succeed or fail to produce the same cut given by full-resolution Graph Cuts.
The inventors have recognized that a modification of the BGC algorithm will allow for the accurate segmentation (relative to full-resolution Graph Cuts) of thin objects while preserving the computational advantages of BGC. The modification is born of the inventors' observation that thin structures are lost as a result of the coarsening operation involved in BGC. This modification is based upon the use of a difference image when coarsening to identify thin structures that may be included into the band during the segmentation. More particularly, the inventors incorporate a Laplacian pyramid, described in a paper by Burt, P. J., Adelson, E. H., entitled “The Laplacian pyramid as a compact image code”, published in IEEE Transactions on Communications COM-31(4) (1983) 532-540, to recover this lost information and extend the band such that high-threshold pixels are included. In this manner, the additional computational burden is slight but thin structures are recovered. The extra computational burden comes only from considering a small number of additional pixels in the band and from computation of the Laplacian pyramid, which need not be computed explicitly and therefore requires simply an additional linear-time pass over the image.
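The role of the difference image can be illustrated with a short sketch. It is a simplification made for illustration: the Burt and Adelson pyramid uses Gaussian-like filtering, whereas this example assumes 2x2 block averaging for coarsening and pixel replication for upsampling. The function names and the threshold value are hypothetical.

```python
def downsample(img):
    """Coarsen by averaging each 2x2 block (even dimensions assumed)."""
    h, w = len(img), len(img[0])
    return [[(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]


def upsample(img):
    """Return to the finer grid by replicating every pixel into a 2x2 block."""
    return [[v for v in row for _ in (0, 1)] for row in img for _ in (0, 1)]


def difference_image(img):
    """One Laplacian-pyramid-style level: original minus coarsened-then-upsampled."""
    approx = upsample(downsample(img))
    return [[img[r][c] - approx[r][c] for c in range(len(img[0]))]
            for r in range(len(img))]


def thin_structure_mask(img, threshold):
    """Flag pixels whose difference magnitude exceeds the threshold."""
    return [[abs(v) > threshold for v in row] for row in difference_image(img)]


# A one-pixel-wide bright vertical line on a dark background.
vessel = [[0, 100, 0, 0] for _ in range(4)]
mask = thin_structure_mask(vessel, threshold=25)
```

Coarsening smears the line across its 2x2 block, so both the line pixels and their darkened neighbors differ strongly from the upsampled approximation and are flagged, while the uniform right half of the image is not. The whole computation is a single pass over the image, in keeping with the linear-time cost noted above.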
The invention is an improvement in the BGC algorithm that preserves its computational advantages, but produces segmentations much closer to Graph Cuts when the segmentation target contains thin structures. This modification is based upon the use of a difference image when coarsening to identify thin structures that may be included into the band during the segmentation. The modification of the BGC algorithm allows for the accurate segmentation (relative to full-resolution Graph Cuts) of thin objects while preserving the computational advantages of BGC. The modification is born of the observation that thin structures are lost as a result of the coarsening operation involved in BGC. The invention uses a Laplacian pyramid to recover this lost information and extend the band such that high-threshold pixels are included. In this manner, the additional computational burden is slight but thin structures are recovered. The extra computational burden comes only from considering a small number of additional pixels in the band and by computation of the Laplacian pyramid, which need not be computed explicitly and therefore requires simply an additional linear-time pass over the image.
In accordance with the present invention, a process is provided for segmenting an object of interest from background, the process including:
1) Obtaining a high-resolution (i.e., original or master) image of an object of interest disposed in background.
2) Using an outside process (e.g., a user) to partially label the image with some pixels being labeled “object”, other pixels being labeled “background” and still others (most) being unlabeled. This initial partial labeling is referred to as “seeds”.
3) Downsampling (i.e., coarsening or reducing in resolution) the partially labeled high-resolution image, both the image intensities and the partial labeling, to a coarse version of both.
4) Performing a “Graph Cuts” algorithm on the coarsened image to convert the partial labeling of the coarse image into a full labeling of the coarse image. That is, the “Graph Cuts” algorithm inputs the partial labeling and produces a full labeling in which previously unlabeled pixels are labeled as either background or object.
5) Generating, from the coarse labeling and coarse image, a new, fine-level (high-resolution) image with all pixels being labeled. However, due to the information lost in the coarsening process, there is uncertainty about this labeling. So, the process converts this full labeling to a partial labeling in order to reflect this uncertainty by the following process:
    A) Any pixels near the point where the labeling switches from object to background (i.e., the boundary) become unlabeled (this is referred to as the “banded” part).
    B) A comparison is made between the pixels in the new fine-level image and the corresponding pixels in the original (master) image, and anywhere there is a significant discrepancy in the intensity between two corresponding pixels, the labeling associated with the pixel is changed to “unlabeled”. The result is a partially labeled image of high resolution.
6) Running the Graph Cuts algorithm on this partially labeled, high-resolution image to produce a fully labeled, high-resolution image.
It should be understood that the initial partial labeling (the “seeds”) is usually extremely small (i.e., almost all pixels are unlabeled), but the partial labeling produced by step 5 is very large (i.e., almost all pixels are labeled), so “Graph Cuts” can fill in the missing labels in step 6 much faster. If there is more than one level (because the first coarsened image is not low-resolution enough), then this process is repeated for each level, i.e., a full labeling is inherited from the coarse level, converted into a partial labeling by removing the labels of pixels whose labels are uncertain, and “Graph Cuts” is run again to produce a full labeling at this level.
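The conversion of a full labeling into a partial labeling (step 5, parts A and B) can be sketched as follows. This is an illustrative sketch under stated assumptions rather than a definitive implementation: labels are upsampled by a factor of two with pixel replication, the band is taken to be every pixel within a square window of half-width `d` of a differently labeled pixel, and the difference image `diff` is supplied by the caller. The names `to_partial_labeling`, `upsample_labels`, `d`, and `threshold` are hypothetical.

```python
OBJ, BKG, UNLABELED = 1, 0, -1  # hypothetical label encoding


def upsample_labels(coarse):
    """Give each fine-level pixel the label of its coarse-level parent."""
    return [[v for v in row for _ in (0, 1)] for row in coarse for _ in (0, 1)]


def to_partial_labeling(coarse_labels, diff, d, threshold):
    """Turn an inherited full labeling into a partial one.

    A) Pixels within distance d of a label change are unlabeled (the band).
    B) Pixels whose difference magnitude exceeds the threshold are unlabeled.
    """
    labels = upsample_labels(coarse_labels)
    h, w = len(labels), len(labels[0])
    partial = [row[:] for row in labels]
    for r in range(h):
        for c in range(w):
            near_boundary = any(
                labels[rr][cc] != labels[r][c]
                for rr in range(max(0, r - d), min(h, r + d + 1))
                for cc in range(max(0, c - d), min(w, c + d + 1)))
            if near_boundary or abs(diff[r][c]) > threshold:
                partial[r][c] = UNLABELED
    return partial


coarse_labels = [[OBJ, OBJ, BKG, BKG]]
diff = [[0] * 8 for _ in range(2)]
diff[0][7] = 40  # a large discrepancy far from the boundary
partial = to_partial_labeling(coarse_labels, diff, d=1, threshold=25)
```

In this example only five of the sixteen fine-level pixels end up unlabeled, so the subsequent graph cut has far fewer unknowns than one started from the seeds alone.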
In one embodiment, a process for segmenting an object of interest from background comprises: obtaining an original, high-resolution image of an object of interest disposed in background with seeds on the object being labeled as object pixels and seeds on the background being labeled background pixels and with uncertain pixels remaining unlabeled; reducing the resolution of the labeled image; performing graph cuts on the reduced resolution, labeled image; performing segmentation on the reduced resolution, labeled image; producing a band around a fine boundary of the unlabeled pixels; labeling all pixels interior to the band as foreground pixels and pixels exterior to the band as background pixels; subtracting the labeled image from the original image; if the magnitude of any pixel in the subtracted image exceeds a predetermined threshold, removing the label, if any, of that pixel; performing a graph cut again to label all unlabeled pixels; and repeating the above until a predetermined degree of resolution is achieved in an image segmenting the object from the background.
In accordance with the present invention, a master high-resolution image with an object of interest disposed in a background is identified by placing foreground seeds on the object and background seeds on the background. The master high-resolution image is coarsened (i.e., the resolution thereof is reduced). The object in the coarsened image is segmented in a low-resolution segmentation image to a first approximation due to the coarsening process and with relatively thin features in the object being reduced by the coarsening. The low-resolution segmentation image is increased to a high-resolution image (i.e., upsampled) with a band of unmarked pixels around the object. The resulting image with the object segmented is then upsampled to the resolution level of the master image. The upsampled image is compared with the master high-resolution image, with the relatively thin features in the object in the master high-resolution image being identified as having pixels with a large magnitude difference from the corresponding pixels in the upsampled, augmented image, such larger magnitude pixels being added to the upsampled, augmented image to an increased approximation. The process is repeated until the desired segmentation approximation is achieved.
The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.
Like reference symbols in the various drawings indicate like elements.