In image processing, segmentation is the partitioning of a digital image into multiple regions (sets of pixels) according to a given criterion. It is used in automatic image recognition (e.g., the recognition of buildings or roads from satellite imagery), computer-guided diagnosis and surgery for medical imaging, general bottom-up image analysis for industrial applications, etc. After segmentation, each region is assigned a unique label. Each region consists of a group of connected pixels that have similar data values.
Known segmentation techniques are described in: (1) K. S. Fu and J. K. Mui, “A survey on image segmentation,” Pattern Recognition, Vol. 13, No. 1, pp. 3-16, 1981; (2) R. M. Haralick and L. G. Shapiro, “Survey: image segmentation techniques,” Computer Vision, Graphics, and Image Processing, Vol. 29, No. 1, pp. 100-132, 1985; (3) N. R. Pal and S. K. Pal, “A review on image segmentation techniques,” Pattern Recognition, Vol. 26, No. 9, pp. 1277-1294, 1993; and (4) X. Jin and C. H. Davis, “A genetic image segmentation algorithm with a fuzzy-based evaluation function,” in Proc. IEEE International Conference on Fuzzy Systems, pp. 938-943, St. Louis, Mo., May 25-28, 2003. The contents of those publications are incorporated herein by reference.
Most existing image segmentation algorithms can be roughly divided into the following three categories or their hybrids: (1) feature-space thresholding or clustering, (2) region growing or extraction, and (3) edge or gradient-based approaches.
P. K. Sahoo, S. Soltani and A. K. C. Wong, “A survey of thresholding techniques,” Computer Vision, Graphics, and Image Processing, Vol. 41, pp. 233-260, 1988, incorporated herein by reference, presents a survey of feature-space thresholding techniques. If there are clear separating modes in the histogram of the feature values, thresholding can effectively segment the image. U.S. Pat. No. 5,903,664 (incorporated herein by reference) describes using a thresholding technique to segment cardiac images. However, for images acquired in uncontrolled environments, such as remote sensing images, simple gray-level thresholding alone may give poor results.
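By way of illustration only, feature-space thresholding can be sketched as follows. The sketch assumes 8-bit integer gray levels and picks the threshold with Otsu's between-class variance criterion, which is one common choice and is not taken from the references cited above. The function names are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that best splits pixels into <= t and > t.

    Assumption: pixels are integers in [0, levels). The criterion is
    Otsu's between-class variance, used here only as an example.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(g * c for g, c in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_lo = 0      # number of pixels with gray level <= t
    sum_lo = 0    # sum of gray levels of those pixels
    for t in range(levels):
        w_lo += hist[t]
        sum_lo += t * hist[t]
        w_hi = total - w_lo
        if w_lo == 0 or w_hi == 0:
            continue  # one class is empty, so no split at this level
        mean_lo = sum_lo / w_lo
        mean_hi = (total_sum - sum_lo) / w_hi
        var = w_lo * w_hi * (mean_lo - mean_hi) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def threshold_segment(image):
    """Label each pixel 0 (at or below threshold) or 1 (above it)."""
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    return [[1 if p > t else 0 for p in row] for row in image]
```

When the histogram has two well-separated modes, the selected threshold falls between them. When the modes overlap, as is common in remote sensing imagery, no single gray level separates the classes cleanly.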
A. K. Jain and P. J. Flynn, “Image segmentation using clustering,” in Advances in Image Understanding, pp. 65-83, IEEE Computer Society Press, 1996 (incorporated herein by reference) presents a survey of the application of clustering methodology to the image segmentation problem. The modes in the histogram or the clusters in a high-dimensional feature space are found by either supervised or unsupervised classification methods. However, many clustering algorithms have high computational complexity, and segmentation based on clustering may rely on strict assumptions (often multivariate Gaussian) about the multidimensional shape of the clusters that do not hold in practice.
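As a hedged illustration of unsupervised clustering in feature space, the following sketch applies k-means to a single scalar feature per pixel. K-means is chosen here only as a familiar example and is not specifically described in the cited survey. The initial centers and the function name are assumptions.

```python
def kmeans_1d(values, centers, iterations=20):
    """Cluster scalar feature values around the given initial centers.

    Assumption: plain k-means with caller-supplied initial centers.
    Returns the final centers and one cluster label per value.
    """
    centers = [float(c) for c in centers]

    def nearest(v):
        # Index of the center closest to v (first one wins on ties).
        return min(range(len(centers)), key=lambda c: abs(v - centers[c]))

    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            groups[nearest(v)].append(v)
        # An empty cluster keeps its previous center.
        updated = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centers)]
        if updated == centers:
            break  # assignments can no longer change
        centers = updated

    return centers, [nearest(v) for v in values]
```

Each iteration visits every pixel once per cluster, and the result depends on the initial centers, which reflects the cost and the sensitivity to assumptions noted above.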
Region merging algorithms iteratively merge adjacent regions based on a merging cost criterion. Several region merging techniques are discussed in D. J. Robinson, N. J. Redding and D. J. Crisp, “Implementation of a fast algorithm for segmenting SAR imagery,” Scientific and Technical Report, Defense Science and Technology Organization, Australia, January 2002 (hereinafter Robinson), as well as in U.S. Pat. Nos. 5,787,194, 6,832,002 and 6,895,115 (all of which are incorporated herein by reference). The algorithms differ in their merging criteria and in the schemes used to control the merging sequence. In Robinson, the implementation of a region growing algorithm (the Full λ-schedule algorithm) was described as “the fastest possible implementation,” with computational complexity of order O(n log₂ n), where n is the number of image pixels. Some known issues of region growing include: (i) segmentation results are sensitive to the merging sequence; (ii) the termination criterion is usually a similarity-measure threshold, a number of iterations, or a number of output regions, and it is very difficult to find a value that gives a satisfactory result; and (iii) computational complexity is high if segmentation starts from individual pixels.
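The general idea of iterative region merging, and its dependence on a termination threshold, can be sketched on a one-dimensional signal as follows. This is a naive greedy variant written for illustration, not the Full λ-schedule algorithm of Robinson. The merging cost (difference of region means) and the parameter name max_cost are assumptions.

```python
def merge_regions_1d(values, max_cost):
    """Greedily merge adjacent regions of a 1-D signal.

    Assumption: the merging cost of two adjacent regions is the absolute
    difference of their means, and merging stops once the cheapest
    available merge costs more than max_cost.
    """
    # Each region is [sum, count]; start from single-sample regions.
    regions = [[float(v), 1] for v in values]
    while len(regions) > 1:
        costs = [abs(a[0] / a[1] - b[0] / b[1])
                 for a, b in zip(regions, regions[1:])]
        i = min(range(len(costs)), key=costs.__getitem__)
        if costs[i] > max_cost:
            break  # termination criterion reached
        a, b = regions[i], regions[i + 1]
        regions[i:i + 2] = [[a[0] + b[0], a[1] + b[1]]]

    # Expand the region list back into one label per sample.
    labels = []
    for label, (_, count) in enumerate(regions):
        labels.extend([label] * count)
    return labels
```

For the signal [1, 2, 1, 9, 10, 9], a max_cost of 3 leaves two regions while a max_cost of 10 merges everything into one, which shows how strongly the result depends on the termination value. This naive version rescans all adjacent pairs after every merge; faster implementations typically keep the candidate merges in a priority queue.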
Another category of segmentation is based on edge detection and linking, as described in R. Nevatia and K. R. Babu, “Linear feature extraction and description,” Computer Graphics and Image Processing, Vol. 13, pp. 257-269, 1980. The technique first detects edges and then links broken edge segments. However, the edge linking process can have serious difficulty producing connected, one-pixel-wide contours.
Another paradigm for gradient-based segmentation is based on the morphological watershed transform. Watershed segmentation detects catchment basins as regions and crest lines as the boundaries of these regions. One such technique is described in L. Vincent and P. Soille, “Watersheds in digital spaces: an efficient algorithm based on immersion simulations,” IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 13, No. 6, pp. 583-598, 1991, which is incorporated herein by reference. One advantage of this algorithm over edge detection and linking solutions is that object edges obtained by calculating gradient watershed boundaries are guaranteed to be connected and closed.
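The notion of catchment basins can be sketched in one dimension as follows. This is a simplified steepest-descent labeling written for illustration, not the immersion algorithm of Vincent and Soille. It assumes that no two neighboring gradient values are equal (no plateaus), and the function name is hypothetical.

```python
def basins_1d(gradient):
    """Label each sample with the catchment basin it drains into.

    Assumption: neighbouring values are never equal, so every sample has
    a unique downhill path that ends at a local minimum.
    """
    n = len(gradient)

    def downhill(i):
        # Index of the lowest strictly lower neighbour, or i at a minimum.
        best = i
        for j in (i - 1, i + 1):
            if 0 <= j < n and gradient[j] < gradient[best]:
                best = j
        return best

    labels = [0] * n
    basin_of_minimum = {}
    for i in range(n):
        j = i
        while downhill(j) != j:
            j = downhill(j)
        # Each distinct local minimum defines one basin.
        labels[i] = basin_of_minimum.setdefault(j, len(basin_of_minimum))
    return labels
```

Because every local minimum of the gradient produces its own basin, even a small noise-induced dip adds a region. In this sketch a crest sample is simply assigned to the basin on its lower side rather than marked as a boundary.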
Over-segmentation is a well-known problem of watershed segmentation. One way to solve the problem is to merge adjacent similar regions iteratively, as described in K. Haris, et al., “Hybrid image segmentation using watershed and fast region merging,” IEEE Trans. Image Processing, Vol. 7, No. 12, pp. 1684-1699, 1998; and L. Shafarenko, M. Petrou and M. Kittler, “Automatic watershed segmentation of randomly textured color images,” IEEE Trans. Image Processing, Vol. 6, No. 11, pp. 1530-1544, 1997, both of which are incorporated herein by reference. As with segmentation based on region merging and growing, it is very difficult to choose the termination criterion for the region merging step.
Another way to deal with over-segmentation is to build a watershed hierarchy using different scale spaces as described in P. T. Jackway, “Gradient watersheds in morphological scale-space,” IEEE Trans. Image Processing, Vol. 5, No. 6, pp. 913-921, 1996 (hereinafter Jackway); and J. M. Gauch, “Image segmentation and analysis via multiscale gradient watershed hierarchies,” IEEE Trans. Image Processing, Vol. 8, No. 1, pp. 69-79, 1999 (hereinafter Gauch), both of which are incorporated herein by reference.
Jackway used a morphological scale space, and Gauch used a Gaussian linear scale space. In both, Gaussian filtering or morphological operations with different scale parameters were applied to the original image. Because the image was filtered to different degrees, the segment boundaries found at coarser scales do not coincide with the edges in the original image, so the paths of intensity extremes in the scale space must be followed as the level of filtering changes. These approaches involve two computationally expensive steps: (i) building a scale space by applying Gaussian filtering or morphological operations with different scale parameters and (ii) linking intensity extremes from one scale level to the next (since the watershed lines move spatially with varying scale). Furthermore, the cost worsens at larger scales because the kernel size increases quadratically with the scale parameter. Due to the intensive computational overhead of these approaches, the number of selected scale levels is usually limited to a small number.
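The quadratic growth of the kernel can be made concrete with a short sketch. It assumes a Gaussian kernel truncated at three standard deviations and applied as a full (non-separable) two-dimensional window. Both are illustrative assumptions and are not taken from Jackway or Gauch.

```python
import math


def gaussian_kernel_1d(sigma, truncate=3.0):
    """Normalized 1-D Gaussian weights, cut off at truncate * sigma.

    Assumption: the 3-sigma cutoff is a common convention, used here
    only to give the kernel a finite width.
    """
    radius = int(math.ceil(truncate * sigma))
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def taps_per_pixel_2d(sigma, truncate=3.0):
    """Multiplications per output pixel for a full 2-D Gaussian window."""
    width = len(gaussian_kernel_1d(sigma, truncate))
    return width * width


# Doubling the scale roughly quadruples the work per pixel.
work_by_scale = {sigma: taps_per_pixel_2d(sigma) for sigma in (1, 2, 4)}
# work_by_scale == {1: 49, 2: 169, 4: 625}
```

Under these assumptions the per-pixel work grows from 49 to 625 multiplications between scale 1 and scale 4, before the cost of linking extremes across levels is counted.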
The prior literature seldom addresses the segmentation of large images, which is usually a problem when computer memory is limited, especially for large remote sensing images. An additional problem with previous segmentation algorithms is that it is often very difficult to pick the right segmentation parameters for a specific application. It may take hours or days to run a segmentation with a particular set of parameters, only to find that the result is unsatisfactory. Thus, a user was often left with a tedious and time-consuming trial-and-error process to determine the right parameters for the segmentation.
Previous work also often described segmentation algorithms only for single-band or color images. There is no standardized method for segmenting multispectral or hyperspectral images, which are widely used in the remote sensing community and industry.