The interest in digital media has grown to new heights with rapid technological advancements in the capture and sharing of images, consequently necessitating the exploration of methods to enhance, classify, and/or extract information from them. Image segmentation is one approach that provides the foundation to make these functionalities ever more effective and expeditious. Segmentation is defined as the meaningful partitioning of an image into distinct clusters that exhibit homogeneous characteristics. In doing so, it generates a reduced and relevant dataset for high-level semantic operations such as rendering, indexing, classification, compression, content-based retrieval, and multimedia applications, to name a few. Though segmentation comes naturally to human observers, the development of a simulated environment to perform this imaging task has proven to be extremely challenging.
Many grayscale/color domain methodologies have been adopted in the past to tackle this ill-defined problem (see Lucchese et al., “Color Image Segmentation: A State of the Art Survey,” Proc. Indian National Science Acad. 67(2):207-221 (2001); Cheng et al., “Color Image Segmentation: Advances & Prospects,” Pat. Rec. 34(12):2259-2281 (2001), which are hereby incorporated by reference in their entirety, for comprehensive surveys). Initial multiscale research aimed to overcome the drawbacks faced by Bayesian approaches to segmentation/classification that used Markov Random Field (MRF) and Gibbs Random Field (GRF) estimation techniques. Derin et al., “Modeling and Segmentation of Noisy and Textured Images Using Gibbs Random Fields,” IEEE Trans. on Pat. Anal. and Mach. Int. 9(1):39-55 (1987), which is hereby incorporated by reference in its entirety, proposed a method of segmenting images by comparing the Gibbs distribution results to a predefined set of textures using a maximum a posteriori (MAP) criterion. Pappas et al., “An Adaptive Clustering Method For Image Segmentation,” IEEE Trans. on Sig. Process. 40(4):901-914 (1992), which is hereby incorporated by reference in its entirety, generalized the k-means clustering method using adaptive and spatial constraints, together with a GRF model, to achieve segmentation in the grayscale domain. Chang et al., “Adaptive Bayesian Segmentation of Color Images,” Jour. of Elec. Imag. 3(4):404-414 (1994), which is hereby incorporated by reference in its entirety, extended this to color images by assuming conditional independence of each color channel. Improved segmentation and edge linking was achieved by Saber et al., “Fusion of Color and Edge Information For Improved Segmentation and Edge Linking,” Imag. and Vision Comp. 15:769-780 (1997), which is hereby incorporated by reference in its entirety, who combined spatial edge information with the regions resulting from a GRF model of the segmentation field.
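Several of the clustering-based approaches above build on plain k-means clustering of pixel colors. As a point of reference only, a minimal k-means baseline can be sketched as follows; this is not the adaptive, spatially constrained variant of the cited methods, and all function names here are illustrative:

```python
import random

def kmeans(pixels, k, iters=10, seed=0):
    """Plain k-means on a list of color tuples; returns (centers, labels).
    A baseline sketch only -- the cited methods add spatial/adaptive constraints."""
    rng = random.Random(seed)
    centers = rng.sample(pixels, k)          # initialize centers from the data
    labels = [0] * len(pixels)
    for _ in range(iters):
        # assignment step: each pixel goes to its nearest center
        # (squared Euclidean distance in color space)
        for i, p in enumerate(pixels):
            labels[i] = min(range(k),
                            key=lambda j: sum((a - b) ** 2
                                              for a, b in zip(p, centers[j])))
        # update step: each center becomes the mean of its assigned pixels
        for j in range(k):
            members = [pixels[i] for i in range(len(pixels)) if labels[i] == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels

pixels = [(10, 10, 10), (12, 9, 11), (200, 205, 199), (198, 202, 201)]
centers, labels = kmeans(pixels, k=2)
print(labels)  # the two dark and the two bright pixels fall into separate clusters
```

The adaptive variants cited above replace the global centers with spatially varying estimates and add an MRF/GRF smoothness prior on the label field.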
Bouman et al., “Multiple Resolution Segmentation of Textured Images,” IEEE Trans. on Pat. Anal. and Mach. Int. 7(1):39-55 (1991), which is hereby incorporated by reference in its entirety, proposed a method for segmenting textured images comprising regions with varied statistical profiles, using a causal Gaussian autoregressive model and an MRF representing the classification of each pixel at various scales. However, most of the aforementioned methods suffered from the fact that the desired estimates could not be calculated exactly and were computationally prohibitive. To overcome these problems, Bouman et al., “A Multiscale Random Field Model For Bayesian Image Segmentation,” IEEE Transactions on Image Processing 3(2):1454-1466 (1994), which is hereby incorporated by reference in its entirety, extended this work by incorporating a multiscale random field (MSRF) model and a sequential MAP (SMAP) estimator. The MSRF model was used to capture the characteristics of image behavior at various scales. However, the work in Bouman et al., “Multiple Resolution Segmentation of Textured Images,” IEEE Trans. on Pat. Anal. and Mach. Int. 7(1):39-55 (1991); and Bouman et al., “A Multiscale Random Field Model For Bayesian Image Segmentation,” IEEE Transactions on Image Processing 3(2):1454-1466 (1994), which are hereby incorporated by reference in their entirety, had either used single-scale versions of the input image, or multiscale versions of the image with the underlying hypothesis that the random variables at a given level of the image data pyramid were independent from those at other levels.
Comer et al., “Multiresolution Image Segmentation,” IEEE International Conference on Acoustics Speech and Signal Processing (1995), which is hereby incorporated by reference in its entirety, used a multiresolution Gaussian autoregressive (MGAR) model for a pyramid representation of the input image and “maximization of posterior marginals” (MPM) for pixel label estimates. They established correlations for these estimates at different levels using the interim segmentations corresponding to each level. They extended this work in Comer et al., “Segmentation of Textured Images Using a Multiresolution Gaussian Autoregressive Model,” IEEE Transactions on Image Processing 8(3):1454-1466 (1999), which is hereby incorporated by reference in its entirety, by using a multiresolution MPM model for class estimates and a multiscale MRF to establish interlevel correlations in the class pyramid model. Liu et al., “Multiresolution Color Image Segmentation,” IEEE Transactions on Image Processing 16(7):1454-1466 (1994), which is hereby incorporated by reference in its entirety, proposed a relaxation process that converged to a MAP estimate of the eventual segmentation of the input image using MRF's in a quadtree structure. An MRF model in combination with the discrete wavelet transform was proposed by Tab et al., “Scalable Multiresolution Color Image Segmentation,” Signal Processing 86:1670-1687 (2006), which is hereby incorporated by reference in its entirety, for effective segmentations with spatial scalability, producing similar patterns at different resolutions. Cheng et al., in International Conference on Image Processing (1998), which is hereby incorporated by reference in its entirety, incorporated Hidden Markov Models (HMM's) for developing complex contextual structure, capturing textural information, and correlating among image features at different scales, unlike the previously mentioned MRF models.
The method's usefulness was illustrated on the problem of document segmentation, where intra-scale contextual dependencies can be imperative. A similar principle was applied by Won et al. (Won et al., “Hidden Markov Multiresolution Texture Segmentation Using Complex Wavelets,” in International Conference on Telecommunications, which is hereby incorporated by reference in its entirety), who combined an HMM and a Hidden Markov Tree (HMT) into a hybrid HMM-HMT model to establish local and global correlations for efficient block-based segmentations.
Watershed and wavelet-driven segmentation methods have been of interest to many researchers. Vanhamel et al., “Multiscale Gradient Watersheds of Color Images,” IEEE Transactions on Image Processing 12(6):1454-1466 (2003), which is hereby incorporated by reference in its entirety, proposed a scheme constituting a non-linear anisotropic scale space and vector-valued gradient watersheds in a hierarchical framework for multiresolution analysis. In a similar framework, Makrogiannis et al., “Watershed-Based Multiscale Segmentation Method For Color Images Using Automated Scale Selection,” J. Electronic Imaging 14(3) (2005), which is hereby incorporated by reference in its entirety, proposed watershed-based segmentations utilizing a fuzzy dissimilarity measure and connectivity graphs for region merging. Jung et al., “Combining Wavelets and Watersheds For Robust Multiscale Image Segmentation,” Image And Vision Computing 25:24-33 (2007), which is hereby incorporated by reference in its entirety, combined orthogonal wavelet decomposition with the watershed transform for multiscale image segmentation.
Edge, contour and region structure are other features that have been adopted in various approaches for effective segmentations. Tabb et al., “Multiscale Image Segmentation by Integrated Edge and Region Detection,” IEEE Transactions on Image Processing 6(5) (1997), which is hereby incorporated by reference in its entirety, instituted a multiscale approach where the concept of scale represented image structures at different resolutions rather than the image itself. The work involved performing a Gestalt analysis facilitating detection of edges and regions without any smoothing required at lower scales. On the other hand, Gui et al., “Multiscale Image Segmentation Using Active Contours,” which is hereby incorporated by reference in its entirety, obtained multiscale representations of the image using weighted TV flow and used active contours for segmentation. The contours at one level were given as input to the next higher level to refine the segmentation outcome at that level. Munoz et al., “Unsupervised active Regions For Multiresolution Image Segmentation,” IEEE International Conference On Pattern Recognition (2002), which is hereby incorporated by reference in its entirety, applied a fusion of region and boundary information, where the latter was used for initializing a set of active regions which in turn would compete for pixels in the image in a manner that would eventually minimize a region-boundary based energy function. Sumengen et al., “Multi-Scale Edge Detection and Image Segmentation,” (2005), which is hereby incorporated by reference in its entirety, showed that multiscale approaches are very effective for edge detection and segmentation of natural images. Mean shift clustering followed by a minimum description length (MDL) criterion was used by Luo et al. 
“Unsupervised Multiscale Color Image Segmentation Based on MDL Principle,” IEEE Transactions on Image Processing 15(9):1454-1466 (2006), which is hereby incorporated by reference in its entirety, for the same purpose.
Fusion of color and texture information is an eminent methodology in multiresolution image understanding/analysis research. Deng et al., “Unsupervised Segmentation of Color-Texture Regions in Images and Video,” IEEE Transactions on Pattern Analysis and Machine Intelligence 23(8):800-810 (2001), which is hereby incorporated by reference in its entirety, proposed a method prominently known as JSEG that performed color quantization and spatial segmentation in combination with a multiscale growth procedure for segmenting color-texture regions in images and video. Pappas et al. (Chen and Pappas, “Perceptually Tuned Multi-Scale Color Texture Segmentation,” in IEEE International Conference on Image Processing (2004), which is hereby incorporated by reference in its entirety) utilized spatially adaptive features pertaining to color and texture in a multiresolution structure to develop perceptually tuned segmentations, validated using photographic targets. Dominant color and homogeneous texture features (HTF) integrated with an adaptive region merging technique were employed by Wan et al., “Multi-Scale Color Texture Image Segmentation With Adaptive Region Merging,” IEEE International Conference on Acoustics Speech and Signal Processing (2007), which is hereby incorporated by reference in its entirety, to achieve multiscale color-texture segmentations.
The task of segmenting images in perceptually uniform color spaces is an ongoing area of research in image processing. Paschos et al., “Perceptually Uniform Color Spaces For Color Texture Analysis: An Empirical Evaluation,” IEEE Transactions on Image Processing 10(6):932-937 (2001), which is hereby incorporated by reference in its entirety, proposed an evaluation methodology for analyzing the performance of various color spaces for color texture analysis methods such as segmentation and classification. The work showed that uniform/approximately uniform color spaces such as L*a*b*, L*u*v* and HSV possess a performance advantage over RGB, a non-uniform color space traditionally used for color representation. These color spaces were also found to be well suited to the calculation of color difference using the Euclidean distance, as employed in many segmentation methods. Yoon et al., “Color Image Segmentation Considering the Human Sensitivity For Color Pattern Variations,” SPIE Proceedings 4572:269-278 (2001), which is hereby incorporated by reference in its entirety, utilized this principle to propose a Color Complexity Measure (CCM) for generalizing the k-means clustering method in the CIE L*a*b* space. Chen et al., “Contrast-Based Color Image Segmentation,” IEEE Signal Processing Letters 11(7):641-644 (2004), which is hereby incorporated by reference in its entirety, employed color difference in the CIE L*a*b* space to propose directional color contrast segmentations. Contrast generation as a function of the minimum and maximum values of the Euclidean distance in the CIE L*a*b* space was seen in the work of Chang et al., “Color-Texture Segmentation of Medical Images Based on Local Contrast Information,” IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology pp. 488-493 (2007), which is hereby incorporated by reference in its entirety. 
This contrast map, subjected to noise removal and edge enhancement to generate an Improved Contrast Map (ICMap), was the proposed solution to the problem of over-segmentation in the JSEG method. More recently, Gao et al., “A Novel Multiresolution Color Image Segmentation Technique and its Application to Dermatoscopic Image Segmentation,” in IEEE International Conference on Image Processing (2000), which is hereby incorporated by reference in its entirety, introduced a ‘narrow-band’ scheme for multiresolution processing of images by utilizing the MRF expectation-maximization principle in the L*u*v* space. This technique was found to be especially competent for segmenting dermatoscopic images. Lefevre et al., “Multiresolution Color Image Segmentation Applied to Background Extraction in Outdoor Images,” IS&T European Conference on Color in Graphics, Image and Vision, Poitiers, France, pp. 363-367 (2002), which is hereby incorporated by reference in its entirety, performed multiresolution image segmentation in the HSV space, applied to the problem of background extraction in outdoor images.
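The Euclidean color difference underlying several of the L*a*b*-based methods above is the CIE76 ΔE formula: the straight-line distance between two colors in L*a*b* coordinates. A minimal sketch (the function name is illustrative):

```python
import math

def delta_e76(lab1, lab2):
    """CIE76 Delta-E: Euclidean distance between two (L*, a*, b*) colors.
    In an approximately uniform space, equal distances correspond roughly
    to equal perceived color differences."""
    return math.sqrt(sum((c1 - c2) ** 2 for c1, c2 in zip(lab1, lab2)))

# Example: a mid gray versus the same lightness shifted along the a* (red) axis
print(delta_e76((50.0, 0.0, 0.0), (50.0, 10.0, 0.0)))  # → 10.0
```

It is this property of approximate perceptual uniformity that makes simple Euclidean thresholds on ΔE usable as homogeneity criteria in the segmentation methods cited above.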
Color gradient-based segmentation is a contemporary methodology in the segmentation realm. Dynamic color gradient thresholding (DCGT) was first seen in the work by Balasubramanian et al., “Unsupervised Color Image Segmentation By Dynamic Color Gradient Thresholding,” Proceedings of SPIE/IS&T: Electronic Imaging Symposium, San Jose, Calif. (2008), which is hereby incorporated by reference in its entirety. The DCGT technique was primarily used to guide the region growth procedure, placing emphasis on color-homogeneous and color-transition regions without generating edges. However, this method faced problems of over-segmentation due to the lack of a texture descriptor and proved to be computationally expensive. Garcia et al., “Automatic Color Image Segmentation By Dynamic Region Growth and Multimodal Merging of Color and Texture Information,” International Conference on Acoustics, Speech and Signal Processing, Las Vegas, Nev. (2008), which is hereby incorporated by reference in its entirety, proposed a segmentation method that was an enhanced version of the DCGT technique (abbreviated here as the Gradient Segmentation (GS) algorithm), incorporating an entropic texture descriptor and a multiresolution merging procedure. The method brought significant improvements in segmentation quality and computational cost, but was not fast enough for real-time practical applications.
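The dynamic-threshold idea behind these gradient-guided methods can be illustrated with a simplified sketch. This is not the published DCGT algorithm: it uses a scalar luminance gradient in place of a full color gradient, and the function names are illustrative. The point is only that region growth starts in flat (low-gradient) areas and is admitted into color-transition areas as the threshold is progressively raised:

```python
import numpy as np

def gradient_magnitude(img):
    """Per-pixel gradient magnitude of a luminance image using central
    differences (a stand-in for a full vector color gradient)."""
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gx, gy)

def dynamic_threshold_masks(grad, steps=4):
    """Yield masks admitting progressively higher-gradient pixels, so that
    growth proceeds from homogeneous regions toward transition regions."""
    levels = np.quantile(grad, np.linspace(0.25, 1.0, steps))
    for t in levels:
        yield grad <= t

img = np.zeros((8, 8))
img[:, 4:] = 100.0                       # two flat regions with one sharp edge
grad = gradient_magnitude(img)
first = next(dynamic_threshold_masks(grad))
print(int(first.sum()))                  # → 48 flat pixels admitted at the lowest threshold
```

In the cited methods the admitted pixels are additionally assigned to growing regions by color similarity, and a texture descriptor (in the GS algorithm) suppresses spurious splits inside textured areas.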
There remains a need for segmentation methods that efficiently facilitate: 1) selective access and manipulation of individual content in images based on the desired level of detail, 2) handling of subsampled versions of the input images with reasonable robustness to scalability, and 3) a good compromise between quality and speed, laying the foundation for fast and intelligent object/region-based real-world applications of color imagery.