Content-based retrieval techniques for multimedia became important after international coding standards such as JPEG, MPEG-1, MPEG-2, and MPEG-4 were finalized and came into wide use over the Internet. The international standard for multimedia (MM) content description, MPEG-7, has been proposed to provide normative numerical descriptors as the matching criteria for similarity measurement in search engines. To describe image/video content, MPEG-7 defines statistics of color, shape information, and motion behavior. In general, a search for MM content is guided by the visual information the user wishes to retrieve. For visual descriptors to better reflect what a human viewer recognizes, background information in multimedia content should be separated from the image object. Several methods have been proposed to identify the background region in MM content.
In the disclosure, “Background removal in image indexing and retrieval”, International Conference Image Analysis and Processing, 1999, Lu and Hong Guo utilize a fuzzy clustering technique for color image segmentation. For each segmented region, its size and its adjacency to the image border are used to determine whether it belongs to the background. The regions categorized as background are then removed before feature extraction.
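The size-and-border test described above can be illustrated with a minimal sketch. It is not the cited authors' implementation: the function name `background_regions`, the two thresholds, and the rule that both criteria must hold are assumptions made only for illustration, and the label map is taken as given from any prior segmentation step.

```python
from collections import Counter

def background_regions(labels, min_area_frac=0.2, min_border_frac=0.3):
    """Return the set of region labels judged to be background.

    labels: 2-D list of integer region labels from any segmentation step.
    The thresholds are hypothetical values, not taken from the cited paper.
    """
    h, w = len(labels), len(labels[0])
    # Region size: number of pixels carrying each label.
    area = Counter(v for row in labels for v in row)
    # Border adjacency: number of image-border pixels carrying each label.
    border = Counter()
    for y in range(h):
        for x in range(w):
            if y in (0, h - 1) or x in (0, w - 1):
                border[labels[y][x]] += 1
    n_border = sum(border.values())
    # Assumed rule: a region is background if it is both large and
    # occupies a large share of the image border.
    return {
        r for r in area
        if area[r] / (h * w) >= min_area_frac
        and border[r] / n_border >= min_border_frac
    }

# Toy label map: region 0 surrounds region 1.
labels = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(background_regions(labels))  # {0}
```

In the toy map, region 1 is large enough to pass the size test but touches no border pixel, so only region 0 is flagged as background.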
In the disclosure, “Image segmentation using hierarchical meshes”, IEEE International Conference Image Processing, pp. 6–10, 1999, D. K. Lim and Y. S. Ho first use hierarchical meshes to locate the object boundary in an image, and then perform region growing based on the detected boundary points to yield the final image object. In the disclosure, “A hierarchical approach to color image segmentation using homogeneity”, IEEE Trans. Image Processing, pp. 2071–2082, vol. 9, no. 12, 2000, H. D. Cheng and Y. Sun likewise use a hierarchical histogram to locate uniform regions, which are then merged to generate the final segmented image. For natural image segmentation, a reduced set of regions is identified by a K-means method based on local image statistics, i.e., mean and variance. The K-means method is disclosed in “Segmentation approach using local image statistics”, Electronics Letters, vol. 36, no. 14, pp. 1199–1201, 2000.
Color clustering is another approach for effective background removal and for facilitating image retrieval. Related art is disclosed in P. Felzenszwalb, D. Huttenlocher, “Image segmentation using local variation”, Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 98–104, 1998; D. Comaniciu et al, “Mean shift analysis and applications”, IEEE Conference Computer Vision, pp. 1197–1203, 1999; and Y. Deng et al, “Color image segmentation”, IEEE Conference on Computer Vision and Pattern Recognition, pp. 446–451, 1999. The latter two methods cannot retrieve the visual object from the visual content.
The segmentation methods described above fall into two categories: clustering in color space based on the histogram, and clustering in Euclidean space based on homogeneity. The histogram-based approach can cluster far-flung image pixels into the same region if they are close enough in color space. Because this kind of clustering uses no spatial distance, it may leave noise in the segmented regions. If spatial distance is taken into account together with gray-level homogeneity, the sparse noise in segmented regions can be reduced. The two approaches are therefore complementary.
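The difference between the two categories can be seen in a small sketch. It is an illustration under stated assumptions, not any of the cited methods: a fixed gray-level threshold stands in for histogram-based clustering, and 4-connected component labeling stands in for clustering that takes spatial distance into account. The function names and the threshold value are hypothetical.

```python
from collections import deque

def color_clusters(img, threshold=128):
    """Cluster by gray level only: 1 for bright pixels, 0 for dark ones."""
    return [[1 if v >= threshold else 0 for v in row] for row in img]

def connected_components(mask):
    """Label 4-connected groups of equal mask value.

    Returns (component map, list of component sizes indexed by component id).
    """
    h, w = len(mask), len(mask[0])
    comp = [[-1] * w for _ in range(h)]
    sizes = []
    for sy in range(h):
        for sx in range(w):
            if comp[sy][sx] != -1:
                continue  # pixel already belongs to a component
            cid = len(sizes)
            comp[sy][sx] = cid
            queue = deque([(sy, sx)])
            size = 0
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and comp[ny][nx] == -1
                            and mask[ny][nx] == mask[sy][sx]):
                        comp[ny][nx] = cid
                        queue.append((ny, nx))
            sizes.append(size)
    return comp, sizes

# A bright 2x2 object plus one isolated bright pixel far away (noise).
img = [
    [200, 200, 10, 10, 10],
    [200, 200, 10, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 210],
]
mask = color_clusters(img)
comp, sizes = connected_components(mask)
print(mask[0][0] == mask[3][4])  # True: same cluster in color space
print(comp[0][0] == comp[3][4])  # False: separate regions spatially
print(sizes)                     # [4, 15, 1]
```

Gray-level clustering alone places the isolated pixel in the same cluster as the object, whereas the spatial step isolates it as a component of size 1 that a later stage could discard as noise.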
Most of the image processing methods above divide a two-dimensional image into blocks, each consisting of pixels that are treated together as a basic unit for processing. Statistics of the pixels in a block, such as the mean and variance, are usually computed for either clustering or division. Each method is designed for its specific application. The contents of a visual database, however, are diverse, and hence a more comprehensive approach is needed.
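As a rough illustration of the block statistics mentioned above, the following sketch computes the mean and population variance of each non-overlapping block of a gray-level image. The function name `block_statistics`, the block size, and the choice of non-overlapping blocks are assumptions for illustration; the cited methods may partition the image differently.

```python
def block_statistics(img, block=2):
    """Return a list of (row, col, mean, variance) for each block.

    img: 2-D list of gray levels. Blocks are non-overlapping squares of
    side `block`; blocks at the right and bottom edges may be smaller.
    Variance is the population variance of the pixels in the block.
    """
    h, w = len(img), len(img[0])
    stats = []
    for by in range(0, h, block):
        for bx in range(0, w, block):
            pixels = [
                img[y][x]
                for y in range(by, min(by + block, h))
                for x in range(bx, min(bx + block, w))
            ]
            mean = sum(pixels) / len(pixels)
            var = sum((p - mean) ** 2 for p in pixels) / len(pixels)
            stats.append((by, bx, mean, var))
    return stats

# Left block is uniform; right block alternates between 0 and 100.
img = [
    [10, 10, 0, 100],
    [10, 10, 100, 0],
]
for row, col, mean, var in block_statistics(img):
    print(row, col, mean, var)
# 0 0 10.0 0.0
# 0 2 50.0 2500.0
```

A uniform block yields zero variance and a textured block a large one, which is why such statistics can drive either the grouping of similar blocks or the further division of inhomogeneous ones.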