1. Field of the Invention
This invention relates to a method and apparatus for filtering, clustering, and region fitting of images by mean shift using kernel function values.
2. Description of the Related Art
One known clustering method for dividing samples scattered in an object space, such as an image space, into clusters each having a uniform feature quantity is mean shift, which is disclosed in D. Comaniciu, et al., “Mean Shift: A Robust Approach Toward Feature Space Analysis,” IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 24, no. 5, May 2002. Mean shift is the approach of replacing the feature quantity of a target sample with the average of the feature quantities of the individual samples in the kernel. In this document, a fixed kernel is used, such as a circle of a specific radius centered on the target sample, or a spherical or oval kernel. Mean shift is realized by estimating a probability density function for a plurality of samples scattered in an object space (which is generally referred to as kernel density estimation) and finding a local maximum position of the estimated probability density function by following its gradient.
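The replacement of a target sample by the average of the samples in the kernel can be sketched as follows. This is a minimal illustration only, assuming a flat (uniform-weight) circular kernel of fixed radius; the function name `mean_shift_point` and the parameters `max_iter` and `tol` are hypothetical and do not come from the cited document.

```python
import math

def mean_shift_point(target, samples, radius, max_iter=100, tol=1e-6):
    """Repeatedly move `target` to the mean of the samples inside a fixed
    circular kernel centered on it, until the move becomes negligible.
    Assumption: every sample in the kernel has the same weight."""
    point = list(target)
    for _ in range(max_iter):
        # Samples that fall inside the kernel centered on the current point.
        inside = [s for s in samples if math.dist(s, point) <= radius]
        if not inside:
            break  # no sample in the kernel, so there is nothing to average
        # Coordinate-wise mean of the samples in the kernel.
        mean = [sum(coords) / len(inside) for coords in zip(*inside)]
        shift = math.dist(mean, point)
        point = mean
        if shift < tol:
            break  # converged: the point no longer moves
    return point

# Four samples close together and one far away.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (10.0, 10.0)]
mode = mean_shift_point((0.0, 0.0), samples, radius=2.0)
```

With a weighted kernel, the plain average would become a weighted average; the flat kernel is chosen here only to keep the sketch short.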
Kernel density estimation has been described in, for example, B. W. Silverman, “Density Estimation for Statistics and Data Analysis,” Chapman & Hall/CRC, ISBN 0-412-24620-1, 1986, pp. 13-15 and pp. 75-76. An example of the kernel function has been shown on p. 43 of the document. A kernel function and a profile act as weighting values for each sample in kernel density estimation or mean shift.
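As an illustration of how a kernel function weights each sample, a one-dimensional kernel density estimate can be sketched as below. The Epanechnikov kernel is used here as an assumption, since the text does not state which kernel is intended; the names `epanechnikov`, `kernel_density`, and `bandwidth` are hypothetical.

```python
def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) for |u| <= 1, and 0 elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def kernel_density(x, samples, bandwidth):
    """Estimate the density at x as the sum of kernel weights of all samples,
    scaled by the sample count and the bandwidth."""
    total = sum(epanechnikov((x - s) / bandwidth) for s in samples)
    return total / (len(samples) * bandwidth)

# Samples near x contribute large weights; samples farther away than the
# bandwidth contribute nothing.
density = kernel_density(0.5, [0.0, 1.0], bandwidth=1.0)
```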
Clustering can be realized by allocating a cluster to each local maximum position of the probability density function obtained by kernel density estimation. Specifically, mean shift is repeated using each sample in an object space as a target sample, thereby causing the feature quantity of each sample in a cluster to converge to a representative feature quantity of the cluster. After the convergence, a cluster is allocated to each group of samples that share a single feature quantity. Such clustering is possible for any object space in which a suitable distance, or a parameter conforming to the distance, has been defined. When the object space is an image space, the operation of applying clustering to positions and feature quantities (e.g., luminance vectors or color signal value vectors, such as RGB) is also referred to as region segmentation.
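The clustering procedure described above, in which mean shift is repeated for every sample and a cluster is then allocated to each convergent point, might be sketched as follows. This is a simplified illustration assuming a flat circular kernel and a Euclidean distance; the names `converge`, `mean_shift_clusters`, and `merge_tol` are hypothetical.

```python
import math

def converge(target, samples, radius, max_iter=100, tol=1e-6):
    """Run mean shift from `target` with a flat kernel of fixed radius."""
    point = list(target)
    for _ in range(max_iter):
        inside = [s for s in samples if math.dist(s, point) <= radius]
        if not inside:
            break
        mean = [sum(coords) / len(inside) for coords in zip(*inside)]
        shift = math.dist(mean, point)
        point = mean
        if shift < tol:
            break
    return point

def mean_shift_clusters(samples, radius, merge_tol=1e-3):
    """Use every sample as a target sample, then give the same cluster label
    to samples whose convergent points coincide within `merge_tol`."""
    modes, labels = [], []
    for s in samples:
        p = converge(s, samples, radius)
        for k, m in enumerate(modes):
            if math.dist(p, m) <= merge_tol:
                labels.append(k)  # reuse the cluster of an existing mode
                break
        else:
            modes.append(p)  # a new convergent point starts a new cluster
            labels.append(len(modes) - 1)
    return labels, modes

# Two well-separated sample sets in a two-dimensional object space.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
           (10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0)]
labels, modes = mean_shift_clusters(samples, radius=2.0)
```

For an image space, each sample would instead be a vector of pixel position and feature quantity, for example (x, y, luminance).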
In mean shift, a target sample is moved to the mean position of the feature quantities of the samples in the kernel. If the samples in the object space are distributed uniformly, no such movement takes place. When the object space is a three-dimensional image space defined by two-dimensional positions (x-coordinate and y-coordinate) and a one-dimensional feature quantity (luminance), clustering is possible by mean shift, provided that each pixel is regarded as a sample in the image space. Suppose, however, that the image contains a single region composed of a plurality of partial regions, each large enough for fixed kernels to fit inside it. In mean shift using a fixed kernel, a convergent point is then created for each partial region. As a result, a plurality of clusters are allocated to a single region (which is referred to as excessive segmentation). This goes against the purpose of clustering, which is to allocate one cluster to a region that can be regarded as a single region. If the radius of the kernel is made larger, a plurality of clusters are less liable to be allocated to a single region.
Consider a case where there are a plurality of sample sets in the object space and clusters are expected to be allocated to the individual sample sets. In this case, if the radius of the kernel is sufficiently small, the samples in each sample set converge to one place as a result of mean shift and a cluster is allocated to each set. However, when the radius of the kernel is too large, all of the samples converge to one place by mean shift. As a result, a plurality of clusters that should be kept separate are merged into one (which is referred to as excessive integration).
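The dependence of the result on the kernel radius can be illustrated with a small one-dimensional sketch. The data, the radii, and the names `converge_1d` and `count_clusters` are hypothetical choices made for this illustration, and a flat kernel is assumed.

```python
def converge_1d(x, samples, radius, max_iter=100, tol=1e-9):
    """One-dimensional mean shift with a flat kernel of fixed radius."""
    for _ in range(max_iter):
        inside = [s for s in samples if abs(s - x) <= radius]
        mean = sum(inside) / len(inside)
        if abs(mean - x) < tol:
            return mean  # converged: the point no longer moves
        x = mean
    return x

def count_clusters(samples, radius, merge_tol=1e-3):
    """Count the distinct convergent points reached from all samples."""
    modes = []
    for s in samples:
        p = converge_1d(s, samples, radius)
        if all(abs(p - m) > merge_tol for m in modes):
            modes.append(p)
    return len(modes)

# Two sample sets, each uniformly spaced, separated by a gap.
samples = [0.0, 1.0, 2.0, 3.0, 4.0, 10.0, 11.0, 12.0, 13.0, 14.0]

small = count_clusters(samples, radius=1.2)    # many clusters per set
medium = count_clusters(samples, radius=3.5)   # one cluster per set
large = count_clusters(samples, radius=20.0)   # both sets merged into one
```

In this example the small radius leaves interior samples of each uniformly spaced set where they are, which corresponds to excessive segmentation, while the large radius pulls every sample to the overall mean, which corresponds to excessive integration.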
More specifically, consider a case where, for example, a pattern with the correct outline is prepared for image retrieval and a region boundary which coincides with the edge pattern is searched for by generalized Hough transform (refer to, for example, D. H. Ballard, “Generalizing the Hough Transform to Detect Arbitrary Shapes,” Pattern Recognition, 1981, vol. 13, no. 2, pp. 111-122). In this case, the following problem arises: if region boundaries are lost (excessive integration), the detection of the object fails, and if there is an unnecessary region boundary (excessive segmentation), something that is not the object is detected erroneously.