The following relates to the information processing arts. The following is described with illustrative reference to image retrieval and categorization applications, but will be useful in numerous other applications entailing comparison, retrieval, categorization, or the like of objects such as images, video content, audio content, and so forth.
In data processing, some useful operations include automated or semi-automated object categorization, and automated or semi-automated retrieval of similar objects. For example, given an unorganized database of images, it may be useful to sort or categorize the images into classes such as images of people, images of animals, images of plants, landscape images, and so forth. In a related application, given a particular image it may be useful to identify and retrieve similar images from the database of images.
It is known to model images using parameterized models. A Gaussian model, for example, characterizes an image using a single Gaussian distribution representative of low level image features and having a mean vector and covariance matrix parameters. Advantageously, characterizing the image by a single Gaussian component provides for straightforward comparison of different images, for example by comparing the mean vectors and covariance matrices of the two image models. However, a distribution having a single Gaussian component contains limited descriptive content and may be insufficient to adequately describe some (or even most) images.
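As an illustration of the single Gaussian approach, the following minimal sketch fits a mean vector and covariance matrix to the low level feature vectors of an image and compares two such models. The function names, the use of NumPy, and the particular distance (Euclidean distance between means plus Frobenius distance between covariances) are assumptions chosen for illustration; the text above does not prescribe a specific comparison measure.

```python
import numpy as np

def fit_single_gaussian(features):
    """Fit one Gaussian to an (n_vectors, n_dims) array of feature vectors.

    Returns the mean vector and the sample covariance matrix.
    """
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    cov = np.cov(features, rowvar=False)  # rows are observations
    return mean, cov

def gaussian_model_distance(model_a, model_b):
    """Illustrative (assumed) distance between two single Gaussian models.

    Sums the Euclidean distance between the mean vectors and the
    Frobenius distance between the covariance matrices.
    """
    mean_a, cov_a = model_a
    mean_b, cov_b = model_b
    return float(np.linalg.norm(mean_a - mean_b) + np.linalg.norm(cov_a - cov_b))
```

Because each image is summarized by one mean vector and one covariance matrix, a comparison reduces to a handful of array operations, which is the simplicity noted above.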
The Gaussian model is an example of a single component model. Limitation to a single component raises substantial concerns about descriptive adequacy. In other approaches, a mixture model is employed to characterize an image. For example, a Gaussian mixture model (GMM) describes the distribution of low level features of an image using a weighted combination of Gaussian components, each having its own mean vector and covariance matrix. Advantageously, a GMM or other mixture model provides a larger number of components by which to characterize the image.
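For concreteness, the weighted combination can be written in the standard GMM form. The symbols used here (x for a low level feature vector, N for the number of components, w_i, μ_i, and Σ_i for the weight, mean vector, and covariance matrix of component i) are notation assumed for illustration and do not appear in the text above:

\[
p(x) = \sum_{i=1}^{N} w_i \, \mathcal{N}(x;\, \mu_i, \Sigma_i),
\qquad w_i \ge 0,
\qquad \sum_{i=1}^{N} w_i = 1 .
\]

The single Gaussian model corresponds to the special case N = 1.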
However, use of mixture models introduces other difficulties. The higher dimensionality of the mixture model can make it difficult to robustly determine the model parameters for a given image. For example, a few hundred low level feature vectors may be extracted from a given image. Such a sparse data set may be insufficient to accurately estimate the relatively large number of mixture parameters. As a result, one may obtain the same quality of model “fit” for very different sets of mixture model parameters, making it difficult to ascertain the “correct” mixture model parameters.
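The mismatch between parameter count and available data can be made concrete with a rough count of free parameters. The sketch below is illustrative only: the function name is hypothetical, and the example values (128 components, 64-dimensional feature vectors) are assumptions, since the text above does not state a feature dimensionality.

```python
def gmm_parameter_count(n_components, n_dims, diagonal_covariance=False):
    """Count the free parameters of a Gaussian mixture model.

    Weights contribute n_components - 1 (they sum to one), means contribute
    n_components * n_dims, and each covariance contributes n_dims entries if
    diagonal or n_dims * (n_dims + 1) / 2 entries if full (symmetric).
    """
    weights = n_components - 1
    means = n_components * n_dims
    if diagonal_covariance:
        covariances = n_components * n_dims
    else:
        covariances = n_components * n_dims * (n_dims + 1) // 2
    return weights + means + covariances

# Assumed example values: 128 components, 64-dimensional features.
print(gmm_parameter_count(128, 64, diagonal_covariance=True))  # 16511
print(gmm_parameter_count(128, 64))                            # 274559
```

Under these assumed values, even the diagonal covariance case has far more free parameters than the few hundred feature vectors typically extracted from one image can reliably constrain.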
Uncertainty in the determination of the model parameters carries over into the image comparison stage. For example, two images that are in reality quite similar may nonetheless be fitted with very different sets of mixture model parameters, due to the sparseness of the feature vector sets extracted from the images. In such a case, the computed distance between the mixture models for the two images will be large, and the images will erroneously be deemed to be quite different.
In addition to this robustness problem, the use of mixture models can make image comparison computationally intensive. For example, some studies have estimated that a GMM having about 128 Gaussian components is preferable to sufficiently characterize an image. A comparison of two images will entail comparing each Gaussian component of one model with each Gaussian component of the other, leading to about 16,000 Gaussian comparison operations (128 × 128 = 16,384) for GMMs having 128 Gaussian components each. Further, these approximately 16,000 Gaussian comparison operations are repeated for each pairwise image comparison involved in the classification or similar image retrieval task. Such extensive computations can be problematic or even prohibitive when operating on a large image database.
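The arithmetic behind this cost estimate can be sketched as follows. The function names and the database size of 10,000 images are hypothetical values chosen for illustration; only the figure of 128 components comes from the text above.

```python
def component_comparisons(n_components_a, n_components_b):
    """Gaussian-to-Gaussian comparisons needed to compare two mixture models
    when every component of one is compared with every component of the other."""
    return n_components_a * n_components_b

def retrieval_comparisons(n_database_images, n_components):
    """Gaussian comparisons needed to compare one query image against every
    image in a database, assuming all models have n_components components."""
    return n_database_images * component_comparisons(n_components, n_components)

print(component_comparisons(128, 128))     # 16384
print(retrieval_comparisons(10_000, 128))  # 163840000
```

Under these assumptions, a single query against a 10,000-image database already requires on the order of 10^8 Gaussian comparison operations.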