This invention relates generally to data management and, more particularly, to retrieving specific portions of images using color correlograms.
With the rapid proliferation of the Internet and the World Wide Web, the amount of digital image data accessible to users has grown enormously. Image databases are becoming larger and more widespread, and there is a growing need for effective and efficient image retrieval systems. Image retrieval systems are systems that extract from a large collection of images ones that are "similar" to an image of interest to the user. Most existing image retrieval systems adopt the following two-step approach to search image databases: (i) indexing: for each image in the database, a feature vector capturing certain essential properties of the image is computed and stored in a featurebase, and (ii) searching: given a query image, its feature vector is computed, compared to the feature vectors in the featurebase, and images most similar to the query image are returned to the user.
For a retrieval system to be successful, the feature defined for an image should have certain desirable qualities: (i) the difference between pre-selected features of two images should be large if and only if the images are not "similar", (ii) the feature should be fast to compute, and (iii) the size of the feature should be small.
While most image retrieval systems retrieve images based on overall image comparison, users are typically interested in target searching such as in a database of images or in video browsing. In target searching, the user specifies a subregion (usually an interesting object) of an image as a query. For example, a user might wish to find pictures in which a given object appears, or scenes in a video with a given appearance of a person. In response to the user's query, the system should then retrieve images containing this subregion, or object, from the database. This task, called image subregion querying, is made challenging by the wide variety of effects, such as different viewing positions, camera noise and variation, and object occlusion, that cause the same object to have a different appearance in different images.
Color histograms are commonly used as feature vectors for image retrieval and for detecting cuts in video processing because histograms are efficient to compute and insensitive to camera motions. However, although the histogram is easy to compute and seemingly effective, it is not robust to local changes in images or to large appearance changes, so it is liable to cause false positive matches, especially where databases are large. Another disadvantage of the color histogram is its sensitivity to illumination changes. Recently, several approaches have attempted to improve upon the histogram by incorporating spatial information with color. Many of these methods are still unable to handle large changes in appearance. For instance, the color coherence vector (CCV) method uses image features, e.g., spatial coherence of colors and pixel position, to refine the histogram. These additional features improve performance, but also require increased storage and computation time.
The image subregion retrieval system should also be able to solve the location problem, i.e. the system should be able to find the location of the object in the image. The location problem arises in tasks such as real-time object tracking and video searching, where it is necessary to localize the position of an object in a sequence of frames.
Template matching is one approach used to solve the location problem. This method generally yields good results but is computationally expensive. A refined form of template matching is the histogram backprojection method. The method of histogram backprojection is to first compute a "goodness value" for each pixel in an image (the goodness of each pixel is the likelihood that this pixel is in the target) and then to obtain the subimage, and therefore the location, whose pixels have the highest goodness values. Histogram backprojection, however, gives the same goodness value to all pixels of the same color. The technique emphasizes colors that appear frequently in the image. This may result in overemphasizing certain colors in the query object Q. If the image has a subimage that has many pixels of a color c that is prominent in Q, then this method tends to identify Q with this subimage, even though the two objects may be unrelated, thus causing an error in some cases.
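The backprojection procedure described above can be illustrated with a minimal sketch. It assumes the commonly used ratio form of the goodness value, min(H_Q(c) / H_I(c), 1), which the text does not spell out, and it assumes images are given as lists of rows of quantized color indices. The function names `backproject` and `best_window` are hypothetical and chosen only for illustration.

```python
from collections import Counter

def backproject(query, image):
    """Assign each image pixel a goodness value that depends only on its color.

    Assumption: goodness(c) = min(H_Q(c) / H_I(c), 1), where H_Q and H_I are
    the color histograms of the query object and of the image.
    """
    hq = Counter(c for row in query for c in row)
    hi = Counter(c for row in image for c in row)
    ratio = {c: min(hq[c] / n, 1.0) for c, n in hi.items()}
    return [[ratio[c] for c in row] for row in image]

def best_window(goodness, win_h, win_w):
    """Return the top-left (row, col) of the window with the largest goodness sum."""
    h, w = len(goodness), len(goodness[0])
    best_score, best_pos = -1.0, None
    for y in range(h - win_h + 1):
        for x in range(w - win_w + 1):
            score = sum(goodness[y + dy][x + dx]
                        for dy in range(win_h) for dx in range(win_w))
            if score > best_score:
                best_score, best_pos = score, (y, x)
    return best_pos

query = [[1, 1],
         [1, 1]]
image = [[0, 0, 0, 0],
         [0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
goodness = backproject(query, image)
location = best_window(goodness, 2, 2)
```

Because every pixel of a given color receives the same goodness value, any region rich in a color that is prominent in the query scores well, which is the weakness noted above.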
Another task requiring object retrieval from images is cut detection in video processing. Cut detection is the process of segmenting a video into different camera shots, which allows the extraction of key frames for video parsing and querying.
A flexible tool for browsing video databases should also provide users with the capability to place object-level queries that have semantic content, such as "track this person in a sequence of video". To handle such queries, the system has to find which frames contain the specific object or person, and has to locate the object in those frames.
It remains desirable to have an efficient and accurate means of identifying and retrieving objects in images which allows for changes in the appearance of the image content such as viewing angle and magnification.
It is therefore an object of the present invention to provide a method and apparatus to perform efficient image comparisons in order to retrieve objects in images.
It is a further object of the present invention to provide a method and apparatus to perform image comparisons for image subregion querying which allow for significant changes in the image such as viewing position, background, and focus.
It is another object of the present invention to provide a method and apparatus which enables efficient image subregion retrieval from a database.
The objects set forth above as well as further and other objects and advantages of the present invention are achieved by the embodiments of the invention described hereinbelow.
The problems of image retrieval are solved by the present invention of providing and using a color correlogram to query objects in images. The color correlogram of the present invention is a three-dimensional representation indexed by color and distance between pixels which expresses how the spatial correlation of color changes with distance in a stored image. The color correlogram includes spatial correlation of colors, combines both the global and local distributions of colors, is easy to compute, and is small from a data storage perspective. The color correlogram is robust in tolerating large changes in the appearance of a scene caused by changes in viewing positions, changes in the background scene, partial occlusions, and magnification that causes radical changes in shape.
To create a color correlogram, the colors in the image are quantized into m color values, c1 ... cm. Also, the distance values k ∈ [d] to be used in the correlogram are determined, where [d] is the set of distances between pixels in the image, and where dmax is the maximum distance measurement between pixels in the image. Each entry in the color correlogram is the probability of finding a pixel of color cj at a selected distance k from a pixel of color ci.
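The construction described above can be sketched as follows. This is a minimal, unoptimized illustration rather than the claimed implementation. It assumes the image has already been quantized into color indices 0 to m-1, and it assumes pixel distance is measured as the chessboard distance max(|dx|, |dy|), a choice the text does not fix. The names `ring_offsets` and `color_correlogram` are hypothetical.

```python
def ring_offsets(k):
    """Offsets (dy, dx) at chessboard distance exactly k (assumed distance measure)."""
    return [(dy, dx)
            for dy in range(-k, k + 1)
            for dx in range(-k, k + 1)
            if max(abs(dy), abs(dx)) == k]

def color_correlogram(image, m, distances):
    """Return corr[k][ci][cj]: the estimated probability that a pixel at
    distance k from a pixel of color ci has color cj.

    image: list of rows of quantized color indices in range(m).
    """
    h, w = len(image), len(image[0])
    corr = {}
    for k in distances:
        counts = [[0] * m for _ in range(m)]
        offsets = ring_offsets(k)
        for y in range(h):
            for x in range(w):
                ci = image[y][x]
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:  # ignore pixels outside the image
                        counts[ci][image[ny][nx]] += 1
        table = []
        for row in counts:
            total = sum(row)
            table.append([c / total if total else 0.0 for c in row])
        corr[k] = table
    return corr

checkerboard = [[0, 1],
                [1, 0]]
corr = color_correlogram(checkerboard, m=2, distances=[1])
```

For m colors and d distance values the resulting table has m × m × d entries, and each row corr[k][ci] sums to 1 whenever color ci occurs in the image with at least one in-image neighbour at distance k.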
A color autocorrelogram, as provided in this invention, is a restricted version of the color correlogram that considers color pairs of the form (i,i) only.
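A minimal sketch of the autocorrelogram follows, under the same illustrative assumptions as before: colors are pre-quantized indices 0 to m-1, and distance is taken to be the chessboard distance max(|dx|, |dy|), which the text does not specify. The function name `color_autocorrelogram` is hypothetical.

```python
def color_autocorrelogram(image, m, distances):
    """Return auto[k][c]: the estimated probability that a pixel at distance k
    from a pixel of color c also has color c.

    image: list of rows of quantized color indices in range(m).
    """
    h, w = len(image), len(image[0])
    auto = {}
    for k in distances:
        same = [0] * m    # neighbours at distance k sharing the pixel's color
        total = [0] * m   # all in-image neighbours at distance k
        for y in range(h):
            for x in range(w):
                c = image[y][x]
                for dy in range(-k, k + 1):
                    for dx in range(-k, k + 1):
                        if max(abs(dy), abs(dx)) != k:
                            continue  # keep only offsets at distance exactly k
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w:
                            total[c] += 1
                            if image[ny][nx] == c:
                                same[c] += 1
        auto[k] = [s / t if t else 0.0 for s, t in zip(same, total)]
    return auto

checkerboard = [[0, 1],
                [1, 0]]
auto = color_autocorrelogram(checkerboard, m=2, distances=[1])
```

Restricting to identical color pairs reduces the feature from m × m × d entries to m × d entries, where d is the number of distance values used.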
The color correlogram may be used to query objects in images as well as entire images stored in a database. Extensions to the color correlogram may also be used in object retrieval tasks. The general themes behind the extensions are improving the storage efficiency of the correlogram without compromising its image discrimination capability, and using additional information (such as an edge) to further refine the correlogram, which improves image retrieval performance.
The correlogram intersection is used for image subregion querying. Using the correlogram intersection, the relative counts of color pairs in the images being compared are determined. The comparison easily eliminates the images which do not match.
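One way to picture the correlogram intersection is by analogy with histogram intersection: count the color pairs at each distance in the query and in the candidate image, take the smaller count for each pair, and normalize by the query's total. The sketch below makes that assumption explicit; the exact definition used by the invention is given in the detailed description, not here. It also assumes quantized color indices and the chessboard distance, and the names `pair_counts` and `correlogram_intersection` are hypothetical.

```python
from collections import Counter

def pair_counts(image, distances):
    """Count ordered color pairs (ci, cj) at chessboard distance k, keyed by (ci, cj, k)."""
    h, w = len(image), len(image[0])
    counts = Counter()
    for k in distances:
        for y in range(h):
            for x in range(w):
                for dy in range(-k, k + 1):
                    for dx in range(-k, k + 1):
                        if max(abs(dy), abs(dx)) != k:
                            continue
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w:
                            counts[(image[y][x], image[ny][nx], k)] += 1
    return counts

def correlogram_intersection(query, image, distances):
    """Fraction of the query's color-pair counts that are also present in the image.

    Assumed form: sum of min(count_Q, count_I) over all pairs, divided by the
    sum of count_Q. A score near 1 suggests the query may appear in the image.
    """
    q = pair_counts(query, distances)
    i = pair_counts(image, distances)
    total = sum(q.values())
    if total == 0:
        return 0.0
    return sum(min(n, i[key]) for key, n in q.items()) / total

query = [[0, 1]]
containing = [[0, 1, 2],
              [2, 2, 2]]
unrelated = [[2, 2],
             [2, 2]]
score_match = correlogram_intersection(query, containing, [1])
score_miss = correlogram_intersection(query, unrelated, [1])
```

Images that share few color pairs with the query receive a low score and can be discarded without further processing.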
The correlogram may also be used in locating objects in images. The location problem arises in tasks such as real-time object tracking or video searching, where it is necessary to localize the position of an object in a sequence of frames. Efficiency is also required in location because large amounts of data must be processed.
Any norm for comparing vectors, for example the standard L1 norm, may be used to compare color correlograms/color autocorrelograms.
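As an illustration, the sketch below compares two features with the L1 norm and ranks a small featurebase by distance to a query. It assumes each correlogram is stored as a nested table corr[k][ci][cj] that is flattened into a vector before comparison; that layout and the function names `flatten`, `l1_distance`, and `rank_by_l1` are hypothetical.

```python
def flatten(correlogram):
    """Flatten corr[k][ci][cj] into a list, visiting distances in sorted order."""
    return [p for k in sorted(correlogram)
            for row in correlogram[k]
            for p in row]

def l1_distance(a, b):
    """Sum of absolute differences between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))

def rank_by_l1(query_feature, featurebase):
    """Return image names ordered from most to least similar to the query.

    featurebase: dict mapping an image name to its feature vector.
    """
    return sorted(featurebase,
                  key=lambda name: l1_distance(query_feature, featurebase[name]))

featurebase = {"a": [0.0, 1.0], "b": [0.9, 0.1]}
ranking = rank_by_l1([1.0, 0.0], featurebase)
```

The same functions apply unchanged to autocorrelograms, since only the length of the flattened vector differs.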
Experimental evidence shows that the color correlogram outperforms not only color histograms but also more recent histogram refinements such as the color coherence vector method for image indexing and retrieval.
The present invention together with the above and other advantages may best be understood from the following detailed description of the embodiments of the invention illustrated in the drawings, wherein: