The present invention relates to systems and methods for processing images. More specifically, the present invention relates to systems and methods for effecting automatic image retrieval.
1. Description of the Related Art
Image-based document retrieval is required for a variety of consumer, commercial and government applications. Originally, images were retrieved manually. However, as image databases became larger, automated image retrieval systems were developed to accelerate the search and retrieval process.
One conventional automated approach involves the association of certain keywords with each image in a database. Images are then retrieved by a keyword search. However, this system suffers from the time-intensive process of keyword input for large databases. In addition, the approach is highly dependent on the somewhat subjective manual assignment of keywords, both for each image and for the search itself. Finally, there is a limit to how adequately an image can be described in keywords to allow for effective searching.
Another approach is that of automatic CBIR (content-based image retrieval). This system involves an analysis of each stored image with respect to its content (in terms of color, texture, shape, etc.). For example, the color content is stored in a histogram. In the search and retrieval process, the histogram from a query image is compared to the stored histogram data to find a best match. However, this system does not take into account spatial distribution of the color data.
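The histogram comparison described above, and its blindness to spatial layout, can be illustrated with a minimal sketch. The function names, the coarse four-bins-per-channel quantization, and the choice of histogram intersection as the match score are assumptions made for illustration; the text does not specify which comparison a given CBIR system uses.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Return a normalized coarse RGB histogram.

    pixels: iterable of (r, g, b) tuples with channel values in 0..255.
    bins_per_channel is an illustrative assumption, not a value from the text.
    """
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bin_: n / total for bin_, n in counts.items()}


def histogram_intersection(h1, h2):
    """Sum of bin-wise minima; 1.0 means identical normalized histograms."""
    return sum(min(v, h2.get(k, 0.0)) for k, v in h1.items())


red, blue = (255, 0, 0), (0, 0, 255)
left_right = [red, red, blue, blue]    # red block followed by blue block
interleaved = [red, blue, red, blue]   # same colors, different arrangement

same_score = histogram_intersection(
    color_histogram(left_right), color_histogram(interleaved)
)
```

Both pixel lists contain the same proportions of red and blue, so their histograms coincide and the score is the maximum possible even though the arrangements differ. This is the limitation noted above: the histogram records how much of each color is present, not where it is.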
The most often used approach to searching a database to select/retrieve images that are similar to a query is to compare the query image with the images in the database using their feature-based representation by means of distance functions. (See U.S. Pat. No. 5,579,471, entitled "Image Query System and Method," issued Nov. 26, 1996 to R. J. Barber et al.; U.S. Pat. No. 5,852,823, entitled "Automatic Image Classification and Retrieval System From Database Using Query-By-Example Paradigm," issued Dec. 22, 1998 to J. S. De Bonet; "Color Indexing," published in Intl. Journal of Computer Vision, by M. J. Swain and D. H. Ballard, Vol. 7, No. 1, 1991, pp. 11-32; and "Comparing Images Using Color Coherence Vectors," published by G. Pass, et al., in Proceedings ACM Multimedia Conf., (1996).)
These techniques represent an image in terms of its depictive features, such as color or texture. Given a query image Q, its feature-based representation is compared against the representation of every image I in the database to compute the similarity of Q and I. The images in the database are then ranked in decreasing order of their similarity with respect to the query image to form the response to the query. A key shortcoming of these techniques is that no distinction is made between perceptually significant and insignificant image features in the image representation and matching schemes.
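The query-by-example ranking described above can be sketched as follows. This is a minimal illustration under stated assumptions: feature vectors are plain lists of numbers, Euclidean distance stands in for whatever distance function a particular system uses, and the names `euclidean_distance` and `rank_by_similarity` are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Distance between two equal-length feature vectors (illustrative choice)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_similarity(query_features, database):
    """Return image ids ordered from most to least similar to the query.

    database: dict mapping image id -> feature vector.
    A smaller distance is treated as a higher similarity.
    """
    return sorted(
        database,
        key=lambda image_id: euclidean_distance(query_features, database[image_id]),
    )


database = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [0.9, 0.1],
}
ranking = rank_by_similarity([1.0, 0.0], database)
```

Every component of the feature vector contributes equally to the distance in this sketch, which mirrors the shortcoming noted above: nothing distinguishes perceptually significant features from insignificant ones.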
In general, a human observer determines the content-based similarity of two images primarily on the basis of the perceptually significant contents of the images and not the finer details. By mimicking this behavior, a similarity retrieval system might produce results that are in closer agreement with human interpretation of similarity. However, this fact has not been exploited by any of the above-mentioned techniques.
In a copending U.S. Patent Application entitled "Perceptually Significant Feature-based Image Archival and Retrieval," U.S. Ser. No. filed 14 Apr. 1999 by Wei Zhu and Rajiv Mehrotra, the teachings of which are incorporated herein by reference, Zhu et al. attempt to overcome the above-mentioned shortcoming by representing an image in terms of its perceptually significant features. Thus, similarity of two images becomes a function of the similarity of their perceptually significant features.
However, in this approach, image features are extracted from the properties of the entire image. There is no flexibility in computing image features or comparing image similarities based on main subject or background regions. As a result, more targeted searches, such as finding images whose main subjects are similar to that of the query but whose backgrounds are dissimilar, cannot be performed.
Recently, U.S. Pat. No. 6,038,365, entitled "Image Retrieval-Oriented Processing Apparatus Which Generates and Displays Search Image Data That Is Used As Index," was issued to T. Yamagami on Mar. 14, 2000. An image processing apparatus according to this invention includes a designating unit for designating an image area to be used as a retrieval image from a recorded image recorded in a recording medium, a storing unit for storing image area data representing the image area designated by the designating unit in connection with the corresponding recorded image, and a displaying unit for displaying, as the retrieval image, an image of the image area on the basis of the corresponding image area data stored in the storing unit.
Further, an image processing apparatus according to Yamagami's invention includes a designating unit for designating an image area from an original image constituting a screen as a retrieval image, a storing unit for storing the retrieval image designated by the designating unit in connection with the corresponding original image, a displaying unit for displaying the retrieval image designated by the designating unit, an instructing unit for instructing the retrieval image displayed by the displaying unit, and a display control unit for displaying, on the displaying unit, the original image corresponding to the retrieval image instructed by the instructing unit.
Hence, Yamagami appears to disclose use of a selected area of an image for image retrieval. However, the selection is done manually using a designating unit. Further, the use of the selected area is motivated by an image reduction problem that makes characters too small to read. Since image data can generally be recognized only when a human being looks at them, when image data are reproduced, a list of a plurality of reduced images may generally be displayed so that the user can check the contents of image files, using the reduced images themselves as the retrieval images. However, in retrieval display of reduced images, since an entire image is simply reduced to, for example, one eighth in both its longitudinal and lateral dimensions, the reduced image may be too small to be recognized easily, making the use of that reduced image as a retrieval image impossible.
Consequently, Yamagami does not teach an automatic, general-purpose image retrieval apparatus. Nor is Yamagami's invention built upon an automatic scene-content analysis scheme. Accordingly, a need remains in the art for a more accurate system or method for automatically retrieving images from a database.
The need in the art is addressed by the system and method for determining image similarity of the present invention. The inventive method includes the steps of automatically providing perceptually significant features of main subject or background of a first image; automatically providing perceptually significant features of main subject or background of a second image; automatically comparing the perceptually significant features of the main subject or the background of the first image to the main subject or the background of the second image; and providing an output in response thereto.
In the illustrative implementation, the features are provided by a number of belief levels, where the number of belief levels is preferably greater than two. In the illustrative embodiment, the step of automatically providing perceptually significant features of the main subject or background of the first image includes the steps of automatically identifying the main subject or background of the first image and identifying perceptually significant features of the main subject or the background of the first image. Further, the step of automatically providing perceptually significant features of the main subject or background of the second image includes the steps of automatically identifying the main subject or background of the second image and identifying perceptually significant features of the main subject or the background of the second image.
The perceptually significant features may include color, texture and/or shape. In the preferred embodiment, the main subject is indicated by a continuously valued belief map. The belief values of the main subject are determined by segmenting the image into regions of homogeneous color and texture, computing at least one structure feature and at least one semantic feature for each region, and computing a belief value for all the pixels in the region using a Bayes net to combine the features.
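The region-to-belief step described above can be sketched in simplified form. The structure of the Bayes net is not given in this section, so the sketch below substitutes a naive Bayes combination (a Bayes net in which the features are assumed conditionally independent given the main-subject label). The function names, the use of per-feature likelihood ratios as inputs, and the prior of 0.5 are all assumptions made for illustration.

```python
def region_belief(likelihood_ratios, prior=0.5):
    """Combine per-feature evidence into a belief that a region is main subject.

    likelihood_ratios: for each feature f of the region, the ratio
        P(f | main subject) / P(f | not main subject).
    prior: assumed prior probability of main subject (hypothetical value).
    Uses naive Bayes independence, a stand-in for the Bayes net in the text.
    """
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1.0 + odds)


def belief_map(region_labels, region_ratios, prior=0.5):
    """Assign each pixel the belief value of the region it belongs to.

    region_labels: 2-D list of region ids from a prior segmentation step.
    region_ratios: dict mapping region id -> list of likelihood ratios.
    """
    beliefs = {
        region: region_belief(ratios, prior)
        for region, ratios in region_ratios.items()
    }
    return [[beliefs[label] for label in row] for row in region_labels]


labels = [[0, 0, 1],
          [0, 1, 1]]
ratios = {0: [3.0], 1: [1.0]}   # region 0 has evidence favoring main subject
example_map = belief_map(labels, ratios)
```

Every pixel in a region receives the same value, and the values vary continuously between 0 and 1 rather than forming a binary subject/background mask, which matches the continuously valued belief map described above.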
In an illustrative application, the inventive method is implemented in an image retrieval system. In this implementation, the inventive method automatically stores perceptually significant features of the main subject or background of a plurality of first images in a database to facilitate retrieval of a target image in response to an input or query image. Features corresponding to each of the plurality of stored images are automatically and sequentially compared to similar features of the query image. Consequently, the present invention provides an automatic system and method for controlling the feature extraction, representation, and feature-based similarity retrieval strategies of a content-based image archival and retrieval system based on an analysis of main subject and background derived from a continuously valued main subject belief map.
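The retrieval flow described above, in which main-subject and background features are stored separately and compared sequentially against the query, can be sketched as follows. The record layout, the weighting scheme, the use of histogram intersection as the per-region similarity, and all names are hypothetical; the sketch illustrates the idea rather than the claimed implementation.

```python
def intersection(h1, h2):
    """Similarity of two normalized histograms as the sum of bin-wise minima."""
    return sum(min(v, h2.get(k, 0.0)) for k, v in h1.items())


def retrieve(query, archive, subject_weight=1.0, background_weight=0.0):
    """Sequentially score each stored image against the query and rank them.

    query and each archive record: dict with "subject" and "background"
    feature histograms. A negative background_weight favors images whose
    backgrounds differ from the query's (an illustrative convention).
    """
    scored = []
    for image_id, record in archive.items():
        score = (
            subject_weight * intersection(query["subject"], record["subject"])
            + background_weight
            * intersection(query["background"], record["background"])
        )
        scored.append((score, image_id))
    # Highest score first; ties broken by image id for a stable result.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [image_id for _, image_id in scored]


archive = {
    "beach_dog": {"subject": {"brown": 1.0}, "background": {"blue": 1.0}},
    "park_dog": {"subject": {"brown": 1.0}, "background": {"green": 1.0}},
    "beach_ball": {"subject": {"red": 1.0}, "background": {"blue": 1.0}},
}
query = {"subject": {"brown": 1.0}, "background": {"blue": 1.0}}

subject_only = retrieve(query, archive)
different_background = retrieve(query, archive, background_weight=-1.0)
```

With the default weights, only the main subject influences the ranking. Setting a negative background weight expresses the targeted search mentioned earlier, namely images with a similar main subject but a dissimilar background, which is not possible when features are computed over the entire image.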