This invention relates generally to the field of image processing and retrieval and, more specifically, to the search and retrieval of relevant images from a database of images.
As more and more information is available electronically, the efficient search and retrieval of relevant information from vast databases becomes a challenging problem. Given an image database, selection of images that are similar to a given query (or example) image is an important problem in content-based image database management. There are two main issues of concern in the design of a technique for image similarity-based retrieval: image representation, and image similarity. Image representation is concerned with the content-based representation of images. Given a content-based image representation scheme, image similarity is concerned with the determination of similarity/dissimilarity of two images using a similarity measure based on that representation. Both image content and image similarity are very subjective in nature. User preference/subjectivity in a multimedia retrieval system is important because for a given image, the contents of interest and the relative importance of different image contents are application/viewer dependent. Even for a single viewer or application, the interpretation of an image's content may vary from one query to the next. Therefore, a successful content similarity-based image retrieval system should capture the preferences/subjectivity of each viewer/application and generate responses that are in accordance with the preferences/subjectivity.
Almost all existing commercial and academic image indexing and retrieval systems represent an image in terms of its low-level features such as color and texture properties, and image similarity is measured in the form of:
$$S(I, J) = \sum_{i=1}^{N} w_i \, D_{F_i}(I, J)$$
where S(I, J) is a function which measures the overall image similarity between images I and J, each image is represented in terms of N features, Fi, i={1, . . . , N}, DFi(I, J) is a function for computing the similarity/difference between image I and J based on the feature Fi, and wi is the weight, or importance, of feature Fi in the overall image similarity decision [see W. Niblack, R. Barber, W. Equitz, M. Flickner, E. Glasman, D. Petkovic, P. Yanker, D. Faloutsos, and G. Taubin, "The QBIC Project: Querying Images By Content Using Color, Texture, and Shape", SPIE Vol. 1908, 1993, pp. 173-187; U.S. Pat. No. 5,579,471, R. J. Barber et al., "Image Query System and Method", 1996; M. Stricker and M. Orengo, "Similarity of Color Images", SPIE Vol. 2420, 1995; S. Santini and R. Jain, "Similarity Queries in Image Databases", CVPR, 1996; J. Smith and S. Chang, "Tools and Techniques for Color Image Retrieval", SPIE Vol. 2670, 1996; U.S. Pat. No. 5,652,881, M. Takahashi, K. Yanagi, and N. Iwai, "Still Picture Search/Retrieval Method Carried Out on the Basis of Color Information and System For Carrying Out the Same", 1997; J. K. Wu, A. D. Narasimhalu, B. M. Mehtre, C. P. Lam, Y. J. Gao, "CORE: a content-based Retrieval Engine for Multimedia Information Systems", Multimedia Systems, Vol. 3, 1996, pp. 25-41; W. Y. Ma, "NETRA: A Toolbox for Navigating Large Image Databases", Ph.D. Dissertation, UCSB, 1997; Y. Rui, S. Mehrotra, and M. Ortega, "A Relevance Feedback Architecture for Content-based Multimedia Information Retrieval Systems", IEEE Workshop on Content-based Access of Image and Video Libraries, 1997, pp. 82-89].
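By way of illustration only, the weighted combination described above may be sketched as follows. The sketch is not taken from any of the cited systems: the feature names, the dictionary-based image representation, and the use of an L1 distance as the per-feature measure DFi are all assumptions made for the example, and the measure is treated as a difference, so a smaller score means more similar images.

```python
from typing import Callable, Dict, Sequence

# Hypothetical image representation: feature name -> feature vector.
Image = Dict[str, Sequence[float]]
Measure = Callable[[Sequence[float], Sequence[float]], float]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed per-feature measure D_Fi: sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def overall_similarity(
    image_i: Image,
    image_j: Image,
    weights: Dict[str, float],
    measures: Dict[str, Measure],
) -> float:
    """S(I, J) = sum over features Fi of wi * DFi(I, J)."""
    return sum(
        weights[name] * measure(image_i[name], image_j[name])
        for name, measure in measures.items()
    )


# Illustrative data only; the values carry no meaning beyond the example.
query = {"color": [0.5, 0.5], "texture": [1.0, 0.0]}
candidate = {"color": [0.25, 0.75], "texture": [0.0, 1.0]}
measures = {"color": l1_distance, "texture": l1_distance}
weights = {"color": 0.75, "texture": 0.25}

score = overall_similarity(query, candidate, weights, measures)
```

In this sketch the weights are supplied by hand, which corresponds to the fixed or manually specified weights of the systems discussed next.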
In most of the systems cited above, either the weight wi for each feature Fi is fixed, or the user manually provides the value to indicate his/her preferences regarding the relative importance of that feature. To a typical user, the different features and weights generally do not intuitively correlate to his/her interpretation of the query image and the desired query results. For example, some systems require the user to specify the relative importance of features such as color, texture, structure, and composition for processing a query. To an average user, it is unclear what these different features mean and what weight to assign to each one in order to obtain the desired results. The optimum combination of weights to use for a specific query toward a specific goal is highly dependent on the image description scheme and the similarity measure used by the system, and is not readily understood by the average user.
Recently, a few approaches have been proposed in order to overcome the above-mentioned problems. These approaches require the user to identify a few relevant images from the query response. The set of relevant images is processed to automatically determine user preferences regarding the relative importance of different features or the preferred distance measure. One such approach was proposed in [Y. Rui, S. Mehrotra, and M. Ortega, "A Relevance Feedback Architecture for Content-based Multimedia Information Retrieval Systems", IEEE Workshop on Content-based Access of Image and Video Libraries, 1997, pp. 82-89]. Given a query, multiple ranked response sets are generated using a variety of representations and associated similarity measures. The default response set to the query is displayed. The user selects a few relevant images and provides their ranking. The response set that best matches the set of ranked relevant images is then selected as the preference-based query response. The major shortcomings of this approach are: (i) user preference cannot be specified without first processing a query; (ii) for a small set of ranked relevant images, the final response set may not be unique; and (iii) the relative importance of individual components of an image representation cannot be modified based on the set of relevant images.
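The selection step of such an approach, and shortcoming (ii) in particular, may be illustrated with the following sketch. It is a simplified stand-in rather than the procedure of the cited reference: the rank-mismatch score, the penalty for missing images, and all names and data are assumptions made for the example.

```python
from typing import Dict, List


def rank_mismatch(response: List[str], relevant_ranked: List[str]) -> int:
    """Sum of |rank in response - rank given by the user| over relevant images.

    An image absent from the response is treated as if ranked just past
    the end of the response (an assumed penalty).
    """
    position = {image_id: rank for rank, image_id in enumerate(response)}
    missing_rank = len(response)
    return sum(
        abs(position.get(image_id, missing_rank) - user_rank)
        for user_rank, image_id in enumerate(relevant_ranked)
    )


def best_response_sets(
    responses: Dict[str, List[str]], relevant_ranked: List[str]
) -> List[str]:
    """Return the names of all response sets with the lowest mismatch."""
    scores = {
        name: rank_mismatch(response, relevant_ranked)
        for name, response in responses.items()
    }
    best = min(scores.values())
    return sorted(name for name, score in scores.items() if score == best)


# Hypothetical ranked response sets, one per representation.
responses = {
    "color": ["a", "b", "c", "d"],
    "texture": ["b", "a", "d", "c"],
    "shape": ["a", "b", "d", "c"],
}
# The user marks two relevant images and ranks "a" above "b".
winners = best_response_sets(responses, ["a", "b"])
```

With only two ranked relevant images, both the "color" and "shape" response sets match equally well, so the selection is not unique.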
Another approach to user preference-based query processing is to arrange all database images in a virtual feature space. Users can "sift through" the different subsets of feature spaces and identify the desirable feature set for query processing [A. Gupta, S. Santini, R. Jain, "In Search of Information in Visual Media", Communications of the ACM, Vol. 40, No. 12, December, 1997, pp. 35-42]. While this approach alleviates the burden of fixed feature weights, the lack of correlation between user interpretation of image similarity and associated feature-based representation still remains. Furthermore, since complete global image representations are used for images with multiple subjects or regions of interest, a subset of feature space that corresponds to user preference may not exist.
A natural approach to capturing a user's preferences for image similarity computation is to automatically extract/derive such information from the user-supplied positive examples and negative examples of desired images. The derived preferences can then be used to automatically determine similarity measures. An existing system, "society of models", of Minka and Picard [T. P. Minka and R. W. Picard, "Interactive Learning with a Society of Models", Pattern Recognition, Vol. 30, 1997, pp. 565-581], adopts this approach. The system employs a variety of feature-based image representations and associated similarity measures to generate several different similarity-based hierarchical clusters of images in a database. The user-supplied positive and negative examples are used to identify the image clusters preferred by the user. All images in the preferred clusters form the set of images desired by the user. The image clusters can be dynamically adapted based on user-supplied examples of desired or undesired images. This process of modifying clusters is very time consuming for large databases. Another system called NETRA [W. Y. Ma, "NETRA: A Toolbox for Navigating Large Image Databases", Ph.D. Dissertation, UCSB, 1997] also utilizes feature similarity-based image clusters to generate a user preference-based query response. This system has restricted feature-based representations and clustering schemes. The main drawbacks of both of these systems are: (i) the database is required to be static (i.e., images cannot be dynamically added to or deleted from the database without complete database re-clustering); and (ii) usually, a large number of positive and negative examples need to be provided by the user in order for the system to determine image clusters that correspond to the set of desired images.
The present invention proposes a general framework or system for user preference-based query processing. This framework overcomes the shortcomings of the existing approaches to capture and utilize user preferences for image retrieval.
An object of this invention is to provide a generalized user-friendly scheme to automatically determine user preferences for desired images and to perform preference-based image retrieval.
A second object is to provide a system for user preference-based image retrieval from a dynamic database of images. That is, images can be added to or deleted from the database dynamically without requiring complete database reorganization.
A third object is to provide an approach to efficiently determine the relative importance of individual components of the image representation scheme from user supplied examples and counterexamples.
These and other objects will become clear in the following discussion of the preferred embodiment.
Briefly summarized, according to one aspect of the present invention, the invention resides in a method for learning a user preference for a desired image, the method comprising the steps of: using either one or more examples or counterexamples of a desired image for defining a user preference; extracting a relative preference of a user for either one or more image components or one or more depictive features from the examples and/or counterexamples of desired images; and formulating a user subjective definition of a desired image using the relative preferences for either image components or depictive features.
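Purely as an illustration of the kind of computation involved, and not as the method of the invention, the following sketch derives relative feature weights from examples and counterexamples. It rests on assumptions that do not appear in the text above: each image is reduced to one scalar value per feature, and a feature is treated as more important when the examples agree on it while differing from the counterexamples. The function name and the data are hypothetical.

```python
from itertools import combinations, product
from typing import Dict, Iterable, List

FeatureValues = Dict[str, float]  # hypothetical: feature name -> scalar value


def mean(values: Iterable[float]) -> float:
    """Arithmetic mean, defined as 0.0 for an empty input."""
    items = list(values)
    return sum(items) / len(items) if items else 0.0


def feature_weights(
    examples: List[FeatureValues],
    counterexamples: List[FeatureValues],
    eps: float = 1e-9,
) -> Dict[str, float]:
    """Assumed heuristic: weight = separation from counterexamples
    divided by spread among examples, normalized to sum to 1."""
    raw = {}
    for name in examples[0]:
        within = mean(
            abs(a[name] - b[name]) for a, b in combinations(examples, 2)
        )
        between = mean(
            abs(a[name] - b[name]) for a, b in product(examples, counterexamples)
        )
        raw[name] = between / (within + eps)
    total = sum(raw.values())
    if total == 0:
        # No feature separates the two groups: fall back to equal weights.
        return {name: 1.0 / len(raw) for name in raw}
    return {name: value / total for name, value in raw.items()}


# Illustrative data: examples agree on color and differ on texture.
examples = [{"color": 0.9, "texture": 0.2}, {"color": 0.8, "texture": 0.7}]
counterexamples = [{"color": 0.1, "texture": 0.5}]
weights = feature_weights(examples, counterexamples)
```

Under these assumptions the color feature receives most of the weight, because the examples are close to each other in color and far from the counterexample.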