The present invention relates generally to database object indexing and retrieval and more specifically to content-based multimedia retrieval that is responsive to user relevance feedback.
Due to the rapidly growing amount of digital multimedia data available via the Internet or stored on private systems, there is a need for effective techniques for managing large multimedia databases. Content-searchable image database management systems often use an approach based on search-by-explicit queries, in which the user must provide some description of a desired object. Typically, the system retrieves objects, such as images, based on a similarity metric that correlates the description with various features associated with stored objects in the database. The metric might be calculated as a weighted sum of values for a set of low level image features (i.e., attributes) such as color, shape, size, and texture patterns.
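As an illustration only, the weighted-sum metric mentioned above might be sketched as follows. The feature names, the normalization of each feature to the range 0 to 1, the per-feature similarity formula, and the function name weighted_similarity are all assumptions made for this sketch and are not taken from any particular system.

```python
def weighted_similarity(query, candidate, weights):
    """Return a weighted sum of per-feature similarities.

    Assumption: every feature value is normalized to [0, 1], so the
    per-feature similarity 1 - |difference| also lies in [0, 1].
    """
    total = 0.0
    for name, weight in weights.items():
        diff = abs(query[name] - candidate[name])
        total += weight * (1.0 - min(diff, 1.0))
    return total


# Hypothetical feature weights and feature values.
weights = {"color": 0.4, "shape": 0.3, "size": 0.2, "texture": 0.1}
query = {"color": 0.9, "shape": 0.2, "size": 0.5, "texture": 0.4}
near = {"color": 0.8, "shape": 0.3, "size": 0.5, "texture": 0.4}
far = {"color": 0.1, "shape": 0.9, "size": 0.1, "texture": 0.9}

print(weighted_similarity(query, near, weights))
print(weighted_similarity(query, far, weights))
```

In this sketch a stored object whose features are closer to the description receives a higher score, which is the only property the metric is meant to show.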
A concern with image retrieval from databases is the difficulty in establishing a correlation between the worded description and the low level image features which are utilized to organize database images. An image database management system is effective at retrieving an image if the description is specific with respect to one or more of the searchable image features, such as a search for “all square yellow objects.” However, the system will be much less effective if the description is less specific to the searchable image features and more specific to the desired object, such as a search for “all yellow cars,” because cars have many different shapes and sizes that are shared by other objects. Moreover, the same car can appear dramatically different depending on the vantage point from which the image was generated and depending on the lighting of the car. Although human perception is effective at interpreting two images of the same yellow car which are taken in different lighting and from a different perspective, current image databases perform the interpretation much less effectively.
U.S. Pat. No. 5,696,964 to Cox et al. describes a queryless multimedia database search method and system having a Bayesian inference engine which utilizes user relevance feedback to direct a search. The system maintains a probability distribution which represents a probability that each image in the database is the target of the search. The distribution is utilized to select the set of images to display to the user and further selections from the displayed images are solicited from the user. Each database image has a set of quantified features (i.e., attribute values) and the user indicates which of the selected images are similar to the target image. The selection of particular images having specific quantified features triggers an adjustment of the probability distribution. The adjusted probability distribution determines the next set of images which will be displayed to the user in a subsequent iteration of the search.
One of the problems of the Cox et al. database search system is that the system relies on features selected by a system operator or designer to describe and index the database images. The user feedback is utilized to modify the probability distribution only within the parameters defined by the features that have been quantified. Consequently, if a user is focused on a feature not included within the system-defined features, the effectiveness of the Cox et al. database search system dramatically declines. For example, if a user focuses on the curvature of the neck of a flamingo in making selections during a search and the system does not include quantification which takes into account the curvature of objects within images, the likelihood of a successful search will be low. Although configuration of a more comprehensive feature set would provide a partial solution to this problem, a truly comprehensive feature set is difficult to obtain because of the near infinite variety of features on which possible viewers may focus in analyzing image content. Furthermore, as the feature set grows larger, the processing requirements of the database search system become prohibitive.
What is needed is a method and a system for searching and retrieving database objects which are capable of associating low level features (“attributes”) utilized to characterize objects in the database with high level semantic features to enable effective database searching based on the high level semantic features.
A method and a system for indexing and retrieving database objects (typically images) include utilizing user relevance feedback received during a first object retrieval session to establish similarity correlations among the database objects and among clusters of database objects. The similarity correlations are updated continuously during user interaction with the database and are utilized to select database objects in response to query objects during subsequent iterations of a particular object retrieval session and during subsequent object retrieval sessions. The clusters are preliminarily determined by system-perceived relationships (i.e., similarities among system-quantified features), while after continued use the cluster-to-cluster correlations are indicative of user-perceived relationships among the groups.
In a preferred embodiment, the database objects are organized into clusters such that each cluster includes database objects having similar values for selected quantified features. Each database object is assigned a vector for multiple quantified features and can be mapped to a point within a multi-dimensional feature space according to the feature values associated with the database object. The similarity between clusters of database objects is represented by the distance between clusters in the multi-dimensional feature space. Thus, the initial organization is based exclusively on the system-perceived relationships. In response to a first user-generated query object which includes a set of quantified features, a database manager selects a first set of database objects for presentation to a user. The selected database objects are those objects within a cluster of the database which is closest to the first query object within the multi-dimensional feature space.
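The selection of the closest cluster can be sketched as below. This is a minimal sketch under stated assumptions: each cluster is represented by the centroid of its members' feature vectors, distance is Euclidean, and the names centroid and nearest_cluster and the sample clusters are hypothetical. The text above does not specify how cluster position or distance is computed.

```python
import math


def centroid(vectors):
    """Mean point of a non-empty list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def nearest_cluster(query, clusters):
    """Return the name of the cluster whose centroid is closest to query.

    clusters maps a cluster name to the feature vectors of its members.
    Assumption: Euclidean distance to the centroid stands in for the
    cluster-to-query distance in the multi-dimensional feature space.
    """
    return min(
        clusters,
        key=lambda name: math.dist(query, centroid(clusters[name])),
    )


# Hypothetical two-feature space with two clusters.
clusters = {
    "a": [[0.0, 0.0], [0.0, 2.0]],
    "b": [[10.0, 10.0], [12.0, 10.0]],
}
print(nearest_cluster([1.0, 1.0], clusters))
```

The members of the returned cluster would then form the first set of database objects presented to the user.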
The selected database objects preferably also include randomly selected database objects to counteract a tendency of the system to “over learn” during a retrieval session and to present to the user database objects which otherwise would have a low probability of being selected during the retrieval session. For example, if the user selects an image of an airplane as relevant during a search for an image of a bird, the system might select only images of airplanes for consecutive iterations of the search. In order to reduce the likelihood that the system will progress along an inaccurate focus, at each iteration in a search random images are selected for presentation along with the other selected images.
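One way to combine system-selected and randomly selected objects at each iteration is sketched below. The function name select_for_display, its parameters, and the split between ranked and random slots are assumptions for illustration; the text above does not state how many random objects are presented.

```python
import random


def select_for_display(ranked_ids, all_ids, n_ranked, n_random, rng=None):
    """Return the top-ranked objects followed by random extra objects.

    ranked_ids: object ids ordered best-first by the retrieval system.
    all_ids:    every object id in the database.
    The random extras are drawn only from objects not already chosen,
    so no object is displayed twice in one iteration.
    """
    rng = rng or random.Random()
    chosen = list(ranked_ids[:n_ranked])
    chosen_set = set(chosen)
    remaining = [obj for obj in all_ids if obj not in chosen_set]
    chosen += rng.sample(remaining, min(n_random, len(remaining)))
    return chosen


# Hypothetical ids; a seeded generator keeps the sketch repeatable.
shown = select_for_display(
    ["a", "b", "c"], list("abcdefgh"), n_ranked=2, n_random=3,
    rng=random.Random(0),
)
print(shown)
```

Because the random slots draw from the whole database, an image of a bird can still reach the user even after the session has drifted toward airplanes.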
The user designates particular database objects to be relevant to the retrieval session and other database objects to be irrelevant. In a preferred embodiment, in response to the user-designations of relevance and irrelevance, an updating mechanism updates a correlation matrix that stores the cluster-to-cluster similarity correlations. For example, one of the clusters in the database might include images of yellow objects which are displayed to the user in response to the query image of a yellow car. In response to designations of relevance for images of a truck and a motorcycle and designations of irrelevance for images of a house and a flower, which are all included in a first cluster, the database processor divides the first cluster into two clusters. The first cluster retains the images of the truck and the motorcycle, which were determined by the user to be relevant, and a second cluster is created which includes the images of the flower and the house. Furthermore, other images of the first cluster which might not have been displayed must be segregated into either the first or second cluster. Those non-displayed images which have quantified features more similar to the relevant images will be maintained in the first cluster and those non-displayed images which have quantified features more similar to the irrelevant images will be placed in the second cluster. The updating mechanism updates the correlation matrix by assigning a low correlation of similarity to the first and second clusters.
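The division of a cluster described above might look like the following sketch. It assumes that "more similar" is measured as Euclidean distance to the centroid of the relevant or irrelevant images, which is a choice made here for illustration; the names split_cluster and centroid and the two-value feature vectors are hypothetical.

```python
import math


def centroid(vectors):
    """Mean point of a non-empty list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def split_cluster(features, members, relevant, irrelevant):
    """Divide one cluster into (kept, split_off) lists of object ids.

    features maps an object id to its feature vector. Objects marked
    relevant stay in the first cluster, objects marked irrelevant move
    to the new cluster, and each non-displayed member joins whichever
    group has the nearer centroid (an assumed similarity measure).
    """
    rel_center = centroid([features[obj] for obj in relevant])
    irr_center = centroid([features[obj] for obj in irrelevant])
    kept, split_off = list(relevant), list(irrelevant)
    for obj in members:
        if obj in relevant or obj in irrelevant:
            continue
        to_rel = math.dist(features[obj], rel_center)
        to_irr = math.dist(features[obj], irr_center)
        (kept if to_rel <= to_irr else split_off).append(obj)
    return kept, split_off


# Hypothetical feature vectors for a cluster of yellow objects.
features = {
    "truck": [0.9, 0.8], "motorcycle": [0.8, 0.9],
    "house": [0.1, 0.2], "flower": [0.2, 0.1],
    "van": [0.85, 0.7], "tree": [0.15, 0.3],  # not displayed
}
kept, split_off = split_cluster(
    features, list(features), ["truck", "motorcycle"], ["house", "flower"]
)
print(kept, split_off)
```

After such a split, the entry of the correlation matrix for the two resulting clusters would be set to a low value.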
During a database object retrieval session, database objects from multiple clusters might be selected for presentation to the user. If two database objects from separate clusters are both determined by the user to be relevant to the first user-generated query object, the updating mechanism will take steps to ensure that the two clusters have a high correlation value within the correlation matrix. In configuring the clusters, if the database manager did not take into account the feature(s) which led the user to determine that the two database objects are relevant to the query object, the two database objects will be distant from each other within the multi-dimensional feature space. By assigning a high correlation value to the two clusters, the system is embedding user feedback into the correlation matrix and enabling intelligent retrieval based on non-quantified “high level features.” The correlation values are at least partially determined by user-perceived relationships among the clusters.
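A correlation update of this kind can be sketched as follows. The storage of the matrix as a dictionary keyed by unordered cluster pairs, the update rule that moves a value part of the way toward 1.0, the step size, and the name reinforce are all assumptions; the text above states only that the two clusters end up with a high correlation value.

```python
def reinforce(correlation, cluster_of, relevant_ids, step=0.5):
    """Raise the correlation of every pair of clusters that both
    contain an object the user marked as relevant.

    correlation: dict mapping frozenset({cluster_a, cluster_b}) to a
                 value in [0, 1]; missing pairs are treated as 0.0.
    cluster_of:  dict mapping an object id to its cluster name.
    """
    clusters = sorted({cluster_of[obj] for obj in relevant_ids})
    for index, first in enumerate(clusters):
        for second in clusters[index + 1:]:
            key = frozenset((first, second))
            old = correlation.get(key, 0.0)
            # Move part of the way toward full correlation (1.0).
            correlation[key] = old + step * (1.0 - old)
    return correlation


# Hypothetical session: two relevant images come from different clusters.
cluster_of = {"car_side": "c1", "car_front": "c7", "car_rear": "c1"}
correlation = {}
reinforce(correlation, cluster_of, ["car_side", "car_front"])
print(correlation[frozenset(("c1", "c7"))])
```

Repeated feedback of the same kind pushes the value closer to 1.0, so clusters that are far apart in the feature space can still be retrieved together.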
Over the course of multiple retrieval sessions, as a result of cluster divisions, the quantity of clusters can increase significantly. Eventually, the number of clusters will approach the number of database objects and the processing resources required to perform database object retrieval will rise accordingly. To counter this tendency, the database manager merges two clusters if a similarity threshold is exceeded. The similarity threshold takes into account the distance between the two clusters in the feature space, as well as the weight of the correlation between the two clusters in the correlation matrix.
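A merge test that considers both factors might be sketched as below. The way distance and correlation weight are combined into one score, the parameter alpha, the threshold value, and the names merge_score and should_merge are assumptions made for this sketch; the text above does not give a formula.

```python
import math


def merge_score(centroid_a, centroid_b, correlation, alpha=0.5):
    """Combine feature-space closeness and correlation into one score.

    Closeness is 1 / (1 + Euclidean distance), so it equals 1.0 for
    coincident centroids and approaches 0.0 as they move apart.
    alpha weights the correlation term against the closeness term.
    """
    closeness = 1.0 / (1.0 + math.dist(centroid_a, centroid_b))
    return alpha * correlation + (1.0 - alpha) * closeness


def should_merge(centroid_a, centroid_b, correlation, threshold=0.7):
    """Merge only when the combined score exceeds the threshold."""
    return merge_score(centroid_a, centroid_b, correlation) > threshold


# Hypothetical centroids and correlation weights.
print(should_merge([0.0, 0.0], [0.1, 0.0], correlation=0.9))
print(should_merge([0.0, 0.0], [3.0, 4.0], correlation=0.1))
```

Under this sketch, nearby clusters with a strong learned correlation are merged, while distant and weakly correlated clusters remain separate, which keeps the cluster count from approaching the number of database objects.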
An advantage of the invention is that the database can be reorganized according to feedback from users to compensate for deficiencies in the original database initialization. A further advantage of the invention is that the correlation matrix enables custom tailoring of the retrieval system in response to user feedback. Yet another advantage of the invention is that the correlation matrix is continually refined through user feedback across multiple sessions.