In a computer system containing a database supporting similarity searches, the term “query-by-example” denotes a paradigm where the user specifies a query by providing examples, e.g., as described in M. Flickner et al., “Query By Image and Video Content: the QBIC System,” IEEE Computer Magazine, 28(9), pp. 23-32, September 1995, the disclosure of which is incorporated by reference herein. In a multimedia database, an example can be an existing image or video, or a sketch drawn by the user. In a traditional database, an example can be a particular record.
There are at least two types of queries that can be specified by means of an example:
(1) best-k-matches: Here, the search engine must return the k database items that most closely match the concept specified by the user. The search engine uses the example to compute quantities (features) that are compared to the features of the items stored in the database. In multimedia databases, typical features include color, texture and shape. Given two items in the database, the one whose features are more similar to those of the query is considered the better match.
(2) threshold search: Here, the search engine returns all the items in the database whose similarity to the concept described by the user exceeds a specified level. The difference from the previous type of query resides in the similarity function, which must be known (to a certain extent) to the user.
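The two query types above can be illustrated with a minimal Python sketch. It assumes each database item is a plain list of numeric features and uses an unweighted Euclidean distance as the dissimilarity measure, so "more similar than a specified level" is read as "distance strictly below a bound". The function names, the `database` dictionary, and the `max_distance` parameter are hypothetical and are not taken from the cited references.

```python
import heapq
import math


def euclidean(x, y):
    """Unweighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def best_k_matches(query, items, k, distance=euclidean):
    """Return the k (item_id, distance) pairs closest to the query."""
    scored = ((item_id, distance(query, feats)) for item_id, feats in items.items())
    return heapq.nsmallest(k, scored, key=lambda pair: pair[1])


def threshold_search(query, items, max_distance, distance=euclidean):
    """Return every (item_id, distance) pair closer than max_distance."""
    scored = ((item_id, distance(query, feats)) for item_id, feats in items.items())
    hits = [pair for pair in scored if pair[1] < max_distance]
    return sorted(hits, key=lambda pair: pair[1])


# Hypothetical toy database: item id -> feature vector.
database = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [3.0, 4.0], "d": [0.0, 2.0]}
query = [0.0, 0.0]

top_two = best_k_matches(query, database, k=2)
within = threshold_search(query, database, max_distance=2.5)
```

A linear scan is used here only for clarity; it says nothing about how an indexed search engine would evaluate either query.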
A similarity search using a single example is only moderately effective. To improve retrieval performance, the user can provide multiple examples, as described in X. Wan et al., “Efficient Interactive Image Retrieval with Multiple Seed Images,” Proc. SPIE, vol 3527: Multimedia Storage and Archiving III, pp. 13-24, November 1998, the disclosure of which is incorporated by reference herein. Multiple examples are used to estimate the relative importance of the different features used in the retrieval, which translates into giving different weights to different features while computing similarity. As disclosed in X. Wan et al., the distance between the query vector x and a target vector y is then computed as:
$$d^2(x,y) = \sum_{i=1}^{M} w_i \, (x[i] - y[i])^2, \qquad (1)$$
where the sum is over the M features in the feature vectors, and the $w_i$ are different weights.
For instance, if there are N positive examples, having feature vectors $x_1, \ldots, x_N$, a possible choice of weight for the ith feature is
$$w_i = \left[ \frac{x_1[i]^2 + x_2[i]^2 + \cdots + x_N[i]^2}{N} \right]^{-1/2},$$
where $x_j[i]$ is the value of the ith feature for the jth example. In general, the examples can be positive (examples of the desired content) or negative (examples of undesired content). X. Wan et al. describe how to use both positive and negative examples to compute the weights $w_i$.
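As a rough illustration of the weight choice and of equation (1), the following Python sketch computes one weight per feature from a set of positive examples and then evaluates the weighted squared distance. It covers only the positive-example case stated above; the handling of negative examples in X. Wan et al. is not reproduced. The function names and the sample vectors are hypothetical.

```python
def feature_weights(positive_examples):
    """Weight for feature i: inverse square root of the mean of x_j[i]^2.

    Assumes at least one example and that no feature is zero in every
    example (otherwise the weight is undefined and Python raises
    ZeroDivisionError).
    """
    n = len(positive_examples)
    m = len(positive_examples[0])
    weights = []
    for i in range(m):
        mean_square = sum(x[i] ** 2 for x in positive_examples) / n
        weights.append(mean_square ** -0.5)
    return weights


def weighted_squared_distance(x, y, weights):
    """Equation (1): sum over features of w_i * (x[i] - y[i])^2."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y))


# Hypothetical positive examples with two features each.
positives = [[2.0, 0.5], [2.0, 0.5]]
weights = feature_weights(positives)          # [0.5, 2.0]
d2 = weighted_squared_distance([2.0, 0.5], [4.0, 1.5], weights)  # 0.5*4 + 2.0*1
```

With this choice, a feature whose values are large in magnitude across the examples receives a small weight, so it does not dominate the sum merely because of its scale.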
Other distance metrics that can be used instead of the distance metric denoted in equation (1) include weighted $L_p$ distances, computed as:
$$d_p(x,y) = \left( \sum_{i=1}^{M} w_i \, |x[i] - y[i]|^p \right)^{1/p}$$
and quadratic distances, computed as:
$$d(x,y) = (x - y)^T K^{-1} (x - y)$$
where K is a non-singular, positive-definite matrix, the −1 superscript denotes the matrix inverse operator, and the T superscript denotes transposition.
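The two alternative metrics can be sketched in plain Python as follows. The sketch takes the absolute value of each difference in the weighted Lp distance so that odd values of p are well defined, and it evaluates the quadratic form by solving K z = (x − y) with Gaussian elimination rather than forming the inverse explicitly; both are implementation choices made here, not details from the cited references. All function names are hypothetical.

```python
def weighted_lp_distance(x, y, weights, p):
    """Weighted L_p distance: (sum_i w_i * |x[i] - y[i]|^p)^(1/p)."""
    total = sum(w * abs(a - b) ** p for w, a, b in zip(weights, x, y))
    return total ** (1.0 / p)


def solve_linear(K, b):
    """Solve K z = b by Gaussian elimination with partial pivoting.

    Assumes K is square and non-singular.
    """
    n = len(b)
    aug = [list(row) + [b[i]] for i, row in enumerate(K)]
    for col in range(n):
        # Move the row with the largest pivot candidate into place.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - s) / aug[r][r]
    return z


def quadratic_distance(x, y, K):
    """Quadratic distance (x - y)^T K^{-1} (x - y)."""
    diff = [a - b for a, b in zip(x, y)]
    z = solve_linear(K, diff)  # z = K^{-1} (x - y)
    return sum(d * zi for d, zi in zip(diff, z))


# Hypothetical usage with two-dimensional feature vectors.
l2 = weighted_lp_distance([0.0, 0.0], [3.0, 4.0], [1.0, 1.0], p=2)
quad = quadratic_distance([1.0, 0.0], [0.0, 0.0], [[2.0, 1.0], [1.0, 2.0]])
```

When K is the identity matrix, the quadratic distance reduces to the unweighted squared Euclidean distance, which gives a quick sanity check on any implementation.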
Most authors disclose how to solve the similarity search problem in a classification-like setting: they divide the database into classes, by learning from the user's input, as in B. Bhanu et al., “Learning Feature Relevance and Similarity Metrics in Image Databases,” Proc. IEEE Workshop on Content-Based Access of Image and Video Libraries, pp. 14-18, 1998; W. Y. Ma et al., “Texture Features and Learning Similarity,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 425-430, 1996; and J. Cox et al., “The Bayesian Image Retrieval System, PicHunter: Theory, Implementation and Psychophysical Experiments,” IEEE Trans. on Image Processing, 9(1), pp. 20-37, January 2000, the disclosures of which are incorporated by reference herein.
Static approaches, in which user feedback is not allowed, can also be used. For example, W. Y. Ma et al. (in the above-referenced W. Y. Ma et al. article) disclose how to partition the database into clusters using a neural network approach. The W. Y. Ma et al. method requires labeling each database entry with a class label; hence, it is tantamount to providing all the entries in the database as examples. The method is static since it produces a fixed pseudo-distance on the search space. This pseudo-distance does not depend on the individual queries, and cannot be changed by a user. The computational cost of the approach is prohibitive for large databases.
Dynamic approaches, such as described in the above-referenced B. Bhanu et al. article, learn on-line, i.e., at query time. Mixed approaches, such as described in M. E. J. Wood et al., “Iterative Refinement by Relevance Feedback in Content-Based Digital Image Retrieval,” Proc. of the sixth ACM Int. Conf. Multimedia, pp. 13-20, 1998, the disclosure of which is incorporated by reference herein, use both query-time and off-line learning.
However, a problem with the classification-like approach is the lack of evidence regarding the existence of semantic classes associated with similarity retrieval. In other words, there is no evidence that the user is interested in a particular semantic concept when retrieving data based on similarity. Inferring the existence of such classes from the user's example reduces the flexibility of the system. The approach is valuable only if the user asks over and over for the same content, which is a rare occurrence.
In the relevance-feedback literature, authors propose solutions on how to use sets of positive and negative examples that are iteratively provided as input to the system. They assume that the user provides a set of positive examples or a set of negative examples (or both) at each iteration. The area of relevance feedback has been studied for the past 30 years. There are two main categories of relevance-feedback techniques from the viewpoint of how the system deals with examples provided during different iterations: (1) static query rewrite, where at each iteration the same weights are given to all the examples irrespective of when they were added to the query, e.g., R. Yong et al., “Relevance Feedback: A Power Tool For Interactive Content-Based Image Retrieval,” IEEE Trans. on Circuits and Systems for Video Technology, 8(5), pp. 644-655, September 1998, the disclosure of which is incorporated by reference herein; and (2) time-weighted query rewrite, where the system gives more importance to the most recently provided examples.
Relevance-feedback techniques can also be divided into two categories, depending on how the examples are used: (1) techniques based on adaptive ranking functions, which use the positive and negative examples to modify the weights of distance functions, such as in equation (1) or variations, and are suited for on-line searches (e.g., the above-referenced R. Yong et al. article); and (2) techniques based on feature-space warping, which change the structure of the search space in non-linear fashions, are computationally very expensive, and are best suited for off-line learning, e.g., U.S. patent application identified by Ser. No. 09/237,646 filed on Jan. 26, 1999 and entitled “Method and Apparatus for Similarity Retrieval from Iterative Refinement;” and C. -S. Li et al., “Sequential Processing for Content-Based Retrieval of Multimedia Objects,” Proc. SPIE, vol. 3312 Storage and Retrieval for Image and Video Databases IV, pp. 2-13, January 1998, the disclosures of which are incorporated by reference herein. It is to be understood that “on-line” describes operations performed substantially contemporaneously with receipt of the user query (e.g., real-time or interactive), while “off-line” describes operations that are not performed on-line (e.g., not supporting an interactive mode of operation). There is, therefore, a need to allow the user to simultaneously provide multiple sets of positive and negative examples and to use them in an on-line setting.
The process of searching a database is complex and time consuming. Data structures, called indexes or indexing structures, are used to speed up the process. In particular, multidimensional access methods simultaneously index several variables, for instance, as described in V. Gaede et al., “Multidimensional Access Methods,” ACM Computing Surveys, 30(2), pp. 170-231, June 1998, the disclosure of which is incorporated by reference herein. Multidimensional indexing methods are used for point queries and range queries, as disclosed in the above-referenced V. Gaede et al. article, and for nearest-neighbor queries, as disclosed in B. S. Kim et al., “A Fast K Nearest Neighbor Algorithm Based on the Order Partition,” IEEE Trans. Pattern Analysis and Machine Intelligence, PAMI-8(6), pp. 761-766, November 1986, the disclosure of which is incorporated by reference herein, but not for similarity queries based on multiple example sets. There is, therefore, a need for indexing structures supporting similarity queries based on multiple positive and negative example sets.