Image search is an information retrieval process in which a user enters a natural language query, for example via a text field provided by a search engine; the engine searches an image set and returns to the user image results sorted by an index such as correlation. Correlation, one of the major performance indexes of a search engine, measures how closely a returned result matches the user's query requirements. Images returned by an image search engine are in an unstructured pixel format, while queries entered by users are in a text format. Because these two information formats are completely different, an image and a query cannot be directly compared by computation.
Currently, correlation characteristics of image search are described mainly by the following three approaches: 1. a text matching characteristic, whereby the correlation is obtained by comparing the surrounding text of an image with the query; 2. a classification matching characteristic, whereby the correlation is obtained by comparing a classification label with the query, the classification label being obtained by classifying the image content; and 3. a click-through rate characteristic, whereby the correlation between a specific image and the query is measured by collecting statistics on the click behaviors and the like of a large number of users after they submit the query.
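As a rough illustration of the first approach, the text matching characteristic can be sketched as the fraction of query terms that also appear in the text surrounding an image. The function names `tokenize` and `text_match_score` and the simple term-overlap measure are assumptions made for illustration only; the source does not specify how the comparative calculation is performed.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def text_match_score(query, surrounding_text):
    """Hypothetical text matching characteristic.

    Returns the fraction of distinct query tokens that also occur in the
    surrounding text of the image (0.0 when the query has no tokens).
    """
    query_tokens = tokenize(query)
    if not query_tokens:
        return 0.0
    text_tokens = tokenize(surrounding_text)
    return len(query_tokens & text_tokens) / len(query_tokens)


if __name__ == "__main__":
    # Two of the three query terms ("red", "car") occur in the caption.
    print(text_match_score("red sports car", "A red car parked outside"))
```

A score computed this way depends only on the surrounding text, so it says nothing about whether the pixels of the image actually show what the text claims.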
The above-mentioned three approaches to describing the correlation characteristics of image search all have limitations:
For the text matching characteristic: the surrounding text of an image may be inconsistent with the image content, and in many cases cannot completely and accurately describe that content. The accuracy of the text matching characteristic is therefore affected.
The classification matching characteristic is limited by the completeness of the category system and the accuracy of the classification model. Generally, as the category system becomes finer, classification becomes more difficult and the classification model becomes less accurate. Moreover, the classification result deviates further semantically from the query text, so matching becomes more difficult. However, if the category system is too coarse, the matching with the query is not precise enough. Therefore, this characteristic generally plays only an auxiliary role.
The click-through rate characteristic is based mainly on statistics of user behavior, which suffer from bias and noise on the one hand and from sparsity on the other. Sufficient click statistics can be collected only for images that are ranked near the top and shown a sufficient number of times under high-frequency queries; in other cases, no click statistics can be collected, or the clicks are too sparse to be statistically significant.
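The sparsity limitation can be made concrete with a minimal sketch that reports a click-through rate only when an image has been shown often enough for the ratio to be meaningful. The function name `click_through_rate`, the `min_impressions` parameter, and its default of 100 are hypothetical choices for illustration; the source gives no threshold.

```python
def click_through_rate(clicks, impressions, min_impressions=100):
    """Hypothetical click-through rate characteristic.

    Returns clicks / impressions for a given (query, image) pair, or None
    when the image was shown fewer than `min_impressions` times and the
    statistic is treated as too sparse to use.
    """
    if impressions < min_impressions:
        return None
    return clicks / impressions


if __name__ == "__main__":
    # A top-ranked image under a high-frequency query: enough data.
    print(click_through_rate(clicks=30, impressions=1000))
    # A rarely shown image: the ratio exists but is not trustworthy.
    print(click_through_rate(clicks=1, impressions=3))
```

Under this sketch, most images for low-frequency queries would yield no value at all, which is the gap the paragraph above describes.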