Image search refers to an information retrieval process in which a user enters a natural language query, for example via a text field provided by a search engine; an image collection is searched, and a list of image results ranked by relevance and other parameters is returned. Relevance is one of the major performance parameters of a search engine, and measures how well a returned result matches the user's query need. Images returned by an image search engine are in an unstructured pixel format, while queries entered by the user are in a text format. These two completely different information formats cannot be compared with each other directly by computation.
Currently, relevance characteristics of image search are described mainly using the following three approaches: 1. a text matching characteristic, which is obtained by comparing the text surrounding an image with the query; 2. a classification matching characteristic, which is obtained by comparing a classification label with the query, where the classification label is obtained by classifying the image content; and 3. a click-through rate characteristic, which is a measure of relevance between a specific image and the query, obtained by collecting statistics on click behavior and the like across a large number of user queries.
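By way of illustration only, the first two characteristics can be sketched as follows. The function names, the token-overlap score, and the binary label score are assumptions made for this example and are not prescribed by the approaches described above; practical systems may use other matching measures.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def text_match(query, surrounding_text):
    """Hypothetical text matching score: the fraction of query tokens
    that also occur in the text surrounding the image."""
    query_tokens = tokenize(query)
    if not query_tokens:
        return 0.0
    return len(query_tokens & tokenize(surrounding_text)) / len(query_tokens)


def label_match(query, labels):
    """Hypothetical classification matching score: 1.0 if any label
    produced by an image classifier shares a token with the query."""
    query_tokens = tokenize(query)
    return 1.0 if any(tokenize(label) & query_tokens for label in labels) else 0.0
```

For example, text_match("red apple", "A red apple on a table") scores every query token as matched, whereas the same query scores zero against surrounding text that describes something else, regardless of what the image actually shows.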
The above three methods for describing a relevance characteristic of image search all have limitations:
For the text matching characteristic: the text surrounding the image may be inconsistent with the image content, and in many cases cannot completely and accurately describe the content of the image, thus affecting the accuracy of the text matching characteristic.
The classification matching characteristic is limited by the completeness of the category system and the correctness of the classification model. Generally, the finer the category system is, the more difficult the classification becomes, the less accurate the classification model is, the further the classification result deviates semantically from the query text, and the more difficult the matching becomes. However, if the category system is too coarse, the matching with the query is not precise enough. Therefore, this characteristic generally plays only an auxiliary role.
The click-through rate characteristic is based mainly on user behavior statistics, and suffers from bias and noise on the one hand and from sparsity on the other. Sufficient click statistics can be collected only for images that are presented near the top of the results and that have appeared often enough under frequent queries; in other cases, either no click statistics can be collected, or the clicks are too sparse to be statistically significant.
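The sparsity problem can be illustrated with a minimal sketch. The log format (query, image identifier, clicked flag), the function name ctr_table, and the min_impressions threshold are hypothetical and chosen only for this example.

```python
from collections import Counter


def ctr_table(log, min_impressions=50):
    """Aggregate a click log into per-(query, image) click-through rates.

    log is an iterable of (query, image_id, clicked) records. Pairs that
    were shown fewer than min_impressions times receive None, because too
    few observations exist for the rate to be meaningful.
    """
    impressions = Counter()
    clicks = Counter()
    for query, image_id, clicked in log:
        impressions[(query, image_id)] += 1
        if clicked:
            clicks[(query, image_id)] += 1
    return {
        pair: (clicks[pair] / count if count >= min_impressions else None)
        for pair, count in impressions.items()
    }
```

Under these assumptions, an image shown 100 times for a frequent query receives a usable rate, while an image shown once receives None even if that single impression was clicked; an image that was never shown does not appear in the table at all.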