This specification relates to digital information retrieval, and particularly to processing search results.
The Internet enables access to a wide variety of resources, such as video or audio files, web pages for particular subjects, book articles, or news articles. A search system can identify resources in response to a search query that includes one or more search terms or phrases. The search system ranks the resources based on their relevance to the search query and on their importance, and provides search results that reference the identified resources. The search results are typically ordered according to a rank score assigned to each resource.
The relevance of a resource to a search query can be determined, in part, based on the textual content of the resource or textual content associated with the resource. For example, text included in the content of a resource can be compared to the search query to determine whether the resource is relevant to the search query. In turn, rank scores can be assigned to the resources based on the relevance determination and the resources can be ordered, in part, based on the rank scores.
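As an illustration only, the text-based relevance determination and ordering described above can be sketched as follows. This is a minimal sketch and not the scoring used by any particular search system: the `Resource` type, the `relevance` and `rank` functions, and the term-overlap measure are all hypothetical names and choices introduced here for the example.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical resource: a URL plus whatever text is associated with it."""
    url: str
    text: str


def relevance(query: str, resource: Resource) -> float:
    """Assumed measure: fraction of distinct query terms found in the resource text."""
    query_terms = set(query.lower().split())
    if not query_terms:
        return 0.0
    resource_terms = set(resource.text.lower().split())
    return len(query_terms & resource_terms) / len(query_terms)


def rank(query: str, resources: list[Resource]) -> list[tuple[Resource, float]]:
    """Assign each resource a rank score and order results from highest to lowest."""
    scored = [(r, relevance(query, r)) for r in resources]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Example usage with made-up resources.
results = rank(
    "red car",
    [
        Resource("https://example.com/photo.jpg", ""),
        Resource("https://example.com/review", "Red sports car review"),
    ],
)
for resource, score in results:
    print(resource.url, score)
```

Under this assumed measure, a resource with little or no associated text receives a low score regardless of what it actually depicts.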
While textual features associated with a resource can provide information by which a search system can determine the relevance of the resource to the search query, some resources contain little, if any, textual content that can be used to accurately determine the relevance of the resource. Similarly, textual content associated with a resource may be misleading as to the relevance of the resource to the search query and can lead to inconsistent relevance data. An image is an example of a resource that may not be associated with textual content that facilitates accurate determination of its relevance to the search query.