Basic web-based content searching techniques are well known. Common examples are readily visible in publicly available Internet searching portals. With the organic growth of content on the Internet, searching techniques are only as good as their ability to prioritize or sort content references (e.g., description data and the hyperlink). Additionally, the vast amount of searchable content is typically searched using a limited number of relatively basic search terms, thus compounding the relevance concerns when returning search results.
Existing search result generation techniques recognize and incorporate relevance aspects when sorting and prioritizing search results. The sorting and prioritizing is typically a precursor operation to the generation of a search results page, which can be a hypertext markup language (HTML) page with hyperlinks and other content sent to the requesting user. For example, a first search results page may contain the first twenty-five links as sorted and prioritized by the search engine. Various engines may use different techniques for sorting and prioritizing the content. The search results page may be one of any number of pages, limited either by the number of search results or by a system limit that shows only a set number of results, for example, the first 500 results.
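By way of illustration only, the sort-then-paginate operation described above can be sketched as follows. This is a minimal sketch, not the technique of any particular search engine: the function name `paginate_results`, its parameters, and the assumption that each result already carries a numeric relevance score are all hypothetical. The defaults of twenty-five links per page and a 500-result cap are taken from the examples in the preceding paragraph.

```python
from typing import List, Tuple


def paginate_results(
    scored: List[Tuple[str, float]],
    page: int,
    page_size: int = 25,     # links per results page (example value from the text)
    max_results: int = 500,  # system-imposed cap (example value from the text)
) -> List[str]:
    """Return the links for one results page (hypothetical helper).

    `scored` holds (link, relevance_score) pairs; how the score is
    computed is outside the scope of this sketch.
    """
    if page < 1:
        raise ValueError("page numbers start at 1")
    # Sort and prioritize: highest relevance score first.
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    # Apply the system limit before splitting into pages.
    capped = ranked[:max_results]
    start = (page - 1) * page_size
    return [link for link, _ in capped[start:start + page_size]]
```

Under these assumptions, 600 scored results yield twenty full pages, and a request for page twenty-one returns an empty list because of the 500-result cap.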
In existing techniques, the relevance score of a document is calculated based solely on attributes of the document and the query, such as term statistics, site authority, document-query similarities, etc. The term "documents," as used herein, refers generally to any suitable type of content that is accessible and viewable through the Internet, including HTML-encoded documents, proprietary-encoded documents (e.g., PDFs), audio and/or video files, images, etc.
Existing systems discard any information associated with a user selection of a particular link, e.g., by clicking. Rather, these systems merely accept the user selection as a retrieval or redirection command and do not utilize this information for additional calculations. User selection is, in effect, an implicit indication by the user of the relevance of the document: if the user selects the document, the document can be inferred to have a greater degree of relevance to the original query. This user selection is a direct feedback mechanism ignored by existing searching systems. The inference of relevance from a user selection is especially strong for image and video searches, because the users are presented with thumbnails and descriptions of the images and/or videos within the search results page. If users frequently click a video item for a query, this video has a very high probability of being relevant to the query.
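By way of illustration only, the selection information that existing systems discard could be aggregated as sketched below. This is a minimal sketch under stated assumptions and is not a description of any claimed model: the log format of (query, document, clicked) records and the function name `click_through_rates` are hypothetical, and a plain click-through rate is used merely as the simplest possible signal.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def click_through_rates(
    log: Iterable[Tuple[str, str, bool]],
) -> Dict[Tuple[str, str], float]:
    """Compute clicks / impressions per (query, document) pair.

    Each log record is (query, document_id, clicked), an assumed format
    in which one record is written every time a document is shown.
    """
    impressions: Dict[Tuple[str, str], int] = defaultdict(int)
    clicks: Dict[Tuple[str, str], int] = defaultdict(int)
    for query, doc_id, clicked in log:
        impressions[(query, doc_id)] += 1
        if clicked:
            clicks[(query, doc_id)] += 1
    # Every key in `impressions` has a count of at least 1, so the
    # division below cannot divide by zero.
    return {key: clicks[key] / impressions[key] for key in impressions}
```

For example, a video shown three times for a query and clicked twice receives a rate of 2/3, while an item shown but never clicked receives 0. A raw rate of this kind does not account for factors such as the position of an item on the page, so it is at most a starting point.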
Therefore, there exists a need for an improved search result generation technique that models and incorporates user selection activity to improve relevance calculations, and that uses the learned model to derive improved relevance calculations for generating improved search result pages.