Search applications (commonly referred to as search engines) are designed to retrieve documents from a database based on a query containing one or more search terms. In particular, conventional search applications may retrieve documents having at least one document property matching the terms in the query. The document property may include text contained in the document. Examples of text contained in the document include the title and the body of the document. The document property may further include text that is not contained in the document. An example of text that is not contained in the document is anchor text, which is the text contained in a hyperlink to the document.
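The retrieval step described above may be illustrated with a minimal sketch. The `Document` class, the `tokenize`, `matches`, and `retrieve` helpers, and the whitespace tokenization are all hypothetical and chosen for illustration. The sketch also assumes that a document is a candidate when at least one of its properties (title, body, or anchor text) contains every term in the query, which is only one possible reading of "matching".

```python
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical document record used only for this illustration."""
    title: str
    body: str
    # Anchor text is NOT contained in the document itself; it comes from
    # hyperlinks that point to the document.
    anchor_texts: tuple = ()


def tokenize(text):
    """Lowercase the text and split on whitespace (a simplifying assumption)."""
    return set(text.lower().split())


def matches(doc, query):
    """Return True if any single document property contains every query term."""
    terms = tokenize(query)
    properties = [doc.title, doc.body, *doc.anchor_texts]
    return any(terms <= tokenize(prop) for prop in properties)


def retrieve(documents, query):
    """Return the candidate documents, i.e., those satisfying the query."""
    return [doc for doc in documents if matches(doc, query)]
```

Under these assumptions, a document whose title and body never mention "tutorial" may still be retrieved for the query "ranking tutorial" if a hyperlink pointing to it carries that anchor text.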
In order to return more relevant documents, the search application may rank a set of candidate documents (i.e., the set of documents satisfying the query) according to their predicted relevance. The search application may then return the documents in order of their relevance. For example, the search application may return the documents in order from the most relevant to the least relevant. If a large number of documents are retrieved in response to a given query, the search application may also return only the most relevant subset of the candidate documents. In this way, the user can review the returned documents more efficiently.
The search application may rank documents according to a ranking function, which may assign a ranking score to each candidate document. For example, a simple ranking function may assign greater relevance to documents in which the search terms are contained in the titles of the documents. The quality of the ranked documents returned by the search application depends largely on the ranking function itself, as well as on the ranking features (i.e., the inputs) entered into the ranking function. Thus, the quality of the ranked documents may be improved not only by improving the ranking function, but also by improving the ranking features entered into it.
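The relationship between ranking features and a ranking function may be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two features (`title_hits` and `body_hits`), the linear scoring form, the weights of 3.0 and 1.0, and the representation of each candidate as a `(title, body)` pair. It is a sketch of the simple title-weighted example mentioned above, not a description of any particular search application.

```python
def ranking_features(title, body, query):
    """Compute hypothetical ranking features (the inputs to the ranking function)."""
    terms = query.lower().split()
    title_tokens = title.lower().split()
    body_tokens = body.lower().split()
    return {
        "title_hits": sum(term in title_tokens for term in terms),
        "body_hits": sum(term in body_tokens for term in terms),
    }


def ranking_score(features, title_weight=3.0, body_weight=1.0):
    """Assumed linear ranking function: title matches count more than body matches."""
    return (title_weight * features["title_hits"]
            + body_weight * features["body_hits"])


def rank(candidates, query, top_k=None):
    """Order (title, body) candidates from most to least relevant.

    If top_k is given, return only the top_k highest-scoring candidates.
    """
    ordered = sorted(
        candidates,
        key=lambda doc: ranking_score(ranking_features(doc[0], doc[1], query)),
        reverse=True,
    )
    return ordered if top_k is None else ordered[:top_k]
```

In this sketch the ranking function is held fixed, so any change in the ordering comes from the features. Adding a further feature, such as a count of matches in anchor text, would change the scores without changing the scoring form, which is the sense in which better ranking features may yield better ranked results.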
It is with respect to these considerations and others that the disclosure made herein is presented.