The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
Search providers, such as Yahoo, Inc., provide search results to users in response to queries submitted by those users. Because search results may indicate hundreds or thousands of matching documents—i.e. “hits”—for a given query, it is usually helpful to sort those documents by relevance to the query. One technique for sorting documents is to rank the documents according to relevance scores calculated for each document. Search results that have been sorted in this fashion are hereinafter described as “ranked search results.”
One problem with generating ranked search results is that it is difficult to determine meaningful relevance scores for each document indicated by the search results. One approach for determining relevance scores relies on human editorial judgments. For example, the search provider may ask a person or group of persons to determine relevance scores for various documents matching a particular query. Unfortunately, obtaining human editorial judgments for every possible hit for every possible query that may be submitted to a search engine is prohibitively expensive, particularly as documents are continuously modified and/or added to a search repository. Moreover, human editorial judgments are prone to well-known errors and biases.
Some approaches for generating relevance scores rely on a ranking function (also known as a relevance function) instead of or in addition to human editorial judgments. Generally speaking, a ranking function accepts a document and/or features thereof as input. A feature is a quantification of an aspect of a document or of the relationship of a document to a query. Given a document and, in some cases, a query, a feature returns a value. Example input features include the number of times a search term from a query appears in a document, the location in which the search terms appear in that document, the proximity of one search term to another in a document, a likelihood that the document is “spam,” term weights, the URL depth of the document, the source of the document, the authority of the document, and so on. Based on this input, the ranking function calculates a relevance score.
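By way of illustration only, the relationship between features and a ranking function can be sketched as follows. The sketch assumes a hand-weighted linear ranking function over three of the example features named above; the `Document` type, the feature names, the weights in `WEIGHTS`, and the helper functions are hypothetical and are not taken from any particular search system.

```python
from dataclasses import dataclass


@dataclass
class Document:
    url: str
    text: str


def extract_features(query, doc):
    """Quantify aspects of a document and of its relationship to a query."""
    terms = set(query.lower().split())
    words = doc.text.lower().split()
    # Number of times any search term from the query appears in the document.
    term_frequency = sum(1 for word in words if word in terms)
    # Location of the first matching term (document length if none match).
    first_position = next(
        (i for i, word in enumerate(words) if word in terms), len(words)
    )
    # URL depth: path separators after the scheme's "//" (assumes a scheme).
    url_depth = max(doc.url.rstrip("/").count("/") - 2, 0)
    return {
        "term_frequency": float(term_frequency),
        "first_position": float(first_position),
        "url_depth": float(url_depth),
    }


# Hypothetical, hand-chosen weights; a real system would tune or learn these.
WEIGHTS = {"term_frequency": 1.0, "first_position": -0.1, "url_depth": -0.2}


def ranking_function(features, weights=WEIGHTS):
    """Combine feature values into a single relevance score."""
    return sum(weights[name] * value for name, value in features.items())


def rank(query, docs):
    """Return documents sorted by descending relevance score."""
    return sorted(
        docs,
        key=lambda doc: ranking_function(extract_features(query, doc)),
        reverse=True,
    )
```

A fixed linear combination is used here only because it is the simplest function that maps features to a score; practical ranking functions may combine tens or hundreds of features in non-linear ways.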
Because ranking functions may rely on tens or hundreds of input features, it is difficult to determine ranking functions that reliably approximate relevance, especially as relevance changes over time. One proposed solution for increasing the effectiveness and adaptability of a ranking function is to utilize click-through information to generate features for the ranking function. Click-through information indicates, for a particular query, which documents indicated in search results for that particular query were accessed by users who issued the particular query (i.e., which documents users “clicked” on). In essence, the solution teaches that one may calculate a relevance score of a document with respect to a query based, in part, on the click-through information available for that particular document. Such a technique is described in U.S. Patent Publication 2007/0255689 A1, by Sun et al., published on Nov. 1, 2007 and entitled “System and method for indexing web content using click-through features,” the entire contents of which are hereby incorporated by reference for all purposes as if fully set forth herein.
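As a minimal sketch of how click-through information might be turned into a feature, the following example computes a click-through rate for each query and document pair. The log format (tuples of query, document URL, and a clicked flag) and the function name `click_through_rates` are assumptions made for illustration; they do not describe the format used by any particular search provider or by the publication cited above.

```python
from collections import defaultdict


def click_through_rates(click_log):
    """Map each (query, url) pair to the fraction of impressions clicked.

    click_log is an iterable of (query, url, was_clicked) records, one per
    time a document was shown in the search results for a query.
    """
    impressions = defaultdict(int)
    clicks = defaultdict(int)
    for query, url, was_clicked in click_log:
        impressions[(query, url)] += 1
        if was_clicked:
            clicks[(query, url)] += 1
    return {pair: clicks[pair] / shown for pair, shown in impressions.items()}
```

The resulting rate could then be supplied to a ranking function alongside other features of the document.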
However, even utilizing click-through information, ranking functions are often unable to approximate the effectiveness of human editorial judgments in producing relevance scores. This ineffectiveness is due, in part, to problems in existing models for predicting relevance based on click-through information, particularly the well-known problem of positional bias. Positional bias, in short, refers to the tendency of users to pay attention to highly positioned documents in a set of search results while ignoring other documents in that set of search results, even though the other documents may be more relevant than the highly positioned documents. The difficulty of overcoming this problem is discussed in, for example, N. Craswell, et al., “An experimental comparison of click position-bias models,” in Proceedings of the international conference on web search and web data mining, pages 87-94, ACM 2008, the entire contents of which are hereby incorporated by reference for all purposes as if fully set forth herein.
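The effect of positional bias can be illustrated with a small numeric sketch. The example below compares a raw click-through rate with a rate normalized by the probability that a user examines each position at all. This normalization is one simple baseline offered only for illustration, not a solution to the problem; the examination probabilities in `EXAMINATION_PROB` and the impression data are invented.

```python
# Hypothetical probability that a user examines a result at each position.
EXAMINATION_PROB = {1: 1.0, 2: 0.6, 3: 0.3}


def raw_ctr(impressions):
    """Fraction of impressions clicked, ignoring position.

    impressions is a list of (position, was_clicked) records.
    """
    return sum(clicked for _, clicked in impressions) / len(impressions)


def position_normalized_ctr(impressions, exam_prob=EXAMINATION_PROB):
    """Clicks divided by the expected number of examined impressions."""
    clicks = sum(clicked for _, clicked in impressions)
    expected_examinations = sum(exam_prob[pos] for pos, _ in impressions)
    return clicks / expected_examinations


# Document A: always shown first, clicked on 30 of 100 impressions.
doc_a = [(1, True)] * 30 + [(1, False)] * 70
# Document B: always shown third, clicked on 15 of 100 impressions.
doc_b = [(3, True)] * 15 + [(3, False)] * 85
```

Under these assumed numbers, document A has the higher raw click-through rate (0.30 versus 0.15), yet document B has the higher position-normalized rate, because most users never examined the third position. A model that relies on raw clicks alone would therefore tend to favor A merely because it was displayed first.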
Many approaches for generating relevance scores rely on a “learned” ranking function. Rather than utilizing a static, human-determined ranking function, one may configure a search system to “learn” a ranking function using various machine learning techniques. Using the same machine learning techniques, one may continuously adapt the ranking function as time goes on. Generally speaking, these techniques involve teaching the search system what constitutes relevance by giving the search system various training sets of documents for which rankings are already known. For example, rankings may be known for a training set because the search provider has collected editorial judgments of the relevance of each of the documents in the training set to its associated query. The search system then uses a classifier, such as a neural network or decision tree, to iteratively refine a function of document features. The result of this process is a ranking function whose calculated relevance scores maximize the likelihood of producing the “target” rankings, i.e., the known rankings for each of the training sets of documents. This ranking function may then be used to compute relevance scores for documents whose relevance scores are not known.
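The iterative refinement described above can be sketched, under simplifying assumptions, as follows. Instead of a neural network or decision tree, the sketch fits a linear function of two features to editorial grades by batch gradient descent on squared error. The feature layout (term frequency, spam likelihood), the training data, and the hyperparameters are all hypothetical and are chosen only to keep the example small.

```python
def score(weights, features):
    """Relevance score as a weighted sum of feature values."""
    return sum(w * x for w, x in zip(weights, features))


def train_linear_ranker(examples, learning_rate=0.05, epochs=1000):
    """Fit weights so that scores approximate known editorial grades.

    examples is a list of (feature_vector, editorial_grade) pairs.
    """
    n_features = len(examples[0][0])
    weights = [0.0] * n_features
    for _ in range(epochs):
        gradient = [0.0] * n_features
        for features, grade in examples:
            error = score(weights, features) - grade
            for j, value in enumerate(features):
                # Gradient of the mean squared error w.r.t. weight j.
                gradient[j] += 2 * error * value / len(examples)
        # Step against the gradient to reduce the error.
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
    return weights


# Hypothetical training set: ([term_frequency, spam_likelihood], grade).
TRAINING_SET = [
    ([3.0, 0.1], 3.0),
    ([1.0, 0.9], 0.0),
    ([2.0, 0.2], 2.0),
    ([0.0, 0.8], 0.0),
    ([4.0, 0.0], 4.0),
]
```

Once trained, `score(weights, features)` may be applied to documents that have no editorial grade, which corresponds to the final step described above. With this training set, the learned weight for term frequency is positive and the learned weight for spam likelihood is negative.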
Techniques for learning a ranking or relevance function are described in, for example, C. Burges, et al., “Learning to rank using gradient descent,” in Proceedings of the 22nd international conference on Machine learning, pages 89-96, 2005; Z. Zheng, et al., “A general boosting method and its application to learning ranking functions for web search,” in Advances in Neural Information Processing Systems 20, pages 1697-1704, MIT Press 2008; U.S. Pat. No. 7,197,497 to Cossock, et al., entitled “Method and apparatus for machine learning a document relevance function” and issued Mar. 27, 2007; and U.S. patent application Ser. No. 11/863,453 by Olivier Chapelle, filed Sep. 28, 2007, and entitled “Gradient Based Optimization of a Ranking Measure,” the entire contents of each of which are hereby incorporated by reference for all purposes as if fully set forth herein.
However, learned ranking functions still often yield unsatisfactory results. This problem results from, among other factors, imperfections in the known rankings (for example, human errors and biases) and limitations on the size and number of training sets available. Furthermore, although learned ranking functions typically require human editorial judgments for only a small portion of queries and documents, they still require such judgments, which can be difficult and expensive to obtain.
It is therefore desirable to provide more efficient techniques for generating a ranking function. It is furthermore desirable to provide more efficient techniques for determining the relevance of a document to a particular query. It is furthermore desirable to overcome the problems of positional bias when utilizing click log information to model relevance.