Ranking is widely used in computer systems. Any application that returns a rank-ordered set of results to a user is performing some type of ranking operation. Therefore, learning to rank can be a key component for a number of different applications.
One application where ranking items is useful is information retrieval, although ranking is useful in a wide variety of applications and information retrieval is only one example. Information retrieval search engines often return a set of documents or uniform resource locators (URLs) to a user in response to a query. For the sake of the present discussion, the results will be referred to in terms of URLs. This is not to limit the invention in any way, and the results could just as easily be other items such as documents, audio clips, or any other results desired by a user, whether provided in response to an information retrieval query or some other type of query.
When URLs are returned in response to a query in an information retrieval context, the URLs are ordered by relevance. In order to train a component to rank URLs by relevance to a query, training data is provided as a set of query/URL pairs, each with a feature vector describing characteristics of the query, URL, page, and other characteristics such as user behavior. This training data is used to learn a ranking function. The function could output a score for a given feature vector, with the ranking performed by sorting by that score; or it could output the rank value directly (for example, a positive integer, with ‘1’ meaning top position); or it could output some other structure encoding the rank.
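By way of illustration only, the first option described above (output a score, then sort by it) can be sketched as follows. The names `rank_by_score` and `linear_score`, the example URLs, the two-element feature vectors, and the weights are all hypothetical; the linear scorer merely stands in for whatever ranking function has been learned.

```python
from typing import Callable, List, Sequence, Tuple


def linear_score(weights: Sequence[float]) -> Callable[[Sequence[float]], float]:
    """Return a toy scoring function: a dot product with fixed weights (assumption)."""
    def score(features: Sequence[float]) -> float:
        return sum(w * x for w, x in zip(weights, features))
    return score


def rank_by_score(
    urls: Sequence[str],
    feature_vectors: Sequence[Sequence[float]],
    score_fn: Callable[[Sequence[float]], float],
) -> List[Tuple[int, str, float]]:
    """Score every URL for one query and sort by descending score.

    Returns (rank, url, score) tuples, where rank 1 is the top position.
    """
    scored = [(url, score_fn(x)) for url, x in zip(urls, feature_vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(pos + 1, url, s) for pos, (url, s) in enumerate(scored)]


# Hypothetical results for a single query, one feature vector per URL.
urls = ["a.example", "b.example", "c.example"]
feature_vectors = [[0.2, 1.0], [0.9, 0.5], [0.4, 0.1]]

ranking = rank_by_score(urls, feature_vectors, linear_score([1.0, 0.5]))
for rank, url, score in ranking:
    print(rank, url, round(score, 2))
```

In this sketch the rank is never produced by the function itself; it arises only from the sort applied to the scores.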
The training data is, in general, labeled by human beings, where the label indicates the quality of the particular URL, given the query. In one example of a ranking system, there are five labels used and they rate a URL as follows (the rating is indicative of how relevant the URL is to the query):
0=bad, 1=fair, 2=good, 3=excellent, 4=perfect.
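One possible in-memory representation of such labeled training data is sketched below. The class names `Relevance` and `TrainingExample`, the field names, and the sample values are assumptions made for illustration; only the five label values come from the example above.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Relevance(IntEnum):
    """The five human-assigned labels from the example rating scale."""
    BAD = 0
    FAIR = 1
    GOOD = 2
    EXCELLENT = 3
    PERFECT = 4


@dataclass
class TrainingExample:
    """One labeled query/URL pair with its feature vector (hypothetical layout)."""
    query: str
    url: str
    features: List[float]  # characteristics of the query, URL, page, user behavior
    label: Relevance       # quality of this URL, given the query


# Hypothetical training data for a single query.
training_data = [
    TrainingExample("learning to rank", "a.example", [0.2, 1.0], Relevance.GOOD),
    TrainingExample("learning to rank", "b.example", [0.9, 0.5], Relevance.PERFECT),
    TrainingExample("learning to rank", "c.example", [0.4, 0.1], Relevance.BAD),
]

# Labels are integers, so they can be compared and sorted directly.
best = max(training_data, key=lambda ex: ex.label)
print(best.url, int(best.label))
```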
Also, in most systems, the utility functions typically used to assess the quality of a ranking algorithm are very hard to optimize directly. This is in part due to the fact that such a utility function depends on the sorted order of the URLs (for a given query) and not directly on the scores, or labels, for the query.
One example of such a utility function is referred to as normalized discounted cumulative gain (NDCG). The NDCG utility function measures the ranking quality at and above a given rank level. For instance, NDCG at 10 gives a measure appropriate to the top 10 ranked URLs given a query. For a ranking function that outputs a score, which is used to rank the items by sorting, the NDCG function is either flat or discontinuous everywhere. Since most learning algorithms require a smooth utility function, the characteristics of the NDCG utility function present a very difficult problem for common learning algorithms.
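The flat-or-discontinuous behavior can be seen in a small sketch. The text above does not give a formula for NDCG, so the code assumes one commonly used formulation: a gain of 2^label − 1 for each URL, a discount of log2(1 + rank), and normalization by the value obtained for the ideal ordering. The function names `dcg_at_k` and `ndcg_at_k` and the sample scores and labels are hypothetical.

```python
import math
from typing import Sequence


def dcg_at_k(labels_in_ranked_order: Sequence[int], k: int) -> float:
    """Discounted cumulative gain over the top k positions.

    Assumed formulation: gain 2**label - 1, discount log2(1 + rank), rank from 1.
    """
    return sum(
        (2 ** label - 1) / math.log2(pos + 2)  # pos is 0-based, so rank = pos + 1
        for pos, label in enumerate(labels_in_ranked_order[:k])
    )


def ndcg_at_k(scores: Sequence[float], labels: Sequence[int], k: int) -> float:
    """NDCG at k for one query: rank by descending score, normalize by the ideal DCG."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked_labels = [labels[i] for i in order]
    ideal_dcg = dcg_at_k(sorted(labels, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant URLs for this query; a convention, not from the text
    return dcg_at_k(ranked_labels, k) / ideal_dcg


# Two URLs for one query: the first is labeled 4 (perfect), the second 0 (bad).
labels = [4, 0]

# Sweep the score of the first URL while the second stays fixed at 0.5.
for s in (0.1, 0.3, 0.49, 0.51, 0.7, 0.9):
    print(s, round(ndcg_at_k([s, 0.5], labels, k=2), 4))
```

Under these assumptions, the printed value does not change as the first score moves between 0.1 and 0.49, and it jumps to 1.0 once that score passes 0.5 and the sorted order changes. The gradient with respect to the score is therefore zero wherever it exists, which gives a gradient-based learner nothing to follow.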
The discussion above is merely provided for general background information and is not intended to be used as an aid in determining the scope of the claimed subject matter.