Information retrieval (IR) is the science of searching for documents, for information within documents, and for metadata about documents, as well as searching relational databases and the Internet. Internet search engines are the most visible type of IR application. IR applications use ranking models produced by algorithms that are trained to rank identified information sources (such as documents and URLs). These algorithms are commonly called “learning to rank algorithms”.
Learning to rank algorithms automatically construct ranking models from training data. The training data is used by the learning to rank algorithms to produce a ranking model which determines the relevance of information sources to actual queries. The purpose of the ranking model is to rank unseen lists of information sources in a manner that is similar to rankings that are present in the training data. Conventional learning to rank algorithms include lambda gradient type learning to rank algorithms among others.
Lambda gradient type learning to rank algorithms determine “lambdas” or “gradients” for identified information sources or “results” and use the gradients to improve the ranking model during training of the learning to rank algorithm. The gradients are associated with the results and indicate the direction in which, and the extent to which, a result is desired to move within a ranked ordering of results. Lambda gradient type learning to rank algorithms are trained iteratively. At each iteration, the gradients (lambdas) are recalculated after the results have been sorted based on the scores assigned by the model at the current training iteration.
The gradients are determined by pairing individual results in a sorted list of results with other results in the sorted list of results and determining the contribution of the individual results to each of the pairings. The contributions (which can be positive or negative) of an individual result to each of its pairings are summed to obtain a gradient for that result. More formally, if a given feature vector is denoted y, the gradient at y is the derivative of a cost function with respect to the ranking model score, evaluated at y.
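The pairing and summation described above can be sketched as follows. This is a minimal illustration only, not the method of any particular algorithm. It assumes a RankNet-style logistic pairwise cost, and it adopts the sign convention that a positive lambda means "move up" (that is, the negative derivative of the cost with respect to the score). The function name `pairwise_lambdas` and the sample scores and labels are hypothetical.

```python
import math

def pairwise_lambdas(scores, labels):
    """Return one summed lambda per result.

    Assumption: for each pair in which result i has a higher relevance
    label than result j, the cost is C = log(1 + exp(-(s_i - s_j))).
    Assumption: a positive lambda means "move this result up".
    """
    n = len(scores)
    lambdas = [0.0] * n
    for i in range(n):
        for j in range(n):
            if labels[i] <= labels[j]:
                continue  # keep only pairs where i is more relevant than j
            # Magnitude of dC/ds for this pair under the assumed cost.
            force = 1.0 / (1.0 + math.exp(scores[i] - scores[j]))
            lambdas[i] += force  # contribution pushing the better result up
            lambdas[j] -= force  # contribution pushing the worse result down
    return lambdas

# Hypothetical sorted list: the top-scored result is the least relevant.
scores = [2.0, 1.0, 0.5]
labels = [0, 2, 1]
lams = pairwise_lambdas(scores, labels)
```

Because each pairing contributes equal and opposite amounts to its two results, the lambdas in this sketch sum to zero across the list.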
The gradients are utilized during a given training iteration as follows. Suppose documents D1 and D2 are results in a ranked ordering of results for which gradients X have been determined. Where D2 is more relevant than D1, D1 will get a push downward (in the ranked ordering of results) of magnitude |X| and D2 will get a push upward of equal magnitude. Conversely, where D2 is less relevant than D1, D1 will get a push upward (in the ranked ordering of results) of magnitude |X| and D2 will get a push downward of equal magnitude.
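The push rule for a single pair can be written out directly. The sketch below is illustrative: the function name `pair_pushes` is hypothetical, positive values are taken to mean "upward in the ranked ordering", and the handling of equally relevant documents (no push) is an assumption, since the description above does not cover that case.

```python
def pair_pushes(rel_d1, rel_d2, x):
    """Return (push_d1, push_d2) for one pair of documents.

    Positive means upward in the ranked ordering (assumed convention).
    """
    magnitude = abs(x)
    if rel_d2 > rel_d1:
        # D2 is more relevant: D1 moves down, D2 moves up.
        return -magnitude, magnitude
    if rel_d2 < rel_d1:
        # D2 is less relevant: D1 moves up, D2 moves down.
        return magnitude, -magnitude
    # Assumption: equally relevant documents exert no push on each other.
    return 0.0, 0.0

more_relevant_d2 = pair_pushes(rel_d1=0, rel_d2=1, x=-0.7)
less_relevant_d2 = pair_pushes(rel_d1=2, rel_d2=1, x=0.4)
```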
Ranking quality measures or “metrics” may be used to determine how well a learning to rank algorithm is performing on training data and to compare the performance of different learning to rank algorithms. Ranking quality measures include Mean Reciprocal Rank (MRR), Mean Average Precision (MAP), Expected Reciprocal Rank (ERR) and Normalized Discounted Cumulative Gain (NDCG). These metrics generate a score that provides a measure of the ranking quality of the learning to rank algorithm. In many training applications, learning to rank problems are formulated as optimization problems with respect to one of the metrics, where training is continued until improvement in the score provided by the metric has been maximized.
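As an illustration of how such metrics yield a score, the sketch below computes NDCG and MRR for lists of relevance labels given in ranked order. It assumes one common formulation of DCG (gain 2^rel - 1 with a log2 position discount) and treats any label greater than zero as relevant for MRR. Other variants exist, and the function names are hypothetical.

```python
import math

def dcg(labels):
    """Discounted cumulative gain of labels listed in ranked order."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(labels))

def ndcg(labels):
    """DCG normalized by the DCG of the ideal (label-sorted) ordering."""
    ideal = dcg(sorted(labels, reverse=True))
    return dcg(labels) / ideal if ideal > 0 else 0.0

def reciprocal_rank(labels):
    """1 / rank of the first relevant result, or 0.0 if none is relevant."""
    for pos, rel in enumerate(labels):
        if rel > 0:
            return 1.0 / (pos + 1)
    return 0.0

def mean_reciprocal_rank(queries):
    """Average reciprocal rank over a non-empty list of ranked label lists."""
    return sum(reciprocal_rank(q) for q in queries) / len(queries)

good_ordering = ndcg([2, 1, 0])   # labels already in ideal order
poor_ordering = ndcg([0, 1, 2])   # most relevant result ranked last
mrr = mean_reciprocal_rank([[1, 0], [0, 1]])
```

A training procedure formulated around one of these metrics would recompute the score after each iteration and compare it with the previous value.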
Training learning to rank algorithms using conventional methodologies has some significant shortcomings. For example, some learning to rank algorithms may assign a particular relevance label (e.g., relevant, not as relevant, not relevant) to more than one result without adequate means of distinguishing the results that are assigned the same relevance label. In addition, some learning to rank algorithms have inadequate mechanisms for accurately gauging user intent. Accordingly, the effectiveness of the ranking models that are generated from such algorithms can be limited.