In learning to rank for information retrieval, a ranking model is constructed with training data consisting of queries, their corresponding retrieved documents, and relevance levels given by humans. In ranking, given a new query, the retrieved documents are ranked by using the trained ranking model.
In Information Retrieval (IR), ranking results are usually evaluated with measures such as MAP (Mean Average Precision) and NDCG (Normalized Discounted Cumulative Gain). Ideally, a learning algorithm would train a ranking model by optimizing performance in terms of a given evaluation measure, so that higher ranking accuracy can be expected. However, this is usually difficult because the IR measures depend only on the ranks of the documents and are therefore non-continuous and non-differentiable.
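To make the non-continuity concrete, the following is a minimal sketch of Average Precision and NDCG for a single query. The function names are illustrative, and the NDCG gain (2^rel - 1) and discount (log2 of position + 1) are one common convention assumed here, not necessarily the exact definition used elsewhere in this paper.

```python
import math

def rank_by_score(scores, labels):
    # Return the relevance labels ordered by descending model score.
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    return [labels[j] for j in order]

def average_precision(ranked_labels):
    # Binary relevance: any label > 0 counts as relevant.
    hits, total = 0, 0.0
    for pos, rel in enumerate(ranked_labels, start=1):
        if rel > 0:
            hits += 1
            total += hits / pos  # precision at this position
    return total / hits if hits else 0.0

def dcg(ranked_labels, k):
    # Assumed gain 2^rel - 1 and discount log2(position + 1).
    return sum((2 ** rel - 1) / math.log2(pos + 1)
               for pos, rel in enumerate(ranked_labels[:k], start=1))

def ndcg(ranked_labels, k):
    ideal = dcg(sorted(ranked_labels, reverse=True), k)
    return dcg(ranked_labels, k) / ideal if ideal > 0 else 0.0

labels = [1, 0, 1]  # hypothetical relevance labels for three documents
ap_a = average_precision(rank_by_score([0.9, 0.5, 0.1], labels))
ap_b = average_precision(rank_by_score([0.9, 0.5, 0.4], labels))  # same order
ap_c = average_precision(rank_by_score([0.9, 0.5, 0.6], labels))  # order flips
```

In this toy example, moving the third document's score from 0.1 to 0.4 leaves the ranking, and hence the measure, unchanged, while moving it past 0.5 makes the measure jump. The measure is piecewise constant in the scores, which is why gradient-based optimization cannot be applied to it directly.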
Many learning to rank algorithms proposed so far minimize a loss function that is only loosely related to the IR measures. For example, Ranking SVM and RankBoost minimize loss functions based on classification errors on document pairs.
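As an illustration of a pairwise loss, the sketch below computes a hinge loss over document pairs for one query. It is a simplified stand-in in the spirit of Ranking SVM (no regularization term, no optimization procedure), not the actual formulation of either algorithm; the function name and the margin of 1 are assumptions made for this example.

```python
def pairwise_hinge_loss(scores, labels, margin=1.0):
    # Sum a hinge penalty over every pair (i, j) in which document i
    # is labeled more relevant than document j.
    loss = 0.0
    n = len(scores)
    for i in range(n):
        for j in range(n):
            if labels[i] > labels[j]:
                loss += max(0.0, margin - (scores[i] - scores[j]))
    return loss

labels = [1, 0, 1]  # hypothetical relevance labels for three documents
loss_a = pairwise_hinge_loss([0.9, 0.5, 0.1], labels)
loss_b = pairwise_hinge_loss([0.9, 0.5, 0.4], labels)
```

Unlike a rank-based measure, this loss changes continuously as a score moves even when the document order stays the same, which makes it easy to optimize but only indirectly tied to the measure used for evaluation.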
Recently, researchers have developed several new algorithms that manage to directly optimize the performance in terms of the IR measures. For example, SVMmap and AdaRank minimize loss functions based on the IR measures.
There are still open questions regarding the direct optimization approach. (1) Is there a general theory that can guide the development of new algorithms? (2) What is the relation between existing methods such as SVMmap and AdaRank? (3) Which direct optimization method performs best empirically?