The prevalence of Internet commerce, social networking, and web search in recent years has produced a wealth of data about the preferences of individual users of such services. Various solutions have been inspired by work in the fields of statistics and machine learning that provide automated mechanisms to find exploitable patterns in such data. The exploitable patterns may be used to ultimately provide better recommendations of products, services, information, social connections, and other options or items to individuals (or groups of individuals). The increased quality of recommendations provided improves user experience, satisfaction, and uptake of these services.
With the abundance of preference data from search engines, review sites, etc., there is tremendous demand for learning detailed models of user preferences to support personalized recommendation, information retrieval, social choice, and other applications. Much work has focused on ordinal preference models and learning user or group “rankings” of items. Two classes of models are distinguishable. Models of the first class seek to learn an underlying objective (or “correct”) ranking from noisy data or noisy expressions of user preferences (e.g., as in web search, where user selection suggests relevance). Models of the second class assume that users have different “types” with inherently distinct preferences, and aim to learn a population model that explains this diversity. Learning preference types (e.g., by segmenting or clustering the population) can be critical to effective personalization and preference elicitation: e.g., with a learned population preference distribution, choice data from a specific user allows inferences to be drawn about her preferences.
One aspect of research in this domain has focused on leveraging product ratings (typically given on a small numerical scale) and users' profile data to predict the missing ratings or preferences of individual users (e.g., how much will user A like a movie M that she has not yet seen). This is known as “collaborative filtering”, because the prediction algorithms aggregate the collective, and usually partial, preferences of all users. These approaches take into account the diversity of preferences across users. See, for example, the papers “Probabilistic Matrix Factorization” by R. Salakhutdinov and A. Mnih, Neural Information Processing Systems 2008, and “Learning from incomplete data” by Z. Ghahramani and M. I. Jordan, MIT Artificial Intelligence Memo No. 1509. There are a variety of commercially relevant recommender systems based on collaborative filtering.
Considerable work in machine learning has exploited ranking models developed in the statistics and psychometrics literature, such as the Mallows model (Mallows, 1957), the Plackett-Luce model (Plackett, 1975; Luce, 1959), the Thurstonian model, and several others (see J. I. Marden, “Analyzing and Modeling Rank Data”, Chapman and Hall, 1995). This work involves learning probability distributions over the ranking preferences of a user population. The Mallows model has attracted particular attention in the machine learning community.
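For orientation, the Mallows model is commonly written as a distribution in which the probability of a ranking r decays with its Kendall tau distance d(r, σ) from a reference ranking σ, i.e., P(r) is proportional to φ^d(r, σ) for a dispersion parameter φ in (0, 1]. The following is a minimal sketch under that standard parameterization; the function names and the symbol phi are illustrative assumptions and are not notation taken from the text above.

```python
from itertools import combinations


def kendall_tau(r, sigma):
    """Count the item pairs that rankings r and sigma order differently."""
    pos = {item: i for i, item in enumerate(sigma)}
    # combinations(r, 2) yields (a, b) with a ranked above b in r;
    # the pair is a disagreement when sigma ranks b above a.
    return sum(1 for a, b in combinations(r, 2) if pos[a] > pos[b])


def mallows_prob(r, sigma, phi):
    """Probability of ranking r under a Mallows model (assumed form).

    sigma is the reference ranking and phi the dispersion; smaller phi
    concentrates mass near sigma, and phi = 1 gives the uniform distribution.
    """
    m = len(sigma)
    # Normalizing constant: product over i = 1..m of (1 + phi + ... + phi^(i-1)).
    z = 1.0
    for i in range(1, m + 1):
        z *= sum(phi ** k for k in range(i))
    return phi ** kendall_tau(r, sigma) / z
```

Under these assumptions the reference ranking itself is the most probable ranking, and each additional pairwise disagreement with it multiplies the probability by phi.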
However, research to date provides methods for learning preference distributions using only very restricted forms of evidence about individual user preferences, ranging from full rankings, to top-t/bottom-t items, to partitioned preferences (Lebanon & Mao, 2008). Missing from this list are arbitrary pairwise comparisons of the form “a is preferred to b.” Such pairwise preferences form the building blocks of almost all reasonable evidence about preferences, and subsume the most general evidential models proposed in the literature. Furthermore, preferences in this form naturally arise in active elicitation of user preferences and in choice contexts (e.g., web search, product comparison, advertisement clicks), where a user selects one alternative over others (Louviere et al., 2000).
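To illustrate why pairwise comparisons subsume the restricted forms of evidence listed above, the following sketch expands a full ranking and a top-t list into sets of “a is preferred to b” pairs. The helper names and the tuple encoding (preferred, less_preferred) are hypothetical choices made for illustration only.

```python
from itertools import combinations


def pairs_from_full_ranking(ranking):
    """Every item is preferred to every item listed after it."""
    return set(combinations(ranking, 2))


def pairs_from_top_t(top, items):
    """Top-t evidence: the top items are ordered among themselves and
    each is preferred to every unranked item; nothing is implied about
    the relative order of the unranked items."""
    rest = [x for x in items if x not in top]
    pairs = set(combinations(top, 2))
    pairs |= {(a, b) for a in top for b in rest}
    return pairs


def pairs_from_choice(chosen, alternatives):
    """A single choice event: the selected item beats each alternative shown."""
    return {(chosen, other) for other in alternatives if other != chosen}
```

The converse does not hold in general: an arbitrary set of pairs such as {(a, b), (c, d)} corresponds to no single full ranking, top-t list, or partition, which is why evidence of this form is strictly more general.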
While learning with pairwise preferences is clearly of great importance, it is widely believed that this problem is impractically difficult. For instance, the Mallows model is often shunned in favour of more inference-friendly models (e.g., the Plackett-Luce model, which accommodates more general, but still restrictive, preferences (Cheng et al., 2010; Guiver & Snelson, 2009)). To date, no methods have been proposed for learning from arbitrary paired preferences in any commonly used ranking model.
Examples of relevant prior art include: Amazon.com, which recommends consumer products based on past purchases, product details viewed, and other relevant features; and Netflix.com, which recommends movies primarily based on movie ratings on a predefined scale.
Another aspect that has been the subject of prior art research is finding an objective, or ground truth, ranking of items based on expert relevance ratings or (noisy) user feedback in the form of comparisons on pairs of items. Algorithms for this problem have typically been applied in the domain of web search engines, where an objective ranking must be output for a given user search query. Some relevant papers on this subject are referenced below and in the paper Tyler Lu & Craig Boutilier, “Learning Mallows Models with Pairwise Preferences.” Notably, such algorithms have been applied in large commercial search engines such as Google™ and Microsoft Bing™.
Much of the prior art has focused on learning (i.e., inferring parameters) for such models or mixtures thereof (i.e., several Mallows distributions combined together, each forming a cluster) given very restrictive forms of preferences used as evidence/observations from which the model is to be learned. Existing prior art techniques require, for example, that observations of user preferences take the form of a full ranking, a partial ranking consisting of the top few items, or other such variations. Relevant prior art references include the following:
Burges, C. From RankNet to LambdaRank to LambdaMART: An overview. TR-2010-82, Microsoft Research, 2010.
Busse, L. M., Orbanz, P., and Buhmann, J. M. Cluster analysis of heterogeneous rank data. ICML, pp. 113-120, 2007.
Cheng, W., Dembczynski, K., and Hüllermeier, E. Label ranking methods based on the Plackett-Luce model. ICML-10, pp. 215-222, Haifa, 2010.
Doignon, J., Pekec, A., and Regenwetter, M. The repeated insertion model for rankings: Missing link between two subset choice models. Psychometrika, 69(1):33-54, 2004.
Dwork, C., Kumar, R., Naor, M., and Sivakumar, D. Rank aggregation methods for the web. WWW-01, pp. 613-622, Hong Kong, 2001.
Guiver, J. and Snelson, E. Bayesian inference for Plackett-Luce ranking models. ICML-09, pp. 377-384, 2009.
Kamishima, T., Kazawa, H., and Akaho, S. Supervised ordering: an empirical survey. IEEE Data Mining-05, pp. 673-676, 2005.
Lebanon, G. and Mao, Y. Non-parametric modeling of partially ranked data. J. Machine Learning Research, 9:2401-2429, 2008.
Louviere, J., Hensher, D., and Swait, J. Stated Choice Methods: Analysis and Application. Cambridge, 2000.
Luce, R. D. Individual choice behavior: A theoretical analysis. Wiley, 1959.
Mallows, C. L. Non-null ranking models. Biometrika, 44:114-130, 1957.
Marden, J. I. Analyzing and modeling rank data. Chapman and Hall, 1995.
Murphy, T. B. and Martin, D. Mixtures of distance-based models for ranking data. Computational Statistics and Data Analysis, 41:645-655, 2003.
Neal, R. and Hinton, G. A view of the EM algorithm that justifies incremental, sparse, and other variants. In Jordan, M. (ed.), Learning in Graphical Models, pp. 355-368. MIT Press, Cambridge, Mass., 1999.
Plackett, R. The analysis of permutations. Applied Statistics, 24:193-202, 1975.
Young, P. Optimal voting rules. J. Economic Perspectives, 9:51-64, 1995.