Improvements in computer hardware and technology coupled with the multiplication of connected mobile electronic devices have sparked interest in developing solutions for task automation, outcome prediction, information classification and learning from experience, resulting in the field of machine learning. Machine learning, closely related to data mining, computational statistics and optimization, explores the study and construction of algorithms that can learn from and make predictions on data.
The field of machine learning has evolved extensively in the last decade, giving rise to self-driving cars, speech recognition, image recognition, personalization, and understanding of the human genome. In addition, machine learning enhances different information retrieval activities, such as document searching, collaborative filtering, sentiment analysis, and so forth.
Machine learning algorithms (MLAs) may generally be divided into broad categories such as supervised learning, unsupervised learning and reinforcement learning. Supervised learning consists of presenting a machine learning algorithm with training data consisting of inputs and outputs labeled by assessors, where the goal is to train the machine learning algorithm such that it learns a general rule for mapping inputs to outputs. Unsupervised learning consists of presenting the machine learning algorithm with unlabeled data, where the goal is for the machine learning algorithm to find a structure or hidden patterns in the data. Reinforcement learning consists of having an algorithm evolve in a dynamic environment without providing the algorithm with labeled data or corrections.
Search engines are now widely used for information searching and retrieval: they allow documents to be identified, ranked in response to user queries, and then supplied to the users. Learning to rank (LTR) is the application of machine learning in the construction of ranking models for information retrieval and is a common search engine tool for ranking documents in response to user queries. Generally, a system may maintain a pool of documents, where a ranking model may rank documents responsive to a query and then return the most relevant documents. The ranking model may have been previously trained on labeled training documents. The sheer volume of documents available on the Internet, combined with its continuous growth, makes labeling not only difficult but also costly in computational and monetary resources, as labeling is often performed by human assessors. Furthermore, the labels assigned by human assessors to a given document may be prone to errors.
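By way of non-limiting illustration only, the general flow of scoring a pool of documents against a query and returning the most relevant documents may be sketched as follows. The names rank_documents and term_overlap are hypothetical, and the term-overlap scorer is merely an assumed stand-in for a trained ranking model; it is not a ranking model described in any of the references discussed herein.

```python
from typing import Callable, List, Tuple


def rank_documents(
    query: str,
    pool: List[str],
    score_fn: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every document in the pool and return the top_k by score."""
    scored = [(doc, score_fn(query, doc)) for doc in pool]
    # Highest score first; Python's sort is stable, so ties keep pool order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def term_overlap(query: str, doc: str) -> float:
    """Assumed placeholder scorer: fraction of query terms found in the document."""
    query_terms = set(query.lower().split())
    doc_terms = set(doc.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & doc_terms) / len(query_terms)


pool = [
    "learning to rank for search",
    "cooking pasta at home",
    "rank documents with machine learning",
]
results = rank_documents("machine learning rank", pool, term_overlap, top_k=2)
for doc, score in results:
    print(f"{score:.2f}  {doc}")
```

In a learning-to-rank setting, the placeholder scorer would be replaced by a model trained on labeled query-document pairs, which is why the quality of those labels matters.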
U.S. Pat. No. 8,935,258 issued on Jan. 13, 2015, to Svore et al. teaches identifying sample data items having the greatest likelihood of being mislabeled when previously judged, and selecting those data items for re-judging. In one aspect, lambda gradient scores are summed for pairs of sample data items to compute re-judgment scores for each of those sample data items. The re-judgment scores indicate a relative likelihood of mislabeling. Once the selected sample data items are re-judged, a new training set is available, whereby a new ranker may be trained.
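For illustration only, the general idea of summing pairwise gradient magnitudes into a per-item re-judgment score may be sketched as follows. This is a simplified sketch under stated assumptions and is not the implementation taught by Svore et al.: it uses a RankNet-style pairwise gradient magnitude, omits the ranking-metric weighting commonly applied in lambda-gradient methods, and the function names, labels and scores are hypothetical.

```python
import math
from itertools import combinations
from typing import List


def pairwise_lambda(score_hi: float, score_lo: float, sigma: float = 1.0) -> float:
    """Gradient magnitude for a pair whose first item is labeled more relevant.

    The value approaches sigma when the model scores the pair in the opposite
    order to the labels, and approaches 0 when the model agrees with the labels.
    """
    return sigma / (1.0 + math.exp(sigma * (score_hi - score_lo)))


def rejudgment_scores(
    labels: List[int], scores: List[float], sigma: float = 1.0
) -> List[float]:
    """Sum the pairwise gradient magnitudes over every pair an item takes part in."""
    totals = [0.0] * len(labels)
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            continue  # equally labeled pairs carry no ordering preference
        hi, lo = (i, j) if labels[i] > labels[j] else (j, i)
        lam = pairwise_lambda(scores[hi], scores[lo], sigma)
        totals[hi] += lam
        totals[lo] += lam
    return totals


# Hypothetical data: item 1 is labeled 0 (irrelevant) but scored highly by the model.
labels = [3, 0, 2, 1]
scores = [2.5, 2.4, 1.0, 0.2]
totals = rejudgment_scores(labels, scores)
most_suspect = max(range(len(totals)), key=totals.__getitem__)
print(most_suspect, [round(t, 3) for t in totals])
```

In this sketch, the item whose label disagrees most with the model's scores across its pairs accumulates the largest total and would be the first candidate for re-judging.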
Additionally, in some instances, ranking models employed by search engines evaluate document relevance based on previous user interactions or feedback associated with documents. Therefore, sparseness of data associated with previous user interactions or feedback limits the pool size and variety of training data used during training of certain ranking models and, in turn, may limit the capabilities of search engines to effectively rank some documents according to their relevance to the user query.
U.S. Patent Publication No. 2012/0109860 A1 by Xu et al. teaches that training data is used by learning-to-rank algorithms for formulating ranking algorithms. The training data can be initially provided by human judges, and then modeled in light of user click-through data to detect probable ranking errors. The probable ranking errors are provided to the original human judges, who can refine the training data in light of this information.
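As a rough, non-limiting sketch of how judge-provided labels might be compared against click-through data to surface probable errors, the following example counts, for each document, the pairs in which the judged order disagrees with the observed click-through rate by more than a margin. This is not the model taught by Xu et al.; the function names, the margin parameter and the data are assumptions, and the sketch ignores known effects such as position bias in clicks.

```python
from typing import Dict, List, Tuple

# (document id, judge label, clicks, impressions); all values are hypothetical.
Record = Tuple[str, int, int, int]


def click_through_rate(clicks: int, impressions: int) -> float:
    """Clicks divided by impressions, or 0.0 when the document was never shown."""
    return clicks / impressions if impressions else 0.0


def probable_label_errors(
    records: List[Record], margin: float = 0.1
) -> List[Tuple[str, int]]:
    """Count, per document, the pairs where labels and click-through rates disagree."""
    discordant: Dict[str, int] = {doc_id: 0 for doc_id, _, _, _ in records}
    for a in range(len(records)):
        for b in range(a + 1, len(records)):
            hi, lo = records[a], records[b]
            if hi[1] == lo[1]:
                continue  # same judge label, no ordering to contradict
            if hi[1] < lo[1]:
                hi, lo = lo, hi  # make hi the document with the higher label
            ctr_hi = click_through_rate(hi[2], hi[3])
            ctr_lo = click_through_rate(lo[2], lo[3])
            # Users clicked the lower-labeled document noticeably more often.
            if ctr_lo - ctr_hi > margin:
                discordant[hi[0]] += 1
                discordant[lo[0]] += 1
    # Documents involved in the most disagreements are listed first.
    return sorted(discordant.items(), key=lambda item: item[1], reverse=True)


records = [
    ("d1", 3, 50, 100),
    ("d2", 2, 30, 100),
    ("d3", 0, 70, 100),  # labeled irrelevant, yet clicked most often
    ("d4", 1, 5, 100),
]
flagged = probable_label_errors(records)
print(flagged)
```

Documents at the head of the returned list would be the ones referred back to human judges for review in a workflow of this kind.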
For the foregoing reasons, there is a need for methods and systems for identifying potentially erroneously ranked documents by a machine learning algorithm.