Search engines may use ranking functions to determine an order in which documents are presented in response to a received query. Ranking functions may take a number of document features as input and provide a set of document rankings as output. Features may be described as attributes of a document that may be used by a ranking function to determine the rank of a particular document for a particular query.
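As a minimal illustrative sketch, a ranking function of this kind may be modeled as a weighted sum over document features, with documents then ordered by the resulting score. The feature names, the weight values, and the helper names `score` and `rank` below are hypothetical and chosen only for illustration; they are not taken from any particular search engine.

```python
from typing import Dict, List

# Hypothetical feature weights (illustrative values only, not from the source).
WEIGHTS: Dict[str, float] = {
    "pagerank": 0.5,
    "term_frequency": 0.4,
    "document_length": -0.1,
}


def score(features: Dict[str, float]) -> float:
    """Combine a document's feature values into one relevance score."""
    # Features missing from a document are treated as 0.0.
    return sum(weight * features.get(name, 0.0) for name, weight in WEIGHTS.items())


def rank(documents: Dict[str, Dict[str, float]]) -> List[str]:
    """Return document ids ordered from highest to lowest score."""
    return sorted(documents, key=lambda doc_id: score(documents[doc_id]), reverse=True)


# Hypothetical feature values for three candidate documents of one query.
candidates = {
    "doc_a": {"pagerank": 0.9, "term_frequency": 0.2, "document_length": 0.5},
    "doc_b": {"pagerank": 0.1, "term_frequency": 0.9, "document_length": 0.5},
    "doc_c": {"pagerank": 0.3, "term_frequency": 0.3, "document_length": 1.0},
}
ordering = rank(candidates)
```

In this sketch the weights are fixed by hand. In a learning-to-rank setting, they would instead be fitted from labeled query-document examples.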
Modern search engines may use a large number of features to rank documents, such as PageRank, term frequency, document length, etc. When a large number of features is used by a search engine, learning to rank (i.e., applying machine learning to the ranking problem) may be an effective solution for building a ranking function model. To build and refine the model, learning solutions may use document labeling, in which a human operator gives a score to each of the documents associated with a query on a scale ranging from “relevant” to “irrelevant”. Such labeling efforts may be time-consuming and expensive, which may create pressure to limit the number of documents that are labeled. However, improperly limiting the number of documents used in the training process may decrease the effectiveness of the learned model.