In a text document search, a user typically enters a query into a search engine. The search engine evaluates the query against a database of indexed documents and returns a ranked list of the documents that best satisfy the query. For each document, the search engine algorithmically generates a score that measures how well the document satisfies the query. Commonly used scoring algorithms split the query into search terms and use statistical information about the occurrence of the individual terms in the body of text documents to be searched. The documents are listed in rank order according to their scores, so the user sees the best-matching results at the top of the search results list.
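By way of illustration only, term-statistics scoring of this kind can be sketched as follows. The sketch assumes a simple TF-IDF style weighting and whitespace tokenization, neither of which is specified above, and the function name `score_documents` is hypothetical.

```python
import math
from collections import Counter


def score_documents(query, documents):
    """Return (document index, score) pairs, best match first.

    Assumption: a basic TF-IDF sum over the query terms stands in for
    whatever scoring algorithm a real search engine would use.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scored = []
    for doc_id, tokens in enumerate(tokenized):
        term_counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term_counts[term] == 0:
                continue  # term absent from this document
            tf = term_counts[term] / len(tokens)          # term frequency
            idf = math.log(1 + n_docs / doc_freq[term])   # rarity of the term
            score += tf * idf
        scored.append((doc_id, score))

    # List documents in rank order, highest score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A document that repeats the query terms more often, relative to its length, rises toward the top of the list, while a document containing none of the terms scores zero.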
Certain search engines also improve the quality of the results by modifying the rank of the results with a selected ranking function. One exemplary prior art ranking function treats a link from one page to another as a vote cast for the linked page. The more votes that are cast for a page, the more important the page is considered to be. The ranking function can also take into account who cast the vote: the more important the voting page, the more weight its vote carries. These votes are accumulated and used as a component of the ratings of the pages on the network.
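The vote accumulation described above can be sketched, under assumptions, as an iterative computation in the style of the well-known PageRank formulation. The damping factor, the iteration count, and the function name `link_vote_scores` are illustrative choices and are not taken from the text.

```python
def link_vote_scores(links, damping=0.85, iterations=50):
    """Accumulate link votes, weighting each vote by the voter's own score.

    links maps a page to the list of pages it links to.
    Assumption: damping and iterations are illustrative defaults.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    scores = {page: 1.0 / n for page in pages}  # start with equal importance

    for _ in range(iterations):
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                share = damping * scores[page] / n
                for other in pages:
                    new_scores[other] += share
        scores = new_scores
    return scores
```

In a three-page network where pages `a` and `b` both link to `c`, and `c` links back to `a`, page `c` ends up with the highest score, and `a` outranks `b` because its single vote comes from the most important page.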
A ranking function is used to improve the quality of the ranking. Ranking functions can rely on a combination of content in the document (such as terms contained in the body or metadata of the document), data contained in other documents about the document (such as anchor text), measures of importance obtained by analyzing the URL graph, and other query-independent measures of relevance.
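One simple way to combine such signals, shown here purely as an assumption, is a weighted sum. The feature names, the default weights, and the linear form itself are hypothetical; the text does not say how the signals are combined.

```python
from dataclasses import dataclass


@dataclass
class DocumentFeatures:
    body_match: float       # query-dependent: match on body or metadata terms
    anchor_match: float     # query-dependent: match on anchor text in other documents
    link_importance: float  # query-independent: importance from the URL graph


def combined_score(features, weights=(0.5, 0.3, 0.2)):
    """Weighted sum of the feature values (illustrative weights only)."""
    w_body, w_anchor, w_link = weights
    return (
        w_body * features.body_match
        + w_anchor * features.anchor_match
        + w_link * features.link_importance
    )
```

The weights play the role of ranking parameters: changing them changes the order in which the same documents are returned for the same query.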
Typically, when evaluating the performance of a ranking function, a set of users is asked to make relevance judgments on the top N (e.g., 10) documents returned by the search engine with a given ranking function for a given set of evaluation queries. The document corpus and the set of queries are kept fixed so that the performance of different ranking functions may be compared side by side, with all other variables eliminated. This is typically done in a prototyping (research) environment. A set of relevance judgments may also be obtained from a live system by asking users to volunteer relevance judgments for the search results on an arbitrary set of queries. Relying on relevance judgments to measure performance allows a ranking function to be optimized by iteratively varying ranking parameters and measuring the resulting performance.
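The evaluation loop described above can be sketched as follows, assuming precision over the top N results as the performance measure. The text does not name a specific metric, and the helper names `precision_at_n`, `mean_precision`, and `best_parameter` are hypothetical.

```python
def precision_at_n(ranked_doc_ids, relevant_doc_ids, n=10):
    """Fraction of the top n results that users judged relevant."""
    hits = sum(1 for doc_id in ranked_doc_ids[:n] if doc_id in relevant_doc_ids)
    return hits / n


def mean_precision(rank_fn, judgments, n=10):
    """Average precision at n over a fixed set of evaluation queries.

    rank_fn maps a query to a ranked list of document ids.
    judgments maps each query to the set of ids judged relevant.
    """
    if not judgments:
        return 0.0
    total = sum(
        precision_at_n(rank_fn(query), relevant, n)
        for query, relevant in judgments.items()
    )
    return total / len(judgments)


def best_parameter(make_rank_fn, candidates, judgments, n=10):
    """Try each candidate parameter value and keep the best performer.

    make_rank_fn builds a ranking function from one parameter value;
    the corpus and the queries stay fixed across all candidates.
    """
    return max(
        candidates,
        key=lambda value: mean_precision(make_rank_fn(value), judgments, n),
    )
```

Because the corpus, the queries, and the judgments are held constant, any difference in the measured value can be attributed to the ranking parameter being varied.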