Search engines are now commonplace in many software applications. Search engines may be used for searching for text strings in applications such as word processors, for searching for help in sophisticated software as varied as spreadsheets and operating systems, and for searching for URLs, references and other documents in web-based search engines. The effectiveness of any one search may be judged by whether the top few returned documents are the documents actually sought by the user. The returned list should preferably be sorted by relevance to the user in the context of the search terms that were used. This ordering of documents makes it easier for a user to select the document that the user believes has the greatest relevance to the search.
A search engine may be used to generate a list of documents that are related to the search terms. Since databases of documents can be large, and since any one search engine may have access to multiple document databases, the volume of documents retrieved by a search may also be large. Ranking the documents according to some relevance criterion is one way to assist the user in finding the preferred document.
A great number of information retrieval systems, such as search engines, use a probabilistic ranking algorithm, such as the well-known OKAPI algorithm, for ranking a retrieved set of documents resulting from a search. OKAPI is the name given to a family of retrieval systems that have been developed over the past few decades. The OKAPI-type systems are based on the Robertson-Sparck Jones probabilistic model of searching. OKAPI originated as an on-line library catalogue system and has since been used as the basis for services in various contexts.
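By way of illustration only, the following is a minimal sketch of a ranking function of the OKAPI type, in the form commonly known as BM25. It is not taken from any particular OKAPI system. The function names `bm25_scores` and `rank`, the parameter defaults `k1=1.2` and `b=0.75`, the pre-tokenized document representation, and the smoothed form of the term weight (with 1 added inside the logarithm so that weights stay non-negative) are all assumptions made for the example.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms`.

    Hypothetical helper for illustration; assumes `docs` is a non-empty
    list of non-empty token lists.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Term weight in the Robertson-Sparck Jones style, without
            # relevance information. The "1 +" smoothing is an assumption.
            weight = math.log(
                1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5)
            )
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += weight * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def rank(query_terms, docs):
    """Return document indices ordered from most to least relevant."""
    scores = bm25_scores(query_terms, docs)
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


docs = [
    ["search", "engine", "ranking"],
    ["cooking", "recipes"],
    ["search", "search", "engine"],
]
order = rank(["search"], docs)
```

In this toy collection, the third document mentions the query term twice in the same number of tokens as the first, so it is ranked ahead of it, and the document that does not contain the term at all receives a score of zero and is ranked last.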
It is desirable for a search engine to be able to rank documents in a way that allows the user to easily find the most relevant document with respect to the search. Otherwise, the user may be overwhelmed by the amount of unsorted information presented. Thus, there is a need for a method of ranking retrieved documents that improves the accuracy with which a search tool pinpoints the most relevant documents of interest in a set of retrieved documents.