A. Field of the Invention
Implementations consistent with the principles of the invention relate generally to information retrieval and, more particularly, to improving results of search engines.
B. Description of Related Art
Search engines assist users in locating desired portions of information from a document corpus. A general web search engine, for instance, catalogs web pages and, in response to a user's request, returns either a direct answer or a set of references to documents relevant to the request. A search engine may also be a more specialized search engine, such as a local search engine, which, given a search request and a geographic location, returns results, such as business listings, that are relevant to the search request and that are located near the geographic location.
Search engines may base their determination of relevance on search terms (called a search query) entered by the user. The goal of the search engine is to identify high-quality, relevant results based on the search query. Typically, the search engine accomplishes this by matching the terms in the search query to a corpus of pre-stored documents. Documents that contain the user's search terms are considered “hits” and are returned to the user. The set of hits is typically very large and needs to be prioritized or ranked before being returned to the user.
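By way of illustration only, the term matching and ranking described above may be sketched as follows. The function names, the sample corpus, the requirement that a hit contain every query term, and the use of raw term counts as the ranking score are all assumptions made for this sketch and are not taken from any particular search engine.

```python
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def search(query, corpus):
    """Return (doc_id, score) pairs for documents containing every query term.

    Assumption: a document is a "hit" only if it contains all query terms,
    and its score is the total number of occurrences of those terms.
    """
    terms = tokenize(query)
    hits = []
    for doc_id, text in corpus.items():
        counts = Counter(tokenize(text))
        if terms and all(counts[t] > 0 for t in terms):
            score = sum(counts[t] for t in terms)
            hits.append((doc_id, score))
    # Rank the hits: highest score first, ties broken by document id.
    hits.sort(key=lambda hit: (-hit[1], hit[0]))
    return hits


# Hypothetical pre-stored corpus, keyed by document id.
corpus = {
    "d1": "pizza restaurant downtown pizza delivery",
    "d2": "pizza recipes and baking tips",
    "d3": "hardware store downtown",
}

results = search("pizza", corpus)
```

In this sketch, the query "pizza" matches documents d1 and d2, and d1 is ranked first because the term occurs in it twice. A production search engine would typically use an inverted index and a more elaborate scoring function rather than scanning every document.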
The hits returned by the search engine are typically sorted based on relevance to the user's search terms. Determining the correct relevance, or importance, of a document to a user, however, can be a difficult task. For one thing, the relevance of a document to the user is inherently subjective and depends on the user's interests, knowledge, and attitudes. There is, however, much that can be determined objectively about the relative importance or quality of a document. One existing technique of determining relevance is based on matching a user's search terms to terms indexed from the documents. Other existing techniques attempt to objectively measure the quality of a document based on more than the content of the document itself. For example, in the context of a linked set of documents, one prior technique for measuring quality assigns a degree of importance to a document based on the link structure of the set of documents.
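As a rough illustration of how importance might be derived from link structure, the following sketch repeatedly distributes each document's score across the documents it links to. The passage above does not name the prior technique, so the damped, PageRank-style iteration shown here is a stand-in chosen for illustration; the function name, the damping value, the iteration count, and the sample link graph are all assumptions.

```python
def link_importance(links, damping=0.85, iterations=50):
    """Estimate document importance from link structure by power iteration.

    `links` maps each document id to the list of document ids it links to.
    Assumption: a document with no outgoing links spreads its score evenly
    over all documents, so the scores always sum to 1.
    """
    docs = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(docs)
    score = {d: 1.0 / n for d in docs}
    for _ in range(iterations):
        # Every document receives a small baseline share.
        new_score = {d: (1.0 - damping) / n for d in docs}
        for d in docs:
            targets = links.get(d, [])
            if targets:
                share = damping * score[d] / len(targets)
                for t in targets:
                    new_score[t] += share
            else:
                share = damping * score[d] / n
                for t in docs:
                    new_score[t] += share
        score = new_score
    return score


# Hypothetical link graph: a and b both link to c, and c links back to a.
links = {"a": ["c"], "b": ["c"], "c": ["a"]}
scores = link_importance(links)
```

In this sketch, document c receives the highest score because two documents link to it, document a follows because it is linked from the highly scored c, and document b, which no document links to, receives only the baseline share.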
The quality of a search engine may be assessed by humans rating the relevance of the top documents returned by the search engine in response to a query. For a search engine, returning the most relevant documents to the user is of paramount importance. Thus, any improvement to the ability of a search engine to return relevant results is desirable.