A. Field of the Invention
The present invention relates generally to information retrieval and, more particularly, to automated techniques for judging the quality of a document.
B. Description of Related Art
The World Wide Web (“web”) contains a vast amount of information. Search engines assist users in locating desired portions of this information by cataloging web pages. Typically, in response to a user's request, the search engine returns references to documents relevant to the request.
Search engines may base their determination of the user's interest on search terms (called a search query) entered by the user. The goal of the search engine is to identify links to high-quality, relevant results based on the search query. Typically, the search engine accomplishes this by matching the terms in the search query against a corpus of pre-stored web documents. Web documents that contain the user's search terms are considered “hits” and are returned to the user.
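By way of illustration only, the term-matching step described above may be sketched as follows. The sketch assumes a small in-memory corpus, whitespace tokenization, and a rule that a document is a hit only if it contains every query term; the names build_index and find_hits, and the sample documents, are hypothetical and are not taken from any particular search engine.

```python
from collections import defaultdict


def build_index(corpus):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in corpus.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def find_hits(index, query):
    """Return the ids of documents containing every term of the query.

    Requiring all terms is an assumption of this sketch; a real search
    engine may apply a looser or more elaborate matching rule.
    """
    terms = query.lower().split()
    if not terms:
        return set()
    hits = set(index.get(terms[0], set()))
    for term in terms[1:]:
        hits &= index.get(term, set())
    return hits


# Hypothetical pre-stored corpus of web documents.
corpus = {
    "doc1": "search engines catalog web pages",
    "doc2": "web pages link to other web pages",
    "doc3": "quality of a document",
}
index = build_index(corpus)
print(sorted(find_hits(index, "web pages")))  # prints ['doc1', 'doc2']
```

The hits in such a sketch are returned as an unordered set, which is why a separate measure of quality is useful for ordering them.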
It may be desirable to sort the hits returned by the search engine based on some objective measure of the quality of the hits. Determining an appropriate quality metric for a document such as a web page can be a difficult task. For one thing, the quality of a web page to the user is inherently subjective and depends on the user's interests, knowledge, and attitudes. There is, however, much that can be determined objectively about the relative quality of a web page. One technique for determining the quality of a web page looks beyond the content of the web page itself and assigns a degree of quality to the web page based on the link structure of the web.
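As a non-limiting illustration of how a quality value might be derived from link structure, the following sketch applies a simple iterative scheme in which each page divides its current score among the pages it links to. This scheme is an assumption chosen for illustration and is not necessarily the technique referred to above; the function name link_scores, the damping value of 0.85, the iteration count, and the sample link graph are all hypothetical.

```python
def link_scores(links, damping=0.85, iterations=50):
    """Assign each page a score based on the links pointing to it.

    `links` maps a page to the list of pages it links to. Each round,
    a page passes a damped share of its score to every page it links
    to; a page with no outgoing links spreads its share over all pages.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    if not pages:
        return {}
    n = len(pages)
    scores = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                share = damping * scores[page] / n
                for other in pages:
                    new_scores[other] += share
        scores = new_scores
    return scores


# Hypothetical link graph: pages "a" and "b" link to "c"; "c" links to "a".
links = {"a": ["c"], "b": ["c"], "c": ["a"]}
scores = link_scores(links)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)  # prints ['c', 'a', 'b']
```

In this sketch the page with the most incoming links receives the highest score, and the resulting order could be used to sort a set of hits independently of the text of the pages.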
The ability to automatically assign a degree of quality to documents, such as web pages, is important to the effective implementation of a number of technologies, such as search engines. Accordingly, it would be desirable to improve the quality metrics assigned to documents.