Due to advances in computer and networking technology, the amount and variety of information that can be accessed through a computer continues to increase at an astounding rate. The Internet, in particular, has enabled computer users to access a wide variety of information from computers located all over the world.
As the amount of information that can be accessed by a computer user increases, however, it becomes more and more difficult to sift through the available information to locate useful information. To address this concern, a considerable amount of effort has been directed towards improvements to search engines, which are generally computer programs that are used to access databases of information in response to queries submitted by users.
While search engines are commonly used to access a wide variety of databases, a predominant application of search engines is in accessing information from the Internet. For example, a search engine is often used to access directory services, which are similar in many respects to printed telephone directories, to identify documents that contain information about particular topics. With directory services, documents are typically classified by topic, and the addresses of those documents, along with basic summaries thereof, are stored in records that are searchable by the search engine.
Search engines are also often used to access indexing services that attempt to catalog as many documents as possible from the Internet. Indexing services typically construct databases of document records by “crawling” from document to document on the Internet, reading each document and cataloging important terms and words therefrom, and following the links provided in each document to locate additional documents.
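The crawling process described above can be sketched as follows. This is a minimal illustration only, assuming a small in-memory collection of documents in place of the Internet; the names `PAGES` and `crawl` and the sample documents are hypothetical and do not correspond to any particular indexing service.

```python
from collections import deque
import re

# Hypothetical in-memory "web": address -> (document text, outgoing links).
PAGES = {
    "a": ("search engines rank documents", ["b", "c"]),
    "b": ("documents contain links", ["c"]),
    "c": ("links connect documents", ["a"]),
    "d": ("unreachable page", []),
}

def crawl(start, pages):
    """Breadth-first crawl from `start`; returns a map of term -> addresses."""
    index = {}
    seen = {start}
    queue = deque([start])
    while queue:
        address = queue.popleft()
        text, links = pages[address]
        # Catalog every word found in the document under its address.
        for term in re.findall(r"\w+", text.lower()):
            index.setdefault(term, set()).add(address)
        # Follow the links in the document to locate additional documents.
        for link in links:
            if link in pages and link not in seen:
                seen.add(link)
                queue.append(link)
    return index

index = crawl("a", PAGES)
```

In this sketch, document "d" is never cataloged because no crawled document links to it, which reflects the dependence of a crawler on the links it encounters.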
With the amount of information available on the Internet increasing at an exponential rate, search engines continue to locate a greater number of documents matching a particular search request. As the number of located documents increases, the order in which those documents are presented to a user, also referred to as the “ranking” of the documents, becomes more important, as a user will typically look at the documents identified at the top of a list of search results before looking at documents identified later in the results.
Early search engines typically relied on generally rudimentary retrieval algorithms that ranked the results of queries based upon factors such as the number of search terms that were found in each document, the number of occurrences of each search term in each document, the proximity of search terms in each document, and/or the location of search terms in each document (e.g., giving greater weight to search terms located at the top, or in a title or heading, of a document). However, it has been found that ranking results purely by the placement and frequency of search terms often leads to poor rankings. As one example, some conventional search engines can be manipulated by document authors through a process known as “spamming”, where search terms are inserted into non-visible portions of documents for no other purpose than to increase the relative rankings of the documents given by search engines.
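By way of illustration only, a term-based ranking of the kind described above might be sketched as follows. The function `term_score`, the weight `title_weight`, and the sample documents are assumptions made for this example; the proximity factor is omitted for brevity.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())

def term_score(query, title, body, title_weight=3.0):
    """Score a document by matched terms, term frequency, and title placement."""
    title_tokens = tokenize(title)
    body_tokens = tokenize(body)
    score = 0.0
    for term in set(tokenize(query)):
        in_title = title_tokens.count(term)
        in_body = body_tokens.count(term)
        if in_title or in_body:
            score += 1.0  # one point per distinct search term found
        # Occurrences in the title carry greater weight than those in the body.
        score += in_body + title_weight * in_title
    return score

honest = term_score("cheap flights", "Cheap flights guide",
                    "Find cheap flights to Rome.")
# A "spammed" document that merely repeats the search terms.
stuffed = term_score("cheap flights", "Deals",
                     "cheap " * 50 + "flights " * 50)
```

Under these assumptions the repeated-term document receives the higher score, which is the manipulation that purely placement-based and frequency-based ranking permits.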
To address such concerns, some conventional search engines rely on additional information to rank results. For example, the search engines for some indexing services weight documents more heavily based upon whether the documents are also listed in associated directory services. Other search engines use “link popularity” to rank results, granting higher rankings to documents that are linked to by other documents.
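A simple form of link popularity can be sketched by counting, for each document, the number of other documents that link to it. The sketch below is an assumption-laden illustration rather than the method of any particular search engine; the names `link_popularity`, `rank_by_links`, and `LINKS` are hypothetical.

```python
from collections import Counter

def link_popularity(links):
    """Count, for each document, how many other documents link to it."""
    counts = Counter()
    for source, targets in links.items():
        for target in set(targets):  # count each linking document once
            if target != source:     # ignore links from a document to itself
                counts[target] += 1
    return counts

def rank_by_links(matches, links):
    """Order matching documents by inbound link count, highest first."""
    counts = link_popularity(links)
    return sorted(matches, key=lambda doc: counts[doc], reverse=True)

# Hypothetical link structure: document -> documents it links to.
LINKS = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c", "d"]}
ranked = rank_by_links(["a", "b", "c"], LINKS)
```

Here document "c" is ranked first because three other documents link to it, while "a" and "b" each have a single inbound link and retain their original relative order.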
An additional type of information that may be used in ranking search results is based upon user interaction with documents. For example, it is possible with some search engines to monitor the amount of time that a user spends viewing particular documents identified in a set of search results, and to increase the ranks of documents that have been viewed for longer times, based upon the premise that a user will spend more time viewing a more relevant document than viewing a less relevant document. However, the duration that a user spends viewing a document can also be dependent upon factors other than relevancy, e.g., if a document is large and the user has to spend a relatively long amount of time to determine that the document is not relevant. As a consequence, the duration that a user spends viewing a document may have only marginal applicability to the relevance of a document in certain instances.
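The viewing-time approach, and its sensitivity to document size, can be illustrated with the following sketch. The function `rank_by_viewing_time`, the log format, and the sample values are hypothetical and are assumed solely for purposes of illustration.

```python
from collections import defaultdict

def rank_by_viewing_time(view_log):
    """Order documents by mean viewing time in seconds, longest first."""
    totals = defaultdict(float)
    visits = defaultdict(int)
    for doc, seconds in view_log:
        totals[doc] += seconds
        visits[doc] += 1
    mean = {doc: totals[doc] / visits[doc] for doc in totals}
    return sorted(mean, key=mean.get, reverse=True)

# Hypothetical log of (document, seconds viewed) pairs.
VIEW_LOG = [
    ("short_relevant", 40.0), ("short_relevant", 50.0),
    ("long_irrelevant", 90.0), ("long_irrelevant", 70.0),
    ("short_irrelevant", 5.0),
]
ranked = rank_by_viewing_time(VIEW_LOG)
```

In this example the large but irrelevant document is ranked above the relevant one solely because users needed more time to determine that it was not relevant.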
While the above-described enhancements to conventional search engines have been successful to an extent in providing users with more relevant search results, a significant need continues to exist for further improvements in the manner in which search results are ordered and returned to users. In particular, it is believed that additional gains in the relevancy and usability of the results returned by search engines may be obtained through further reliance on the interaction of users with particular documents in the ordering of search results.