1. Field of the Invention
The present invention relates to the field of Web page searches and, more particularly, to improving search results using a browsing-time relevancy factor.
2. Description of the Related Art
Large knowledge stores, such as a corporate intranet or the Internet, contain such a vast amount of information that finding accurate information in a timely manner is a daunting task. A variety of search engines exist that promise to provide users with fast and, more importantly, accurate results for their queries. However, the methods used by conventional search engines to determine the ordering of search results in terms of relevance to a user's query can be externally manipulated. This manipulation decreases the capability of conventional search engines to provide accurate results, increases the time spent by users to identify relevant information, and increases a user's sense of frustration.
For example, the presence of the search terms or query criteria in the text of a Web page typically implies that the Web page is relevant to the user's query. Therefore, the more times the query criteria appear on a Web page, the more relevant the page is to the query and the higher it should appear in the results list. This is a part of the simple relevancy logic of many early search engines. Once Web page authors realized that keyword repetition in page content increased a page's ranking, the validity of this relevancy factor diminished. By repeating words “invisibly”, for example as white text on a white background, a Web page author could increase a page's ranking without providing additional relevance or information. Similarly, unrelated words can be added to a Web page so that the page is included in the results for more popular search terms.
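The term-count relevancy logic described above, and its weakness to keyword repetition, can be illustrated with a minimal sketch. It is not taken from any particular search engine. The function names `term_frequency_score` and `rank_pages`, the sample pages, and the query are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter


def term_frequency_score(page_text: str, query: str) -> int:
    """Hypothetical relevancy score: total occurrences of the query terms in the page."""
    # Lowercase and split into alphanumeric tokens so matching is case-insensitive.
    words = Counter(re.findall(r"[a-z0-9]+", page_text.lower()))
    terms = re.findall(r"[a-z0-9]+", query.lower())
    return sum(words[term] for term in terms)


def rank_pages(pages: dict[str, str], query: str) -> list[str]:
    """Order page names from highest to lowest term-frequency score."""
    return sorted(
        pages,
        key=lambda name: term_frequency_score(pages[name], query),
        reverse=True,
    )


# Illustrative data (an assumption, not from the source): one informative page
# and one page padded with a repeated keyword, as hidden text might be.
pages = {
    "honest": "A guide to hiking boots: how to choose boots that fit.",
    "stuffed": "Cheap deals. " + "boots " * 20,
}

ranking = rank_pages(pages, "hiking boots")
print(ranking)
```

Under this naive scoring, the padded page outranks the informative one even though it adds no information, which is the manipulation the paragraph describes.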
Conventional search engines have attempted to overcome the flagrant abuses of their ranking algorithms. Many popular search engines use a variety of factors and weightings to determine the relevancy of a Web page to the entered query criteria. Despite these improvements, the results provided are still skewed. For example, many search engines sell organizations the ability to increase the ranking of Web pages from a designated Web site.
Further, many of the factors used by conventional search engines to combat abuse inject additional biases. For example, the age of a Web page or Web site is often used to determine a sense of legitimacy. This factor, meant to contend with influxes of fad and fictitious Web sites, can exclude relevant content simply because its host is newer than most existing sites. Another factor uses the frequency with which a Web page is accessed, under the assumption that popular Web pages are more relevant to users than less popular ones. This popularity factor is abused by automated systems that repeatedly access a Web page with the explicit purpose of artificially inflating the popularity factor associated with the page.