The Internet, including the World Wide Web, contains a vast amount of information that can be accessed through a web search engine, which returns results in response to a search query. A keyword search can instantly return thousands of web pages relevant to the search terms. However, there is room for improvement in how best to display the results, especially when the results are numerous.
To be effective, the web search engine must accurately identify content, capturing relevant web pages and discarding irrelevant web pages. However, there is often a gap between what users hope to find and the actual results returned from the web search engine. In particular, broad search queries may return too many results for a human user to effectively process. This is known as an abundance problem. The order of returned results, or the ranking, may mistakenly indicate a higher relevance for web pages that are irrelevant to the user and a lower relevance for web pages that are actually relevant to the user.
Another problem is that a keyword used as a search query may cause the web search engine to return web pages about unrelated content that shares the keyword. Consider, for example, the keyword “jaguar.” The search engine may return results related to luxury cars, a large cat, and an Atari computer system. All the results may be jumbled together in a single list, ordered not by similarity of content but by a keyword-driven search algorithm. Even though a human user can easily distinguish between a web page discussing Jaguar cars and another web page discussing endangered wild jaguars, making that distinction is a difficult task for a search engine running a keyword search algorithm.
A further difficulty is accounting for differences between the subjective interpretation of content on a web page and the keywords found on that web page. For example, a web page of Jaguar Cars Ltd. is a web page of an automobile manufacturer, but the keywords “automobile manufacturer” may be entirely absent from the Jaguar Cars Ltd. web page, and thus a search on those keywords may not return that web page as a search result. Even when a web page sought by users contains the keyword, that web page may have a low ranking in the search results. For example, a user searching for “Harvard” would expect www.harvard.edu to be returned as one of the most relevant search results. However, other web pages may use the term “Harvard” more frequently, more prominently, or in some other way that earns them a higher relevance ranking. Ultimately, the notion of relevance depends on human judgment and is difficult to capture in any search algorithm.
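The two failure modes described above can be illustrated with a minimal sketch of a purely frequency-based keyword ranker. This is not the algorithm of any particular search engine. The function names, the page URLs, and the page texts are all hypothetical and were chosen only to make the effect visible.

```python
import re


def keyword_score(text, keyword):
    """Count case-insensitive whole-word occurrences of keyword in text."""
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def rank_by_keyword(pages, keyword):
    """Return (url, score) pairs for pages containing the keyword, highest score first."""
    scored = [(url, keyword_score(text, keyword)) for url, text in pages.items()]
    matches = [pair for pair in scored if pair[1] > 0]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy pages (assumed data, not real web content).
pages = {
    "official.example/home": "Harvard University admissions and research.",
    "fan.example/blog": "Harvard tips, Harvard rumors, Harvard gossip, more Harvard.",
    "carmaker.example/home": "Luxury saloons and sports cars.",
}

# The page that repeats the keyword most often outranks the official page.
ranking = rank_by_keyword(pages, "Harvard")

# The car maker's page never uses these words, so it is not returned at all.
missing = rank_by_keyword(pages, "automobile manufacturer")
```

In this toy data, the page that merely repeats the keyword is ranked above the page a user would most likely want, and the car maker's page is absent from a search on words that describe it but do not appear on it.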
In a body of information such as the World Wide Web, hyperlinks between web pages are also available to assist users with categorizing and evaluating web pages. A web page that is hyperlinked to by many other web pages (i.e., has many incoming hyperlinks) may be thought of as an “authority.” Because many other web pages link to an “authority” web page, that web page is likely to have content relevant to the same topic as the other web pages. Conversely, a web page that has many hyperlinks going to other web pages (i.e., outgoing hyperlinks) may be thought of as a “hub.” A “hub” web page may be a list of bookmarks, a directory page, or the like. A large number of other web pages relevant to the same topic can be found by starting at a “hub” web page.
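As a rough illustration of the “authority” and “hub” notions, the sketch below counts incoming and outgoing hyperlinks in a small link graph. Raw link counts are used here only as a simple proxy, which is an assumption for illustration and not a method taken from this description. The function name `link_counts` and the example URLs are hypothetical.

```python
from collections import Counter


def link_counts(links):
    """Return (incoming, outgoing) link counts for a mapping of page -> linked pages.

    Duplicate links and self-links are ignored so that each distinct
    page-to-page hyperlink is counted once.
    """
    incoming = Counter()
    outgoing = Counter()
    for source, targets in links.items():
        unique_targets = set(targets) - {source}
        outgoing[source] = len(unique_targets)
        for target in unique_targets:
            incoming[target] += 1
    return incoming, outgoing


# Hypothetical link graph (assumed data): each key links to the pages in its list.
links = {
    "directory.example": ["cars.example", "cats.example", "atari.example"],
    "blog.example": ["cars.example"],
    "forum.example": ["cars.example", "cats.example"],
    "cars.example": [],
}

incoming, outgoing = link_counts(links)

# The page with the most incoming links is the strongest "authority" candidate;
# the page with the most outgoing links is the strongest "hub" candidate.
top_authority = max(incoming, key=incoming.get)
top_hub = max(outgoing, key=outgoing.get)
```

In this toy graph, `cars.example` receives the most incoming hyperlinks and `directory.example` has the most outgoing hyperlinks, matching the intuition of an authority page and a directory-style hub page.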
Therefore, it is desirable to find ways to generate search results that allow users to efficiently understand relationships between a large number of search results, distinguish between unrelated content that shares a keyword, and perceive “authority” and “hub” relationships among web pages in a list of search results.