The World Wide Web has given computer users on the internet access to vast amounts of information in the form of billions of web pages. Each of these pages can be accessed directly by a user typing the URL (uniform resource locator) of the web page into a web browser on the user's computer, but often a person is more likely to access a website by finding it with a search engine. A search engine allows a user to input a search query made up of words or terms that the user thinks will be used in the web pages containing the information he or she is looking for. The search engine attempts to match web pages to the search terms in the search query and then returns the located web pages to the user.
The search results generated from a user's search query typically consist of a collection of document surrogates, each of which contains summary information, attributes, and other meta-data about the matched documents. These document surrogates are often presented in a simple list-based format, displaying the title of the document, a snippet containing the query terms in context, and the uniform resource locator (the URL). A user can then select one of the returned entries to view the corresponding web page.
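As a minimal illustrative sketch of the list-based format just described, a document surrogate can be modeled as a small record holding a title, a snippet, and a URL, and the result list as a ranked sequence of such records. The names DocumentSurrogate and render_result_list, and the sample entry, are hypothetical and are used only for illustration; they are not taken from any particular search engine.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DocumentSurrogate:
    """Hypothetical record summarizing one matched document."""
    title: str    # title of the document
    snippet: str  # short excerpt showing the query terms in context
    url: str      # uniform resource locator of the document


def render_result_list(surrogates: List[DocumentSurrogate]) -> str:
    """Render surrogates as a simple ranked list, one entry per document."""
    lines = []
    for rank, surrogate in enumerate(surrogates, start=1):
        lines.append(f"{rank}. {surrogate.title}")
        lines.append(f"   {surrogate.snippet}")
        lines.append(f"   {surrogate.url}")
    return "\n".join(lines)


# Illustrative data only; not real search results.
results = [
    DocumentSurrogate(
        title="Example Page",
        snippet="... an example snippet containing the query terms ...",
        url="http://example.com/",
    ),
]
print(render_result_list(results))
```

Each entry in such a list is read on its own, which is why the format suits judging one document at a time but offers nothing for comparing or grouping entries.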
With the continued growth of web pages available on the internet making the task of search engines more and more difficult, web search engines have greatly increased the size of their indexes and made significant advances in the algorithms used to match a user's search query to these indexes. However, while it is clear that significant effort has gone into creating web search engines that can index billions of documents and return the search results in a fraction of a second, this has created the problem of search queries returning more results than the user can easily consider.
While many relevant documents might be present in the search results returned from a search engine, the returned search results often consist of tens or hundreds of individual documents, making it hard for a user to determine which of the search results are relevant to the information the user is looking for.
While information retrieval techniques used by web search engines have improved substantially over the years, the search results are still typically represented in a simple list-based format. Although this list-based representation makes it easy to evaluate a single document, it does not support the users in the broader tasks of manipulating the search results, comparing documents, or finding a set of relevant documents. Even though this simple list-based representation provides the search results in a clear and effective manner for determining the relevance of individual document surrogates, it requires that each document surrogate be evaluated in turn, and to some degree, in the order provided. If hundreds of documents are returned, it is inefficient if not completely impractical to have a user review all of these results to determine the most relevant documents located in the search. Requiring users to evaluate each document surrogate individually, often with only ten documents per page, leads to a common user search trait of evaluating only a few pages of search results before either re-formulating their query or giving up.
One solution that can be used to address these numerous search results is for the user to reformulate his or her search query to narrow the search, so that fewer documents match the search query. However, studies have shown that users seldom reformulate their queries, even when a poor set of search results is provided. In many cases there may be high-quality relevant documents buried in the search results set that were missed because the users did not look at enough search result pages.