The present invention relates in general to searching a corpus of documents, and in particular to search systems and methods with integration of trusted user judgments.
The World Wide Web (Web) provides a large collection of interlinked information sources (in various formats including documents, images, and media content) relating to virtually every subject imaginable. As the Web has grown, the ability of users to search this collection and identify content relevant to a particular subject has become increasingly important, and a number of search service providers now exist to meet this need. In general, a search service provider publishes a web page at which a user can enter a query indicating what the user is interested in. In response to the query, the provider generates and transmits to the user a list of links to Web pages or sites considered relevant to that query, typically in the form of a “search results” page.
Query response generally involves the following steps. First, a pre-created index or database of Web pages or sites is searched using one or more keywords from the query to generate a list of hits (usually references to pages or sites that contain the keywords or are otherwise identified as being relevant to the query). Next, the hits are ranked according to predefined criteria, and the best results (according to these criteria) can be given the most prominent placement, e.g., at the top of the list. The ranked list of hits is transmitted to the user, usually in the form of a “results” page (or set of interconnected pages) containing a list of links to the hit pages or sites. Other features, such as sponsored links or advertisements, may also be included.
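The steps above can be sketched in a few lines of code. This is a minimal illustration only, not any provider's actual implementation: the toy corpus `PAGES`, the helper names `build_index` and `search`, and the ranking criterion (total keyword occurrences) are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical toy corpus: URL -> page text (invented for illustration).
PAGES = {
    "http://example.com/a": "jaguar cars and jaguar dealers",
    "http://example.com/b": "the jaguar is a large cat",
    "http://example.com/c": "used cars for sale",
}


def build_index(pages):
    """Pre-create an inverted index: keyword -> {url: occurrence count}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index


def search(index, query):
    """Return hit URLs ranked by total keyword occurrences.

    The occurrence count is an assumed stand-in for the "predefined
    criteria" a real provider would apply.
    """
    scores = defaultdict(int)
    for keyword in query.lower().split():
        for url, count in index.get(keyword, {}).items():
            scores[url] += count
    # Best-scoring hits first; ties are broken by URL for a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))


INDEX = build_index(PAGES)
ranked_hits = search(INDEX, "jaguar cars")
```

Here `ranked_hits` plays the role of the ranked list that would be rendered as links on a results page; a query with no matching keywords simply yields an empty list.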
Ranking of hits is an important factor in whether a user's search ends in success or frustration. Frequently, a query will return such a large number of hits that it is impossible for a user to explore all of the hits in a reasonable time. If the first few links a user clicks through fail to lead to relevant content, the user will often give up on the search and possibly on the search service provider, even though relevant content was available farther down the list.
To maximize the likelihood that relevant content will be prominently placed, search service providers have developed increasingly sophisticated page ranking criteria and algorithms. In the early days of Web search, rankings were usually based on the number of occurrences and/or the proximity of search terms on a given page. This proved inadequate, and algorithms in use today typically consider various other information, such as the number of other sites on the Web that link to a given target page (which reflects how useful other content providers think the target page is), in addition to the presence of search terms. One algorithm allows users who enter a particular query to provide feedback by rating the hits that are returned. Such ratings are stored in association with the query, and previous positive ratings are used as a factor in ranking hits the next time that query is entered.
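One way to picture how these signals might be combined is a weighted score. The sketch below is an assumption-laden illustration, not a description of any actual ranking algorithm: the names `record_rating`, `score`, and `rank`, the in-memory `positive_ratings` store, and the weights are all hypothetical.

```python
# Hypothetical store: (query, url) -> count of positive ratings from
# users who previously entered that query.
positive_ratings = {}


def record_rating(query, url, positive):
    """Store a user's rating of a hit in association with the query."""
    if positive:
        key = (query.lower(), url)
        positive_ratings[key] = positive_ratings.get(key, 0) + 1


def score(query, url, term_count, inbound_links,
          w_terms=1.0, w_links=0.5, w_ratings=2.0):
    """Combine three signals using illustrative (assumed) weights:
    search-term occurrences on the page, the number of other sites
    linking to it, and prior positive ratings for this query."""
    ratings = positive_ratings.get((query.lower(), url), 0)
    return w_terms * term_count + w_links * inbound_links + w_ratings * ratings


def rank(query, hits):
    """hits: list of (url, term_count, inbound_links) tuples.

    Returns the URLs ordered best first; ties are broken by URL.
    """
    ordered = sorted(hits, key=lambda h: (-score(query, *h), h[0]))
    return [url for url, _, _ in ordered]
```

With these assumed weights, a page that trails slightly on term count can move ahead of a competitor once an earlier user has rated it positively for the same query, which is the effect the feedback-based approach aims for.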
Even with the most sophisticated ranking algorithms, searches may still fail to return relevant content or to rank such content highly enough that the user can readily find it. In such instances, the user generally receives little guidance toward improving the results, which only adds to the user's frustration. For example, users often know what sort of information they are trying to find and may even find one or two relevant hits in a search, but they cannot readily determine how to modify the query to increase the number of relevant results.
Thus, it would be desirable to provide search services with feedback features to enhance the likelihood of returning relevant content to each user.