One or more computer systems may store data that is useful to a user. However, the volume of data may be too large for the user to find the relevant information by direct examination. Additionally, some parts of the data repository may contain information that is not accessible to the user. In many cases, a search mechanism is provided to give the user useful access to the data. The search mechanism allows a user to issue a search request (also termed a search query), and the results are then returned to the user.
For example, a web-based search engine is a search mechanism that provides search access to information over the web. The information may reside in a specific data repository, such as a database or other data collection, or it may be an agglomeration of a number of different data repositories. Such a search engine may provide search access to information available from different information providers over a network, such as the Internet.
In a typical usage of a web search engine, the user enters a query, which is a set of search terms related to the type of content for which the user is searching. The query is transmitted to the search engine, which attempts to locate “hits”—i.e., content that is available on the Internet and that relates to the terms contained in the query. Generally, the search engine either has a database of web pages that are known to exist, or communicates with external “providers” who maintain such databases; the query is “scored” against items in these databases to identify the web pages that best match the query. A list of results is then generated, and these results are returned to the user's computer for display by the user's web browser.
Typically, the databases contain information such as the Uniform Resource Locators (URLs) of web pages, the titles of the pages, descriptions of the pages, and possibly other textual information about the web pages. The user then reads the results and attempts to determine, based on the text contained in the results, whether the results correspond to what the user is looking for. The user may then attempt to retrieve the entire page corresponding to a search result. In other contexts, search engines present results summarizing the pieces of data that may be useful to a user.
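As a minimal sketch of the scoring step described above, the following Python fragment scores a query against records holding a URL, a title, and a description. The names `PageRecord`, `score`, and `search` are hypothetical, and the term-overlap measure is an illustrative assumption rather than the scoring method of any particular search engine.

```python
from dataclasses import dataclass


@dataclass
class PageRecord:
    """One database entry for a known web page (hypothetical layout)."""
    url: str
    title: str
    description: str


def score(query: str, record: PageRecord) -> float:
    """Return the fraction of distinct query terms found in the title or description."""
    terms = set(query.lower().split())
    if not terms:
        return 0.0
    text = set((record.title + " " + record.description).lower().split())
    return len(terms & text) / len(terms)


def search(query: str, database: list[PageRecord], limit: int = 10) -> list[PageRecord]:
    """Return up to `limit` records that match the query, best score first."""
    scored = [(score(query, record), record) for record in database]
    hits = [(s, record) for s, record in scored if s > 0]
    # The sort is stable, so records with equal scores keep their database order.
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in hits[:limit]]
```

A real engine would likely use a more elaborate relevance measure; the sketch only shows how a list of results ordered by score can be produced from such records.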
The utility of the search engine is directly correlated to the quality of the results provided. In the best case, the results are presented to the user on the result page in order of their utility to the user.
Because the quality of the results is subjective, the user's satisfaction must be determined in order to determine the quality of the results. For example, a user can be allowed to use a search engine for a period of time and then be interviewed orally to determine the user's satisfaction.
In the prior art, the quality of individual web pages has been measured by obtaining explicit feedback from a user. At least one prior art web browser has attempted to obtain such explicit feedback. This browser is described in a paper entitled “Inferring User Interest” by Mark Claypool, David Brown, Phong Le, and Makoto Waseda in IEEE Internet Computing 5(6): 32-39 (2001). The browser displays different pages, and whenever the page being displayed changes, the user is asked to evaluate the page. User evaluations for a given page are collected to determine whether users find that page valuable. The browser also maintains some implicit feedback regarding each page, including data regarding the time spent on the page, mouse movements, mouse clicks, and scrolling time.
While this technique does gather user feedback, it has limited utility in situations in which users may have different needs for a page. For example, a user who is looking for information about books written by Douglas Adams may evaluate a page on Adams's book The Hitchhiker's Guide to the Galaxy and give it a high score for utility. However, another user who is looking for information on books about traveling cheaply may evaluate the same page and give it a low score. Thus, the technique described will have limited utility in the wide variety of situations in which different users may have different needs, or even where a single user may have different needs for information at different times. In other words, the usefulness of this technique is limited because the evaluation of each page is completely independent of the context in which the user arrived at the page. To ultimately improve the search process and provide more relevant data, proper diagnostics based on user behavior in context must be performed.
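The limitation can be illustrated with a short Python sketch. The records, the 1-to-5 rating scale, the URL, and the function names below are all hypothetical assumptions made for illustration; they are not taken from the cited paper. The sketch contrasts an average computed per page, which discards the context, with an average computed per query and page.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical feedback records: (query that led to the page, page URL, rating 1-5).
ratings = [
    ("books by douglas adams", "example.com/hitchhikers-guide", 5),
    ("books about traveling cheaply", "example.com/hitchhikers-guide", 1),
]


def average_by_page(records):
    """Average the ratings per page, ignoring how the user arrived at the page."""
    by_page = defaultdict(list)
    for _query, page, rating in records:
        by_page[page].append(rating)
    return {page: mean(values) for page, values in by_page.items()}


def average_by_query_and_page(records):
    """Average the ratings per (query, page) pair, keeping the search context."""
    by_context = defaultdict(list)
    for query, page, rating in records:
        by_context[(query, page)].append(rating)
    return {key: mean(values) for key, values in by_context.items()}
```

With these two records, the per-page average is a middling value that reflects neither user's experience, while the per-context averages keep the high and low ratings separate.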
Thus, this technique has limited value for evaluating the quality of a search engine and does little to improve the search engine or the quantity and quality of the underlying content being searched.
In view of the foregoing, there is a need for a system and method that overcomes the drawbacks of the prior art.