Through the Internet and other networks, users have gained access to large amounts of information distributed over a large number of computers. To access this information, users typically use a browser to reach a search engine. The search engine responds to a user query by returning one or more sources of information available over the Internet or other network.
The search engine typically performs two functions: (1) finding matching documents and (2) scoring the matching documents to determine a display order. Search engines typically order or rank the results based on the similarity between the terms found in the accessed information sources and the terms input by the user. Results that contain the same words in the same order as the request input by the user are given a high rank and are placed near the top of the list presented to the user.
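The two functions described above can be illustrated with a minimal sketch. It is not the method of any particular search engine: the function names, the whitespace tokenizer, and the scoring formula (fraction of query terms present, plus a bonus when the query words appear in the same order) are all assumptions chosen for illustration.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def contains_phrase(query_terms, doc_terms):
    """True if the query terms occur contiguously, in order, in the document."""
    n = len(query_terms)
    return any(
        doc_terms[i:i + n] == query_terms
        for i in range(len(doc_terms) - n + 1)
    )


def score(query_terms, doc_terms):
    """Hypothetical score: term overlap in [0, 1] plus 1.0 for an exact phrase."""
    unique = set(query_terms)
    overlap = sum(1 for t in unique if t in doc_terms) / len(unique)
    bonus = 1.0 if contains_phrase(query_terms, doc_terms) else 0.0
    return overlap + bonus


def search(query, documents):
    """Step 1: keep documents sharing a term. Step 2: order them by score."""
    query_terms = tokenize(query)
    if not query_terms:
        return []
    scored = []
    for doc_id, text in documents.items():
        doc_terms = tokenize(text)
        if any(t in doc_terms for t in query_terms):  # (1) matching
            scored.append((score(query_terms, doc_terms), doc_id))  # (2) scoring
    # Highest score first; ties broken by document id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]
```

In this sketch a document containing the query words in the same order outranks one that contains the same words in a different order, and a document sharing no terms is never returned.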
Scoring performed by different search engines takes into account various factors, including whether a match was found in the title, the importance of the match, the importance of a phrase match, and other factors determined by the search engine. Parameters that work well for one kind of search may not work well for all searches, and parameters that work for some users may not work well for others.
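The sensitivity of ranking to scoring parameters can be seen in a small sketch. The `ScoringParams` fields, the linear weighting, and the choice to test phrase matches against the body only are hypothetical simplifications, not parameters taken from any real search engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoringParams:
    """Hypothetical tunable weights for three of the factors named above."""
    title_weight: float
    body_weight: float
    phrase_weight: float


def has_phrase(terms, words):
    """True if `terms` occurs as a contiguous run inside `words`."""
    n = len(terms)
    return n > 0 and any(
        words[i:i + n] == terms for i in range(len(words) - n + 1)
    )


def score_document(query, title, body, params):
    """Weighted sum of title hits, body hits, and a body phrase match."""
    terms = query.lower().split()
    title_words = title.lower().split()
    body_words = body.lower().split()
    title_hits = sum(1 for t in terms if t in title_words)
    body_hits = sum(1 for t in terms if t in body_words)
    phrase_hit = 1 if has_phrase(terms, body_words) else 0
    return (
        params.title_weight * title_hits
        + params.body_weight * body_hits
        + params.phrase_weight * phrase_hit
    )


def rank(query, docs, params):
    """Order document names by descending score; docs maps name -> (title, body)."""
    return sorted(
        docs, key=lambda name: -score_document(query, *docs[name], params)
    )
```

With one document that matches the query only in its title and another that matches it only as a phrase in its body, a title-heavy parameter set and a phrase-heavy parameter set rank the two documents in opposite orders, which is the effect the paragraph above describes.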
A problem with most currently known techniques is the failure to consider user input in determining the relevance of search results. Users are unable to ensure that the results will be output in an appropriate order of relevance. Optimizing search result ranking is difficult for many reasons, one of which is the inherent difficulty of accurately and cost-effectively generating testing and training data with which to measure and train new ranking algorithms. The user base of searchers will generally be the best source of large quantities of testing and training data. However, requests to end users to provide more testing and training data have met with limited success.
The limited success stems from the fact that providing feedback is often cumbersome and time-consuming for users. Furthermore, pre-configured feedback formats are often inadequate. For example, a user can report to a search system that for the query “foo”, search result number 3 is not a relevant result, but more context is often needed to make this feedback useful. In particular, the search system may need to determine how the evaluated result compares with the remaining results presented. The ranking components of the search system may also want to ascertain whether the results were ordered correctly and how the ordering could be improved. Finally, the search system needs to know whether a useful result failed to appear or whether any of the results produced were useless.
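The additional context described above can be pictured as a feedback record that carries more than a single "result 3 is not relevant" flag. The sketch below is purely illustrative: the `SearchFeedback` class, its field names, and the `summarize` helper are hypothetical and do not describe an existing feedback format.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class SearchFeedback:
    """Hypothetical record of one user's feedback on one result page."""
    query: str
    shown_results: List[str]  # results in the order they were displayed
    irrelevant: Set[str] = field(default_factory=set)  # results judged useless
    preferred_order: Optional[List[str]] = None  # ordering the user would prefer
    missing_results: List[str] = field(default_factory=list)  # useful but absent


def summarize(feedback):
    """Answer the questions a ranking component might ask of the feedback."""
    return {
        # 1-based display positions of the results the user judged useless.
        "useless_positions": [
            i + 1
            for i, result in enumerate(feedback.shown_results)
            if result in feedback.irrelevant
        ],
        # The ordering counts as correct if no alternative was proposed
        # or the proposed order equals the displayed order.
        "ordering_correct": (
            feedback.preferred_order is None
            or feedback.preferred_order == feedback.shown_results
        ),
        # Useful results that failed to appear at all.
        "missing_results": list(feedback.missing_results),
    }
```

Because the record keeps the full displayed list, the judged result can be compared with the remaining results rather than being evaluated in isolation.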
One currently available feedback system involves a highly controlled testing environment in which paid search result judges create ideal result sets including 10 to 100 search results for a single query. The paid search result judges group the results into relevance categories. The relevance categories may be identified by labels such as “perfect”, “excellent”, “good”, “poor”, etc. However, the tools the judges use for this data collection are not user-friendly, and the task is too laborious to expect search system users to provide this data without compensation.
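A judged result set of this kind can be represented with a small data structure. The sketch below uses only the four labels named above and assumes they are ordered from "poor" to "perfect"; the numeric values, the `Relevance` enum, and the helper functions are illustrative assumptions, and a real judging tool may use more categories.

```python
from enum import IntEnum


class Relevance(IntEnum):
    """The four example labels, with assumed numeric ordering."""
    POOR = 0
    GOOD = 1
    EXCELLENT = 2
    PERFECT = 3


def group_by_relevance(judgments):
    """Group judged results into relevance categories, best category first."""
    groups = {label: [] for label in sorted(Relevance, reverse=True)}
    for result, label in judgments.items():
        groups[label].append(result)
    return groups


def ideal_order(judgments):
    """Order results from most to least relevant.

    sorted() is stable, so results with the same label keep the order
    in which the judge entered them.
    """
    return sorted(judgments, key=lambda result: -judgments[result])
```

Here `judgments` is a dictionary mapping each result identifier to the label a judge assigned to it for a single query.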
User satisfaction is a critical success factor for a search engine. Accordingly, a solution is needed that fully considers user input regarding search engine performance and results. A user-friendly solution to enable end-users to easily create ideal result sets would provide a significant advantage.