The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
The ultimate goal of any search engine is returning relevant results in response to a query. Generally, upon receiving a query from a user, a search engine searches its index for documents that match the query and then employs a ranking function to order the list of matching documents before presenting the list to the user. How a search engine ranking function decides which documents are the best matches, and what order the results should be shown in, varies widely from one search engine to another. Indeed, tuning a search engine ranking function to return relevant results for every search query is a challenging task. The task is especially challenging for Internet search engines as the numbers of documents and users are large and continuously increasing over time.
As search engines rely upon ranking functions to order results presented to the user, many techniques have been proposed to improve the quality of search engine ranking functions. Recently, however, more attention has been paid to the automatic evaluation of the quality of search engine search results. Techniques for automatic evaluation enable search engines, for example, to process difficult queries in different ways, or to ask the user for further disambiguation information.
One approach for assessing search result quality is disclosed in the paper entitled “Web Projections: Learning from Contextual Subgraphs of the Web”, by J. Leskovec, S. Dumais, and E. Horvitz, which is published on the Internet at the website for the Sixteenth International World Wide Web Conference, held in Banff, Alberta, Canada, from May 8-12, 2007. Leskovec et al. use a supervised learning approach. In a supervised learning approach, a prediction is made as to the quality of search results based on training data. The training data typically consists of pairs of search queries and search results where each pair is labeled with a human-assigned relevance score. The task then, under the supervised learning approach, is to predict, accurately, the quality of search results for a valid search query after having been provided with a number of training examples.
While the supervised learning approach can provide an accurate prediction as to the quality of search results, the supervised learning approach suffers from a drawback. Namely, supervised learning requires the creation and labeling of training data. For example, a system in which a typical supervised learning approach is used might require training data comprising thousands of queries, each with a corresponding set of search results and each labeled with a human-assigned relevance score. In the context of Internet search engines, the task of maintaining training data is especially demanding because the numbers of documents and queries are large and continuously increasing over time. Therefore, there is a need to quantify, accurately, the quality of search results without having to maintain training data as required by a supervised learning approach.
Another approach for assessing search result quality is disclosed in the paper entitled “Predicting Query Performance”, by S. Cronen-Townsend, Y. Zhou, and W. B. Croft, which is published on the Internet at the website for the Twenty-Fifth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, held in Tampere, Finland, from Aug. 11-15, 2002. Cronen-Townsend et al. employ an unsupervised learning approach. An unsupervised learning approach does not suffer from the drawback of a supervised learning approach in that the unsupervised learning approach does not require the creation and labeling of training data. However, previous unsupervised learning approaches for assessing search result quality required access to the whole document collection that is searchable by the search engine. For example, the Cronen-Townsend et al. approach infers the quality of results by calculating the divergence of the language model of the top ranked documents from the language model of the whole document collection. Calculating this divergence is especially challenging in the context of Internet search engines, where the size of the whole document collection is large and continuously increasing. Therefore, there is a need for an approach to assessing the quality of search results that does not require access to the entire document collection.
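The divergence calculation described above can be illustrated with a minimal sketch. The sketch is a simplification and is not the Cronen-Townsend et al. implementation: it assumes plain unigram language models built from whitespace tokenization, uses the Kullback-Leibler divergence as the divergence measure, and assumes that the top ranked documents are drawn from the collection, so that every term in the top ranked documents also occurs in the collection. The function names are hypothetical.

```python
import math
from collections import Counter


def unigram_model(docs):
    """Build a maximum-likelihood unigram language model from raw text documents."""
    counts = Counter(tok for doc in docs for tok in doc.lower().split())
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def divergence_from_collection(top_docs, collection_docs):
    """KL divergence (in bits) of the top-document model from the collection model.

    Assumption: top_docs is a subset of collection_docs, so every term that
    appears in the top documents has non-zero probability in the collection.
    """
    top_model = unigram_model(top_docs)
    collection_model = unigram_model(collection_docs)
    return sum(
        p * math.log2(p / collection_model[term])
        for term, p in top_model.items()
    )


if __name__ == "__main__":
    collection = ["apple banana", "apple cherry", "dog cat", "dog bird"]
    focused = collection[:2]  # top results concentrated on one topic
    print(divergence_from_collection(focused, collection))
    print(divergence_from_collection(collection, collection))
```

A higher value indicates that the top ranked documents use language that differs from the collection as a whole, while a value of zero indicates no difference. Note that the sketch must read every document in the collection to build the collection model, which is the drawback identified above.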
One application of techniques for assessing search result quality is query routing in distributed search systems. Query routing consists of selecting the best search engines that are able to respond to a particular query. In query routing, a broker decides which search engine to send the query to among multiple search engines that may respond to the query. Previous approaches to query routing ranked search engines based on statistics of the terms contained in the query.
For example, one approach for query routing that uses a statistics database, known in the art as CORI, is disclosed in the paper entitled “Searching distributed collections with inference networks”, by J. P. Callan, Z. Lu, and W. B. Croft, which was presented at the Eighteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, held in Seattle, Wash., from Jul. 9-13, 1995. For a given query, CORI ranks the source search engines based on df.icf for document collections, where df represents the document frequency of a query term in a particular search engine's document collection and icf represents the inverse of the frequency of the term in all collections.
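The df.icf ranking described above can be sketched as follows. The sketch is illustrative only and does not reproduce the scoring formula of the Callan et al. paper: it assumes that icf is computed as log(1 + N / cf), where N is the number of collections and cf is the number of collections containing the term, and it assumes that the per-term products are simply summed. The function names and the input layout (a mapping from collection name to per-term document frequencies) are hypothetical.

```python
import math


def df_icf_scores(query_terms, collection_dfs):
    """Score each collection by summing df * icf over the query terms.

    collection_dfs maps a collection name to a dict of {term: document frequency}.
    Assumption: icf = log(1 + N / cf); the published CORI formula differs.
    """
    n_collections = len(collection_dfs)
    scores = {name: 0.0 for name in collection_dfs}
    for term in query_terms:
        # cf: number of collections in which the term occurs at least once
        cf = sum(1 for dfs in collection_dfs.values() if dfs.get(term, 0) > 0)
        if cf == 0:
            continue  # the term occurs nowhere, so it cannot discriminate
        icf = math.log(1 + n_collections / cf)
        for name, dfs in collection_dfs.items():
            scores[name] += dfs.get(term, 0) * icf
    return scores


def rank_collections(query_terms, collection_dfs):
    """Return collection names ordered from highest to lowest df.icf score."""
    scores = df_icf_scores(query_terms, collection_dfs)
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    stats = {
        "engine_a": {"jaguar": 10, "car": 50},
        "engine_b": {"jaguar": 2},
        "engine_c": {"car": 5},
    }
    print(rank_collections(["jaguar"], stats))
```

The ranking depends only on term statistics: two search engines with the same statistics for the query terms receive the same score regardless of how well each one orders its results.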
Another approach that uses term statistics in the document collections is disclosed in the paper entitled “Generalizing Gloss to Vector-Space Databases and Broker Hierarchies”, by L. Gravano and H. Garcia-Molina, which was presented at the Twenty-First International Conference on Very Large Data Bases, from Sep. 11-15, 1995, held in Zurich, Switzerland. Gloss uses term statistics in the collection to infer the rank of sources for a given query. Gloss requires two vectors to estimate the rank: the document frequency of the terms in each collection and the sum of the weights of each term over all documents in the collection. Based on these vectors, Gloss proposes two estimators to predict the number of documents in a document collection having a similarity with a query greater than a threshold l. One estimator, Max(l), assumes the highest level of co-occurrence of the query terms in the database documents. The second estimator, Sum(l), assumes the terms in the query do not appear together in any database document.
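The two co-occurrence assumptions described above can be sketched as follows. The sketch is one simplified reading of the description and is not the implementation of the Gravano and Garcia-Molina paper: it assumes that every document containing a term carries the average weight of that term (the sum of weights divided by the document frequency), and that similarity is the sum of query weight times document weight over the shared terms. The function names and parameter names are hypothetical.

```python
def estimate_max(query_weights, df, weight_sum, threshold):
    """Count documents above the threshold assuming maximal term co-occurrence.

    With terms sorted by ascending document frequency, the documents holding
    the rarest term are assumed to hold every other query term as well, the
    next group holds all terms but the rarest, and so on.
    """
    terms = sorted((t for t in query_weights if df.get(t, 0) > 0), key=lambda t: df[t])
    count = 0
    previous_df = 0
    for position, term in enumerate(terms):
        group_size = df[term] - previous_df
        # Documents in this group contain this term and all more frequent terms.
        similarity = sum(
            query_weights[t] * weight_sum[t] / df[t] for t in terms[position:]
        )
        if similarity > threshold:
            count += group_size
        previous_df = df[term]
    return count


def estimate_sum(query_weights, df, weight_sum, threshold):
    """Count documents above the threshold assuming query terms never co-occur."""
    count = 0
    for term, query_weight in query_weights.items():
        if df.get(term, 0) == 0:
            continue
        # Each document contains exactly one query term at its average weight.
        if query_weight * weight_sum[term] / df[term] > threshold:
            count += df[term]
    return count


if __name__ == "__main__":
    query = {"a": 1.0, "b": 1.0}
    doc_freq = {"a": 2, "b": 4}
    weights = {"a": 1.0, "b": 1.2}
    print(estimate_max(query, doc_freq, weights, 0.6))
    print(estimate_sum(query, doc_freq, weights, 0.6))
```

Both estimators consume only the two per-collection vectors, so neither one reflects how the search engine behind the collection actually ranks its results.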
Neither the CORI approach nor the Gloss approach takes into account the search result quality when selecting sources. Source search engines with similar statistics about the query terms are treated as equally important to the given query. However, as the search engines can employ different ranking functions, the quality of search results can be different. Therefore, there is a need for better source selection for search queries that takes into account the quality of the ranking functions of each search engine.
Another application of techniques for assessing search result quality is the aggregation of search results by a meta-search engine. A meta-search engine or search engine aggregator is a type of search engine that submits user search queries to other search engines and returns the results or portions thereof from each of the other search engines. Meta-search engines enable a user to enter and submit a query once and obtain results from many information sources simultaneously. Current approaches for aggregating search results from multiple search engines do not take into account the quality of the ranking function of each search engine.
Based on the foregoing, it is clearly desirable to quantify the quality of search results without requiring creation and labeling of training data and without requiring access to the whole searchable document collection. It is further desirable to route queries in a manner that takes into account the quality of search results to select source search engines. Finally, it is desirable to aggregate search results from multiple search engines in a manner that takes into account the quality of the ranking functions employed by the search engines.