As content such as documents, images, videos, and sound files proliferates on the Internet, users have begun to rely more heavily on Internet search engines to locate and view content in which they are interested. One example of a search engine is a computer program designed to find documents stored in a computer system, such as the World Wide Web. The search engine's tasks typically include finding documents, analyzing documents, and building an index that supports efficient document retrieval.
A user describes the documents she is seeking with a query. In a common case, a query is a set of words that should appear in the documents. Internet search engines offer the capability to search for links to content on the Internet that is deemed relevant to a search query, such as web pages and multimedia, among other categories. In response to a query, the web site performing the search may display content extracted from other web sites in addition to links to content.
A query log is a collection of user-submitted queries gathered over a period of time. The collection may be supplemented by additional data, such as cookies, search results, click-through data, and other data. Each document returned by the search engine in response to the user's query is a result. A search results page is a web page that displays the result documents' web addresses along with titles, summaries, thumbnail images, and/or other information. A document's rank for a given query is the position in which the document appears on the search results page. A higher rank indicates that the search engine evaluated the document as more relevant to the user's query than lower-ranked documents.
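The terms defined above can be pictured as simple data structures. The following is a minimal sketch for illustration only: the class names, the field names (such as `cookie_id` and `clicked_urls`), the `rank_of` helper, and the choice of a 1-based rank are all assumptions and are not taken from any particular search engine.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Result:
    """One document returned for a query (hypothetical fields)."""
    url: str
    title: str
    summary: str


@dataclass
class QueryLogEntry:
    """One user-submitted query plus optional supplementary data."""
    query: str
    submitted_at: datetime
    cookie_id: Optional[str] = None                          # assumed field
    results: List[Result] = field(default_factory=list)      # in display order
    clicked_urls: List[str] = field(default_factory=list)    # click-through data


def rank_of(url: str, results: List[Result]) -> Optional[int]:
    """Return the 1-based position of `url` on the results page, or None."""
    for position, result in enumerate(results, start=1):
        if result.url == url:
            return position
    return None


# Hypothetical sample entry; the URLs are placeholders.
entry = QueryLogEntry(
    query="Olympics",
    submitted_at=datetime(2006, 11, 1, 12, 0),
    results=[
        Result("example.com/a", "Page A", "First result"),
        Result("example.com/b", "Page B", "Second result"),
    ],
    clicked_urls=["example.com/b"],
)
```

Under these assumptions, `rank_of("example.com/b", entry.results)` gives the position of the clicked document on the results page, and a URL that was not returned has no rank.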
One problem faced by search engines and their users is that certain queries have an unstated, inherent context that influences which set of results is best; for example, the best results for particular queries may have a date component that is not present in the query. As an example, in November 2006, a user searches for the query “Olympics.” The best and most relevant results depend on which year's Olympics the user is looking for. Is the user looking for the 2004 Summer Olympics in Athens, the 2006 Winter Olympics in Turin, the 2008 Summer Olympics in China, or some other Olympics? Such queries may be considered “time-sensitive” queries. Often, time-sensitive queries state a date explicitly, but that is not always the case; some carry only an implicit time component. In the above example, “Olympics” is a time-sensitive query whose date is implicit.
The search engine faces two problems when attempting to deal with queries with implicit context, such as a time-sensitive query. First, the search engine needs to identify which queries have this context, such as an implicit date component. Second, the search engine must identify which documents best relate to the query and respond to the query's implicit context.
Current approaches to identifying and responding to these particular queries take the text of a user's query, match that text to a property of a document indexed by the search engine, and then rank the results. The ranking may be based upon various criteria, such as the number of times the search term appears in the document or how many other web pages link to the document, or the results may be ranked by the date of creation or modification of each web page.
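The conventional ranking criteria described above can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen only for illustration: the `Document` fields, the sample documents and their link counts and dates, and the function names do not come from any particular search engine.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Document:
    # All fields are illustrative assumptions, not a real index schema.
    url: str
    text: str
    inlinks: int      # number of other pages linking to this document
    modified: date    # date of creation or last modification


def term_count(query: str, doc: Document) -> int:
    """Count occurrences of the query words in the document text."""
    words = doc.text.lower().split()
    return sum(words.count(term) for term in query.lower().split())


def matching(query: str, docs: List[Document]) -> List[Document]:
    """Keep documents that contain every word of the query."""
    terms = query.lower().split()
    return [d for d in docs if all(t in d.text.lower().split() for t in terms)]


def rank_by_term_count(query: str, docs: List[Document]) -> List[Document]:
    return sorted(matching(query, docs),
                  key=lambda d: term_count(query, d), reverse=True)


def rank_by_inlinks(query: str, docs: List[Document]) -> List[Document]:
    return sorted(matching(query, docs), key=lambda d: d.inlinks, reverse=True)


def rank_by_date(query: str, docs: List[Document]) -> List[Document]:
    return sorted(matching(query, docs), key=lambda d: d.modified, reverse=True)


# Hypothetical sample index; URLs, counts, and dates are made up.
DOCS = [
    Document("example.com/athens-2004",
             "olympics athens 2004 summer olympics results",
             inlinks=120, modified=date(2004, 8, 30)),
    Document("example.com/turin-2006",
             "olympics turin 2006 winter games",
             inlinks=80, modified=date(2006, 2, 27)),
    Document("example.com/beijing-2008",
             "beijing 2008 olympics preview",
             inlinks=40, modified=date(2006, 10, 15)),
]

top_by_count = rank_by_term_count("Olympics", DOCS)[0].url
top_by_links = rank_by_inlinks("Olympics", DOCS)[0].url
top_by_date = rank_by_date("Olympics", DOCS)[0].url
```

With this sample data, the term-count and link-count orderings both place the Athens page first, while the date ordering places the most recently modified page first. None of the three functions takes the date on which the query was submitted as an input, so each returns the same ordering no matter when the user searches.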
These approaches are inadequate for several reasons. The approaches do not offer a technique for identifying particular queries that may have an implicit context; for example, a query without a date where a date may be implied. Further, the approaches for ranking documents, such as sorting by date of creation or modification, do not specifically address the situation where the most relevant document may not be the most recently added or modified document.
Therefore, an approach for identifying searches with an implicit context and returning and ranking results in response to such queries, which does not experience the disadvantages of the above approaches, is desirable. The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.