The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
Users typically search for web content by using a search engine such as Yahoo or Bing. A search engine is typically configured to continuously browse and index web pages and other web resources that are available online, and to provide an interface through which a user can search the indexed information by entering keywords and other search terms or phrases. In general, a search engine performs three tasks: finding web pages (commonly referred to as “crawling”), building a search index that supports efficient querying of the content of the crawled web pages, and using the search index to find and return links to web pages that include the keywords and search terms entered by the user.
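The indexing and querying tasks described above can be illustrated with a minimal sketch of an inverted index. The sketch is only an illustration under stated assumptions and does not describe how any particular search engine is implemented: the function names (tokenize, build_index, search), the example.com URLs, and the page texts are all hypothetical, and the crawling task is replaced by a hard-coded dictionary of already-fetched pages.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build an inverted index: each term maps to the set of URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in tokenize(text):
            index[term].add(url)
    return index


def search(index, query):
    """Return the sorted URLs of pages that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return []
    # Look up the set of pages for each term, then keep only pages in all sets.
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))


# Hypothetical stand-in for the output of a crawler (assumed data).
pages = {
    "http://example.com/hotels": "Inexpensive hotels in New York City",
    "http://example.com/food": "Inexpensive restaurants in New York City",
    "http://example.com/paris": "Luxury hotels in Paris",
}
index = build_index(pages)
```

With this sample data, search(index, "inexpensive hotels") returns only the hotels page, because it is the only page whose text contains both terms.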
Web users are becoming more sophisticated in the way they search for information. For example, when searching for very specific information, users often enter search queries that include numerous keywords and even entire sentences or paragraphs. In response to such a “long-tail query”, search engines typically return links to thousands of web pages, and it is up to the user to click through a large number of those links until the user finds the very specific information she is looking for. The main reason for this less-than-optimal response to a long-tail query is that web pages that include all of the terms in the long-tail query often do not exist online and thus are not crawled and indexed by the search engine. Moreover, even when a web page does contain all of the terms of the long-tail query, the content of that page may not actually be about all of those terms.
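The difficulty with long-tail queries described above can be illustrated with a small sketch. It assumes a hypothetical two-page collection and a hypothetical query; the function names (match_all, coverage) are likewise illustrative. Neither page contains every query term, so a match that requires all terms returns nothing, while each page covers only part of the query.

```python
import re


def tokenize(text):
    """Return the set of lowercase alphanumeric terms in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def match_all(pages, query):
    """Return URLs of pages whose text contains every term of the query."""
    terms = tokenize(query)
    return [url for url, text in pages.items() if terms <= tokenize(text)]


def coverage(pages, query):
    """Return, for each page, the fraction of query terms the page contains."""
    terms = tokenize(query)
    if not terms:
        return {}
    return {
        url: len(terms & tokenize(text)) / len(terms)
        for url, text in pages.items()
    }


# Hypothetical pages and query (assumed data, for illustration only).
pages = {
    "http://example.com/a": "inexpensive hotels in New York City",
    "http://example.com/b": "pet friendly hotels near Central Park",
}
query = "inexpensive pet friendly hotels near Central Park in New York City"

full_matches = match_all(pages, query)   # no page contains all 11 terms
partial = coverage(pages, query)         # each page covers 6 of the 11 terms
```

In this sample, the user would have to read both pages and combine their contents to answer the query, which mirrors the manual effort described above.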
One approach to providing web content responsive to long-tail queries is to use social networks or groups of part-time human contributors who manually create web content for specific topics. For example, a human contributor may be tasked with writing a web article specifically about inexpensive hotels that are available in the various neighborhoods of New York City. After the human contributor writes the web article, the web article would typically be posted on a website so that it would be crawled and indexed by a search engine.
This human-based approach to providing web content responsive to long-tail queries has many disadvantages, chief among them cost. Another disadvantage is that the approach is time-consuming, because it may take the human contributor hours or even days to collect the relevant information and to write a web article. Another disadvantage is that the approach cannot possibly produce web content in real time for the thousands of niche topics that may spring up daily, and even hourly, from the vast quantities of news, events, and other information that are constantly published online. Another disadvantage is that a web article created by a human contributor would often not be complete or entirely accurate, because the human contributor would not be able to collect, within a reasonable time frame, enough information that is accurate and completely responsive to the topic addressed in the web article. Yet another disadvantage is that it is difficult or impossible for human editors to refresh or update millions of web pages every time new information becomes available.