This specification relates to digital information retrieval, and particularly to search processing.
The Internet enables access to a wide variety of resources, such as video or audio files, web pages for particular subjects, book articles, or news articles. A search engine can identify resources in response to a user query that includes one or more search terms or phrases. The search engine ranks the resources based on their relevance to the query and their importance, provides search results that link to the identified resources, and orders the search results according to the rank.
Many websites store the data that they make available in resources in large databases of structured information. For example, job search websites may have respective job databases, and respective resources (web pages) that include forms to search the databases. Likewise, recipe websites have respective databases for recipes, and movie websites have respective databases for movies. Requesting information for a certain recipe or movie causes the website to query its respective database and generate a web page that presents the information in a structured format.
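By way of illustration only, the request flow described above (a request arrives, the website queries its database, and a web page is generated from the matching rows) can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of any particular website: the `recipes` table, its columns, the sample rows, and the function names `build_recipe_db` and `render_results_page` are all hypothetical.

```python
import sqlite3
from html import escape


def build_recipe_db():
    """Create a small in-memory database of structured recipe data (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE recipes (name TEXT, cuisine TEXT, minutes INTEGER)")
    conn.executemany(
        "INSERT INTO recipes VALUES (?, ?, ?)",
        [
            ("Margherita Pizza", "Italian", 30),
            ("Pad Thai", "Thai", 25),
            ("Lasagna", "Italian", 90),
        ],
    )
    return conn


def render_results_page(conn, cuisine):
    """Query the database for one cuisine and generate a web page from the rows."""
    rows = conn.execute(
        "SELECT name, minutes FROM recipes WHERE cuisine = ? ORDER BY name",
        (cuisine,),
    ).fetchall()
    # Each matching row becomes one list item in the generated page.
    items = "".join(
        f"<li>{escape(name)} ({minutes} min)</li>" for name, minutes in rows
    )
    return f"<h1>{escape(cuisine)} recipes</h1><ul>{items}</ul>"


if __name__ == "__main__":
    print(render_results_page(build_recipe_db(), "Italian"))
```

In this sketch the page exists only as the output of a query, so a crawler that follows static links would not necessarily encounter it.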
Often, however, search engines do not account for particular database search capabilities when ranking resources in response to particular queries. To compensate, a website may provide particular pages for particular entries in the database so that each page can be crawled and searched by the search engine. For example, an airline flight website may have pre-generated pages for a variety of very popular and well-traveled routes (e.g., New York to San Francisco, Chicago to Los Angeles, etc.). This practice, however, tends to artificially increase recall and reduce precision. Furthermore, although the underlying search capabilities of the database may prove to be very useful in satisfying a user's informational need, many scoring algorithms do not score the search capabilities of a database when determining the relevance of a resource generated from data stored in the database. As a result, the search engine may fail to identify data that are particularly relevant to a query, and/or may fail to identify particular search capabilities that are available to the user that issued the query and that may help the user satisfy his or her informational need.