In a typical web search, a computer user submits a search query to a search engine by inputting a string of text that describes a concept of interest using keywords. In response to the search query, the search engine parses the text, identifies relevant web pages, and presents the user with a search engine results page in the form of a listing of hyperlinks to relevant web pages and a short description of content that matches the keywords.
Algorithms such as page ranking have been employed in web search to determine the importance or relevance of a web page and to present the most relevant hyperlinks in a search results page. The ranking of a particular web page may be based on metrics such as the number of hyperlinks to the web page, the number of queries that result in the hyperlink to the web page, and the popularity of the web page as measured by the click-through rate of the hyperlink when selected from search results.
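By way of illustration only, the metrics listed above may be combined into a single ranking score. The following is a minimal sketch that assumes a simple weighted sum of normalized metrics; it is not any particular search engine's page ranking algorithm. The names `PageStats`, `click_through_rate`, and `rank_pages`, and the equal default weights, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    """Hypothetical per-page metrics; the field names are assumptions."""
    url: str
    inbound_links: int     # number of hyperlinks pointing to the page
    matching_queries: int  # number of queries whose results include the page
    impressions: int       # times the hyperlink was shown in search results
    clicks: int            # times the hyperlink was selected from results


def click_through_rate(page):
    """Fraction of impressions that led to a click (0.0 if never shown)."""
    return page.clicks / page.impressions if page.impressions else 0.0


def rank_pages(pages, w_links=1.0, w_queries=1.0, w_ctr=1.0):
    """Order pages by a weighted sum of normalized metrics, best first.

    The weights are illustrative assumptions, not values from the source.
    """
    # Scale the count metrics to [0, 1] so no single metric dominates.
    max_links = max((p.inbound_links for p in pages), default=0) or 1
    max_queries = max((p.matching_queries for p in pages), default=0) or 1

    def score(p):
        return (w_links * p.inbound_links / max_links
                + w_queries * p.matching_queries / max_queries
                + w_ctr * click_through_rate(p))

    return sorted(pages, key=score, reverse=True)
```

In this sketch, a page with many inbound hyperlinks and a high click-through rate is listed ahead of a page with few of either; changing the weights shifts the relative influence of each metric.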
A search engine may employ techniques such as query expansion to address alternate word forms used to describe a concept of interest in a query. In general, query expansion reformulates a seed query using synonyms and variants to match additional web pages. While query expansion allows a user to submit a search query using alternate word forms to describe a concept of interest, it typically provides search results that are less precise and that include hyperlinks to web pages beyond the concept of interest sought by the user. In addition, when a search query includes terms that refer to multiple concepts, the search results returned by a search engine may include, somewhere in the listing, a hyperlink related to the concept of interest sought by the user, but will also include hyperlinks to unrelated content.
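The trade-off described above can be illustrated with a minimal sketch, assuming a hand-built synonym table and a toy document collection. The names `SYNONYMS`, `DOCS`, `expand_query`, and `search` are hypothetical and do not correspond to any particular search engine.

```python
# Hypothetical synonym table; a real system would derive variants from
# a thesaurus, stemming, or query logs.
SYNONYMS = {
    "film": {"movie", "picture"},
}

# Toy document collection, assumed for illustration only.
DOCS = {
    "d1": "classic film reviews",
    "d2": "movie showtimes tonight",
    "d3": "picture frame shop",
}


def expand_query(query, synonyms=SYNONYMS):
    """Return the seed query terms plus any known synonyms."""
    terms = set(query.lower().split())
    expanded = set(terms)
    for term in terms:
        expanded |= synonyms.get(term, set())
    return expanded


def search(terms, documents=DOCS):
    """Return ids of documents containing at least one of the terms."""
    return [doc_id for doc_id, text in documents.items()
            if terms & set(text.lower().split())]


seed_hits = search({"film"})                # ['d1']
expanded_hits = search(expand_query("film"))  # ['d1', 'd2', 'd3']
```

Expansion retrieves the additional relevant document `d2`, but it also retrieves `d3`, which concerns picture frames rather than films. This is the loss of precision noted above.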
Approaches have been taken to resolve ambiguous references to entities that appear in the text of unstructured documents such as web pages. Named entity normalization, which is also termed named entity disambiguation, attempts to determine the unambiguous identity of a named entity that appears in text. The goal of named entity normalization is to link different surface forms of the named entity to a single corresponding referent that is identified by a canonical name in a knowledge base. As an example, attempts have been made to link mentions of named entities in web content to titles in the Wikipedia® Internet encyclopedia.
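As a minimal sketch of linking surface forms to a canonical name, the following assumes a small hand-built alias table standing in for a knowledge base. The names `ALIASES`, `build_index`, and `normalize_entity` are hypothetical. The sketch performs dictionary lookup only; it does not use surrounding context to choose among competing referents, as a full disambiguation approach would.

```python
import re

# Hypothetical alias table: canonical name -> alternate surface forms.
ALIASES = {
    "New York City": ["NYC", "New York, NY", "The Big Apple"],
    "United States": ["USA", "U.S.", "United States of America"],
}


def _key(text):
    """Lowercase and collapse punctuation so variants compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def build_index(aliases):
    """Map every normalized surface form to its canonical name."""
    index = {}
    for canonical, forms in aliases.items():
        for form in [canonical, *forms]:
            index[_key(form)] = canonical
    return index


def normalize_entity(mention, index):
    """Return the canonical name for a mention, or None if unknown."""
    return index.get(_key(mention))


INDEX = build_index(ALIASES)
```

Here `normalize_entity("the Big Apple", INDEX)` and `normalize_entity("nyc", INDEX)` both resolve to the canonical name "New York City", while a mention absent from the table resolves to `None`.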
When desiring a scoped or specific result, a user may avoid an expansive web search and instead query a web service in a particular domain related to a concept of interest. For example, a user may query a particular web service that provides on-demand streaming movies when the user desires a result such as purchasing or viewing a specific movie.
Web services that focus on particular domains typically employ structured databases or catalogs that categorize and address a limited number of items that can be served to users. Such web services may allow users to browse items in a category, such as a selected movie genre, or submit a query for the name of an item, such as a particular movie title. When querying such a web service, however, a user may be limited to submitting query terms that comply with the categorization and addressing employed by the catalog structure in order to receive a desired result. In other words, web services that employ structured catalogs to index available items are not responsive to many of the numerous surface forms that can be used to describe an item. In the foregoing example, a query will not successfully retrieve a specific movie desired by the user unless the query is limited to words appearing in the title of the movie.
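The limitation described above can be sketched as follows, assuming a toy catalog that supports only browsing by genre and matching on title words. The names `CATALOG`, `browse_genre`, and `search_by_title`, and the catalog entries themselves, are illustrative assumptions rather than any particular web service's interface.

```python
# Toy structured catalog, assumed for illustration only.
CATALOG = [
    {"title": "The Wizard of Oz", "genre": "fantasy"},
    {"title": "Casablanca", "genre": "drama"},
]


def browse_genre(genre, catalog=CATALOG):
    """Return the titles filed under a catalog category."""
    return [item["title"] for item in catalog
            if item["genre"] == genre.lower()]


def search_by_title(query, catalog=CATALOG):
    """Return titles that contain every word of the query."""
    words = set(query.lower().split())
    return [item["title"] for item in catalog
            if words and words <= set(item["title"].lower().split())]


by_title = search_by_title("wizard oz")
by_description = search_by_title("dorothy and toto movie")
```

The query "wizard oz" retrieves the movie because both words appear in its title, whereas "dorothy and toto movie" returns nothing even though it describes the same movie, because none of its words are indexed by the catalog.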