The World Wide Web (“Web”) provides a large collection of interlinked content items in various formats, including documents, images, video and other media content. As the Web has grown, it has become increasingly difficult for users to search this collection and identify content items relevant or responsive to a given query, and a number of search service providers exist to meet this need. In general, a search service provider publishes a web page through which a user may submit a query indicating terms in which the user is interested. In response to the query, the search service generates and transmits to the user a list of links to Web pages or locations of content items that are relevant to the query, typically in the form of a search results page.
Existing query response methods generally involve the following steps. First, an index or database of word/location pairs is searched using one or more search terms extracted from the query to generate a list of hits (usually target pages or sites, or references to target pages or sites, that contain the search terms or are otherwise identified as being relevant to the query). Next, the hits are ranked according to ranking criteria, with better results (according to the criteria) given more prominent placement, e.g., towards the top of the list. Finally, the ranked list of hits is transmitted to the user, usually in the form of a results page containing a list of links to the hit pages or sites.
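The three steps described above can be illustrated with a minimal sketch. It is not taken from any particular search service; the function names (build_index, search), the example URLs and the ranking criterion (a simple count of matching term occurrences) are all assumptions chosen for illustration.

```python
from collections import defaultdict

def build_index(pages):
    """Build a hypothetical index mapping each word to (url, position) pairs."""
    index = defaultdict(list)
    for url, text in pages.items():
        for position, word in enumerate(text.lower().split()):
            index[word].append((url, position))
    return index

def search(index, query):
    """Return hit URLs ranked by the number of query-term occurrences."""
    hit_counts = defaultdict(int)
    # Step 1: look up each search term in the word/location index.
    for term in query.lower().split():
        for url, _position in index.get(term, []):
            hit_counts[url] += 1
    # Step 2: rank the hits; ties are broken alphabetically so the
    # ordering is deterministic.
    return sorted(hit_counts, key=lambda url: (-hit_counts[url], url))

# Illustrative pages only; the URLs and text are made up.
pages = {
    "a.example/jaguar-cars": "jaguar cars and jaguar dealers",
    "b.example/big-cats": "the jaguar is a big cat",
    "c.example/recipes": "soup recipes",
}
index = build_index(pages)

# Step 3: the ranked list would be rendered as links on a results page.
results = search(index, "jaguar cars")
```

Here results lists the page with the most matching occurrences first, and a page containing none of the search terms does not appear at all.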
Ranking of hits is often an important factor in whether a user's search ends in success or frustration. Frequently, a query returns such a large number of hits that it is impractical for a user to explore all of the hits in a reasonable time. If the first few links a user follows fail to lead to relevant content, the user may give up on the search and possibly on the search service provider, even though relevant content might have been available farther down the ranked list of hits.
To maximize the likelihood that relevant content is prominently placed, search service providers have attempted to develop ranking criteria and algorithms. Such ranking criteria may utilize the number of occurrences or the proximity of search terms on a given web page or document. Similarly, existing algorithms may examine the placement of search terms in a given web page or document for use in ranking content items in a result set. While existing algorithms may determine the frequency of search terms in a given document or the placement of search terms, these algorithms fail to take into account the context of the search terms in a given document.
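The limitation noted above can be made concrete with a small sketch, assuming a hypothetical scorer built only on term frequency and term proximity. The helper names (term_frequency, min_distance, score) and the two example documents are invented for illustration and do not represent any specific existing algorithm.

```python
def term_frequency(words, terms):
    """Count total occurrences of the search terms in a document."""
    return sum(words.count(term) for term in terms)

def min_distance(words, term_a, term_b):
    """Smallest gap (in words) between the two terms, or None if either is absent."""
    positions_a = [i for i, word in enumerate(words) if word == term_a]
    positions_b = [i for i, word in enumerate(words) if word == term_b]
    if not positions_a or not positions_b:
        return None
    return min(abs(i - j) for i in positions_a for j in positions_b)

def score(text, term_a, term_b):
    """Return a (frequency, proximity) pair for a two-term query."""
    words = text.lower().split()
    return term_frequency(words, [term_a, term_b]), min_distance(words, term_a, term_b)

# Two made-up documents that use "jaguar" in different contexts.
car_doc = "the jaguar reaches top speed on the highway"
cat_doc = "the jaguar reaches top speed in the jungle"

car_score = score(car_doc, "jaguar", "speed")
cat_score = score(cat_doc, "jaguar", "speed")
```

Both documents receive the same frequency and proximity values even though one concerns an automobile and the other an animal, so a ranking based on these signals alone cannot prefer the document whose context matches the user's intent.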
In order to overcome shortcomings associated with existing techniques for identifying and ranking content items in response to a search query, embodiments of the present invention provide systems and methods for determining the context of one or more terms associated with an item of content to identify the most relevant items of content responsive to a given search query.