Today's search engines follow a decade-old paradigm in presenting search results to a user. In response to a user query, typically expressed as a few keywords and often as just one or two words, current search engines use a proprietary ranking algorithm to return the documents deemed most relevant to the query. The factors that go into computing the relevance of a page include, to name a few, the authoritativeness of other pages on the web that point to the page under consideration and the number of people accessing the page (via clicks).
A key problem with the above paradigm is that the meaning of the keywords used to express a query is often ambiguous. It is thus difficult for the search engine to correctly ‘guess’ user intent and return results that satisfy the actual intent of the specific user issuing the query. For example, different users who issue the query flash may be looking for very different information. A first user may be looking for the Adobe Flash player, while a second may be interested in information about Flash Gordon, the adventure hero, and a third may be interested in the location Flash, which happens to be the village with the highest elevation in England. In general, a very large number of queries, particularly the short, popular ones, belong to multiple categories of information and have multiple interpretations.
Current engines do not consider the multiple possible intents of a query when presenting search results. Consider again the query flash. In a recent sampling conducted by the inventors hereof, the first result page for this query on Live contained eight documents related to Adobe Flash, one related to camera flash, and one related to the band Grandmaster Flash. Similarly, on Google, the first result page contained seven documents related to Adobe Flash, one related to the Stanford Flash project, one related to a home security system, and one related to an online music store. Clearly, the first user described above would be satisfied with these search results, but the second and third would not.
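The sampling described above can be tallied in a few lines to show how little of the first result page serves the less common intents. The following is a minimal illustrative sketch only: the per-category counts are those quoted in the text, while the intent labels, the dictionary layout, and the helper name `uncovered_intents` are assumptions introduced here for illustration and are not part of any search engine's interface.

```python
# Counts quoted in the text for the first result page of the query "flash".
# The category labels are illustrative names chosen for this sketch.
first_page = {
    "Live": {
        "Adobe Flash": 8,
        "camera flash": 1,
        "Grandmaster Flash": 1,
    },
    "Google": {
        "Adobe Flash": 7,
        "Stanford Flash project": 1,
        "home security system": 1,
        "online music store": 1,
    },
}

# The three example user intents described in the text (labels are hypothetical).
user_intents = ["Adobe Flash", "Flash Gordon", "Flash (village)"]


def uncovered_intents(result_counts, intents):
    """Return the intents for which the result page contains no document."""
    return [intent for intent in intents if result_counts.get(intent, 0) == 0]


for engine, counts in first_page.items():
    total = sum(counts.values())
    dominant_share = counts["Adobe Flash"] / total
    missing = uncovered_intents(counts, user_intents)
    print(f"{engine}: {dominant_share:.0%} of {total} results are Adobe Flash; "
          f"intents with no result: {missing}")
```

Under the counts quoted above, Adobe Flash accounts for eight of the ten results on Live and seven of the ten on Google, and neither page contains a document for the second or third user's intent.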