The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
Under current approaches, when a polysemous word is submitted to a query engine, the query engine returns search results linking to documents associated with all of the meanings of the word. The user is left to sift through the search results to locate the documents relating to the intended meaning of the words in the search.
For example, suppose the word “fencing” is queried. “Fencing” may refer to a sport, a structure that delineates a land boundary, or the act of selling stolen goods. In prior searching approaches, hyperlinks to web pages relating to any or all of the three meanings would be returned to the user.
In an approach, other words that are frequently submitted with the target word are suggested to the user to narrow the search. These query extensions are determined from analyzing past query data. For example, a user who once submitted the query “fencing” and desired results relating to “fencing” as a sport may have been dissatisfied with broad search results. Such a user would submit a follow-up query, “fencing epée,” in order to narrow the results returned by the search engine. If this search pattern is repeated over many submissions, the second query, “fencing epée,” becomes strongly associated with the first query, “fencing,” and will be returned to the user as a suggested narrowing query.
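The association between a query and its follow-up queries can be illustrated with a minimal sketch. The function name, the session format, and the thresholds below are hypothetical and chosen only for illustration; the sketch assumes that past query data is available as ordered lists of queries per user session, and that a follow-up counts as a narrowing query when it extends the earlier query with additional words.

```python
from collections import Counter, defaultdict


def suggest_narrowing_queries(sessions, min_count=2, top_n=3):
    """Map each query to its most frequent narrowing follow-up queries.

    sessions: list of lists of query strings, each in submission order
              (a hypothetical format for past query data).
    min_count: minimum number of repetitions before a follow-up is suggested.
    top_n: maximum number of follow-ups considered per query.
    """
    follow_ups = defaultdict(Counter)
    for queries in sessions:
        # Look at each pair of consecutive queries in a session.
        for first, second in zip(queries, queries[1:]):
            # Assumption: a narrowing query repeats the first query
            # and appends one or more extra words.
            if second.startswith(first + " "):
                follow_ups[first][second] += 1
    return {
        query: [s for s, count in counts.most_common(top_n) if count >= min_count]
        for query, counts in follow_ups.items()
    }


# Hypothetical past query data: three users narrowed "fencing" the same way.
sessions = [
    ["fencing", "fencing epée"],
    ["fencing", "fencing epée"],
    ["fencing", "fencing panels"],
    ["fencing", "fencing epée"],
]
suggestions = suggest_narrowing_queries(sessions)
print(suggestions)
```

In this sketch, a follow-up query is suggested only after the same pattern has been observed at least `min_count` times, which is why the approach depends on a large volume of accumulated query data.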
However, a search engine may require six months or more of collected query submission data before such correlations propagate through the search engine. It is therefore not desirable to detect and classify correlations of polysemous words by accumulating query data from real user queries.
Advertisers who target advertisements to users depending on the terms used in a particular query also encounter problems with polysemous words. In a past approach, advertisements provided to a user may not correlate with the interests of the user because the query consisted of a polysemous word. An inappropriate advertisement would displace appropriate ones. In the prior approach, in order to ensure that an advertisement was presented to the correct audience, advertisers needed to specify the particular conjunctive keywords in queries that would trigger the display of an advertisement. A supplier of sport fencing goods would explicitly specify, for example, “fencing epée,” “fencing sabre,” “fencing foil,” and “fencing tournament” as the queries that would trigger the display of a fencing advertisement.
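The conjunctive keyword triggering of the prior approach can be sketched as follows. The function names and the matching rule are assumptions made for illustration only: the sketch assumes that an advertisement is triggered when every word of at least one advertiser-specified keyword phrase appears in the query.

```python
def matches_trigger(query, trigger):
    """Return True when every word of the trigger appears in the query."""
    return set(trigger.lower().split()) <= set(query.lower().split())


def should_show_ad(query, triggers):
    """Return True when the query satisfies at least one conjunctive trigger."""
    return any(matches_trigger(query, trigger) for trigger in triggers)


# Keyword phrases explicitly specified by a supplier of sport fencing goods.
SPORT_FENCING_TRIGGERS = [
    "fencing epée",
    "fencing sabre",
    "fencing foil",
    "fencing tournament",
]

# A query containing both words of a specified phrase triggers the advertisement.
shown_for_sabre = should_show_ad("fencing sabre lessons", SPORT_FENCING_TRIGGERS)

# The bare polysemous word does not, which keeps the advertisement away from
# users interested in boundary fencing.
shown_for_bare_word = should_show_ad("fencing", SPORT_FENCING_TRIGGERS)

# A hypothetical sport-related query that the advertiser did not anticipate
# also fails to trigger the advertisement.
shown_for_unanticipated = should_show_ad("fencing lunge technique", SPORT_FENCING_TRIGGERS)

print(shown_for_sabre, shown_for_bare_word, shown_for_unanticipated)
```

Under this assumed matching rule, the advertisement reaches only the audiences whose queries the advertiser enumerated in advance.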
However, advertisers are not able to predict all the variations of queries submitted by users who may be interested in the advertisers' goods, and therefore miss key opportunities to display advertising to an ideal audience. For example, when fencer Mariel Zagunis won the gold medal in the 2004 Olympic Games, an event that may have created an overnight surge of queries on her name, it would have been desirable for sport fencing advertisements to be displayed in conjunction with queries having her name together with the word “fencing.” In a previous approach, sport fencing advertisers would have needed to add “Mariel” and “Zagunis” to a search engine's keyword list in order for advertisements to be displayed in response to a query of the name. Such manual tracking of keywords is time-consuming and should be avoided. Based on the foregoing, there is a great need to automatically and quickly update a search engine with words that have become newly correlated with polysemous words.