1. Field of the Invention
The present invention relates to methods and systems for selecting internet advertisements in response to an internet query, and more particularly to methods and systems for generating expanded search queries to improve the relevance of the advertisements found.
2. Description of the Related Art
Internet advertising methods retrieve ads for use in a sponsored search or content match system. Both types of advertising systems use a similar representation of the ad: an advertisement is represented by a set of keywords, a title, a short description, and a URL which, when clicked, takes the user to the advertiser's web page. Typically the user is shown the title, the description, and the URL.
Sponsored search presents advertisements in the search results in response to a user's query to a search engine. Typically the ads are short and textual in nature, and appear at the top of the search results or to the side. The keywords are typically matched to the user query, and when an ad is found whose keywords match the user query, the ad is shown to the user.
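The keyword matching described above can be illustrated with a minimal Python sketch. The `Ad` record, the `tokenize` helper, and the rule that an ad matches when it shares at least one term with the query are assumptions made for illustration; actual systems may use exact, phrase, or broader matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ad:
    """Hypothetical ad record: keywords, title, short description, and URL."""
    title: str
    description: str
    url: str
    keywords: frozenset


def tokenize(text):
    """Lowercase the text and split it on whitespace into a set of terms."""
    return set(text.lower().split())


def match_ads(query, ads):
    """Return the ads whose keywords share at least one term with the query.

    The one-term overlap rule is an assumption, not a documented criterion.
    """
    query_terms = tokenize(query)
    return [ad for ad in ads if query_terms & ad.keywords]


# Illustrative data only.
sample_ads = [
    Ad("Trail Shoes", "Lightweight running shoes.", "https://example.com/shoes",
       frozenset({"running", "shoes"})),
    Ad("Pizza Place", "Hot pizza delivered fast.", "https://example.com/pizza",
       frozenset({"pizza", "delivery"})),
]
matched = match_ads("cheap running shoes", sample_ads)
```

Only the title, description, and URL of each matched ad would be shown to the user; the keywords serve solely as the retrieval key.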
In a content match system, the ads are placed in a web page based on the content of the web page itself. The system extracts a set of key terms from the web page to represent its content, and then matches the key terms from the web page to the keywords associated with the advertisement. Ads are dynamically placed in a web page as a function of the expected revenue to be generated, and the similarity between the web page and the ad keywords.
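As a rough illustration of content match, the following Python sketch extracts key terms from a page and ranks ads against them. The frequency-based key term extraction, the Jaccard similarity, the stopword list, and the use of a simple product of similarity and expected revenue are all assumptions; the text above states only that placement is a function of both quantities.

```python
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "for", "is", "on", "with"}


def key_terms(page_text, k=5):
    """Pick the k most frequent non-stopword tokens as the page's key terms."""
    tokens = [t.strip(".,;:!?").lower() for t in page_text.split()]
    counts = Counter(t for t in tokens if t and t not in STOPWORDS)
    return {term for term, _ in counts.most_common(k)}


def jaccard(a, b):
    """Jaccard similarity between two sets of terms."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_ads(page_text, ads, k=5):
    """Rank ads by similarity * expected revenue, dropping zero-score ads.

    `ads` is a list of (ad_id, keyword_set, expected_revenue) tuples.
    """
    terms = key_terms(page_text, k)
    scored = [(jaccard(terms, kw) * revenue, ad_id) for ad_id, kw, revenue in ads]
    return [ad_id for score, ad_id in sorted(scored, reverse=True) if score > 0]


# Illustrative data only.
page = "Marathon training tips: running shoes, running plans, and marathon nutrition."
candidate_ads = [
    ("A", {"running", "marathon"}, 0.5),
    ("B", {"running"}, 2.0),
    ("C", {"pizza", "delivery"}, 1.0),
]
ranking = rank_ads(page, candidate_ads)
```

In this example ad B outranks ad A despite a lower keyword similarity, because its higher expected revenue dominates the product; ad C shares no terms with the page and is not placed.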
Both sponsored search and content match systems rely on sentence retrieval technology to retrieve ad candidates to be shown to the user. In sponsored search the sentence retrieval is in response to a user query. In content match the sentence retrieval is in response to a set of key terms that represent the topic of the web page, but the same retrieval technology can be applied to both systems.
The quality of sentence retrieval in response to a query depends on numerous factors, such as the number of terms in the query, the specificity of the query, the number of potential meanings for query terms, the quality of the retrieval mechanisms, the amount of time available for retrieval, etc. Some of the applications for sentence retrieval include question answering, result abstracts related to internet URLs (Uniform Resource Locators), and selection of advertising based on the provided query.
Online advertising systems operate on short textual descriptions of the ad, typically including a title, a description of one or two sentences in length, a set of keywords, and a search context. For example, the search context can be either a Web page in the case of contextual advertising, or a query in the case of a sponsored search. In this document, the term ad materials refers to the set of title, description, and keywords that make up an Internet advertisement. The term advertisement refers to the subset of these materials that is shown to a user in the search interface.
Pseudo-relevance feedback has been shown to be effective for document retrieval, but it has had mixed results when applied to the retrieval of short texts such as sentences. The term pseudo-relevance feedback is related to relevance feedback, in which a user gives the system feedback on the relevance of the results from a first search, and the system then performs another search biased toward the documents with the better scores. Some systems use a “more like this” button to implement relevance feedback. In pseudo-relevance feedback, the system simulates this step: it treats the top-ranked results of the first search as relevant, without input from the user, before performing another, more focused search.
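A minimal Python sketch of pseudo-relevance feedback follows. The term-overlap scoring function, the parameters `top_k` and `n_terms`, and the choice of the most frequent new terms as expansion terms are assumptions made for illustration; practical systems typically use weighted retrieval models and more careful term selection.

```python
from collections import Counter


def score(query_terms, doc_tokens):
    """Toy relevance score: the number of distinct query terms in the document."""
    return len(set(query_terms) & set(doc_tokens))


def search(query_terms, docs):
    """Return indices of documents with a non-zero score, best first."""
    ranked = sorted(range(len(docs)),
                    key=lambda i: score(query_terms, docs[i]),
                    reverse=True)
    return [i for i in ranked if score(query_terms, docs[i]) > 0]


def expand_query(query_terms, docs, top_k=2, n_terms=2):
    """Treat the top_k first-pass results as relevant and append their
    n_terms most frequent terms that are not already in the query."""
    top = search(query_terms, docs)[:top_k]
    counts = Counter(t for i in top for t in docs[i] if t not in query_terms)
    return list(query_terms) + [t for t, _ in counts.most_common(n_terms)]


# Illustrative data only: each document is a list of tokens.
docs = [
    ["jaguar", "car", "dealer"],
    ["jaguar", "car", "lease"],
    ["jaguar", "cat", "habitat"],
    ["pizza", "delivery"],
]
expanded = expand_query(["jaguar"], docs, top_k=2, n_terms=1)
second_pass = search(expanded, docs)
```

Here the expansion adds "car" because the two top-ranked documents happen to concern automobiles. If the user meant the animal, the second search is biased away from the intended topic, which illustrates how the quality of the first-pass results determines the quality of the expansion.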
In particular, pseudo-relevance feedback is not very effective when retrieving short texts, such as advertisements. Advertisements are sensitive to expansion because their term frequency distribution is relatively flat, so even a small number of noisy expansion terms may pull the expanded query completely off the intended topic of the original query.
It is in this context that embodiments of the invention arise.