The global Internet has become a mass medium on par with radio and television. And just like radio and television content, the content on the Internet is largely supported by advertising dollars. The main advertising-supported portion of the Internet is the “World Wide Web” that displays HyperText Markup Language (HTML) documents distributed using the HyperText Transfer Protocol (HTTP).
Two of the most common types of advertisements on the World Wide Web portion of the Internet are banner advertisements and text link advertisements. Banner advertisements are generally images or animations that are displayed within an Internet Web page. Text link advertisements are generally short segments of text that are linked to the advertiser's Web site.
As with any advertising-supported business model, there need to be metrics for assigning monetary value to advertising on the World Wide Web. Radio stations and television stations use listener and viewer ratings services that assess how many people are listening to a particular radio program or watching a particular television program in order to assign a monetary value to advertising on that particular program. Radio and television programs with more listeners or viewers are assigned larger monetary values for advertising since more people are exposed to the advertisement. With Internet banner type advertisements, a similar metric may be used. For example, the metric may be the number of times that a particular Internet banner advertisement is displayed to people browsing various Web sites. Each display of an Internet advertisement to a Web viewer is known as an “impression.”
In contrast to traditional mass media, the Internet allows for interactivity between the media publisher and the media consumer. Thus, when an Internet advertisement is displayed to a Web viewer, the Internet advertisement may include a link that points to another Web site where the Web viewer may obtain additional information about the advertised product or service. A Web viewer may therefore ‘click’ on an Internet advertisement (place a cursor on the advertisement and then press a button) to be directed to a Web site designated by the advertiser that contains additional information on the advertised product or service. When a Web viewer selects an advertisement, this is known as a ‘click-through’ since the Web viewer ‘clicks through’ the advertisement to see the advertiser's designated Web site. Advertising services record every click-through that occurs for an Internet advertisement.
A click-through on an Internet advertisement clearly has value to the advertiser since an interested Web viewer has indicated a desire to see the advertiser's Web site. Thus, an entity wishing to advertise on the Internet may wish to pay for such click-through events instead of paying for displayed Internet advertisements. Many Internet advertising services have therefore been offering Internet advertising wherein advertisers only pay for Web viewers that click on the Web based advertisements. This type of advertising model is often referred to as the “pay-per-click” advertising model since the advertisers only pay when a Web viewer clicks on an advertisement.
With such pay-per-click advertising models, Internet advertising services must display advertisements that are most likely to capture the interest of the Web viewer to maximize the advertising fees that may be charged. In order to achieve this goal, it would be desirable to be able to select Internet advertisements that most closely match the context in which the advertising is displayed. In other words, the selected Internet advertisement should be relevant to the surrounding content on the Web site. Thus, advertisements are often placed in contexts that match the product at a topical level. For example, an advertisement for running shoes may be placed on a sports news page. Information retrieval systems have been designed to capture simple versions of such “relevance.” Examples of such information retrieval systems can be found in the book “Modern Information Retrieval” by Baeza-Yates, R. and Ribeiro-Neto, B. A., ACM Press/Addison-Wesley, 1999.
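As a rough illustration of the simple topical “relevance” described above, the following minimal sketch scores each candidate advertisement against the page text by the cosine similarity of their term-frequency vectors and selects the highest-scoring one. It is only one elementary technique of the kind covered in information retrieval texts, not a description of any particular advertising service. The function names, the stop-word list, and the sample page and advertisement texts are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "and", "the", "for", "of", "to", "in", "on", "at", "is", "it", "with"}


def content_terms(text):
    """Lowercase the text, split it into words, and count the non-stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


def cosine_relevance(page_text, ad_text):
    """Cosine similarity between the term-frequency vectors of page and ad."""
    page = content_terms(page_text)
    ad = content_terms(ad_text)
    dot = sum(page[t] * ad[t] for t in page.keys() & ad.keys())
    norm = (math.sqrt(sum(c * c for c in page.values()))
            * math.sqrt(sum(c * c for c in ad.values())))
    return dot / norm if norm else 0.0


def select_ad(page_text, ads):
    """Return the candidate advertisement with the highest relevance score."""
    return max(ads, key=lambda ad: cosine_relevance(page_text, ad))


# Illustrative (made-up) sports news page and candidate advertisements.
page = "Marathon results: runners praise new running shoes at the city race"
ads = [
    "Lightweight running shoes for every runner",
    "Low-rate home mortgage loans",
]
print(select_ad(page, ads))
```

In this sketch the running shoe advertisement is selected only because it literally shares the words “running” and “shoes” with the page; the mortgage advertisement shares no words and scores zero.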
However, the language of advertising has evolved in a manner that often makes it difficult to easily determine relevance. For example, modern advertisements seek to communicate the maximum information in the fewest possible words. Advertisements are designed to be memorable, to elicit emotions or associations, to provide key information, and to imply meaning. But all of these objectives should be achieved with a small number of words.
Because of the brevity of modern advertisements, words are very carefully chosen so as to imply information without necessarily stating it directly. For example, the slogan “I can't believe it's not butter!” implies that butter is preferable, and that this product is indistinguishable from butter. Furthermore, advertisers make use of slogans or cultural associations to carry the advertising message, such as the slogan “Got milk?”.
The brevity of modern advertisements presents a challenge to contextual advertisement selection systems. Since there are few terms in a modern advertisement representation, an advertisement may not contain any terms directly identifying the product or product category. In fact, a modern advertisement representation may not directly contain any content terms. Traditional information retrieval techniques will largely fail in these cases because of the inherent compactness and brevity of the advertisement representation. Understanding a modern advertisement involves inference processes that can be quite sophisticated and well beyond what traditional information retrieval systems are designed to cope with. Due to these difficulties, it would be desirable to have advertisement selection systems that extend beyond simple concepts of relevance handled by existing information retrieval systems.
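The difficulty can be made concrete with a small sketch. Assuming a simple matcher that only looks for content words shared by the page and the advertisement, a slogan such as “I can't believe it's not butter!” shares no terms with a page about butter substitutes, while a more literal advertisement does. The page text, the literal advertisement, the stop-word list, and the function names below are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "and", "the", "for", "of", "to", "in", "that",
              "i", "it's", "not", "can't", "other", "from"}


def content_terms(text):
    """Return the set of lowercase non-stop words found in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return {w for w in words if w not in STOP_WORDS}


def shared_terms(page_text, ad_text):
    """Content terms that appear in both the page and the advertisement."""
    return content_terms(page_text) & content_terms(ad_text)


# Made-up page text for a context that would arguably suit a butter substitute.
page = "Recipes that swap in margarine and other vegetable oil spreads for healthier baking"

slogan = "I can't believe it's not butter!"
literal_ad = "Margarine spread made from vegetable oil"

print(shared_terms(page, slogan))      # no overlap: an empty set
print(shared_terms(page, literal_ad))  # overlap on margarine, vegetable, oil
```

A matcher of this kind has no way to infer that the slogan concerns a butter substitute, which is the sort of inference the preceding paragraph describes as beyond traditional information retrieval techniques.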