Contextual advertising aims to deliver the most relevant advertisements (ads) based on keywords extracted from a web page. It is one of the most successful business models for monetizing a publisher's traffic. However, traditional keyword extraction methods rely mainly on the frequency of “words” and the positions at which they occur, rather than on the content itself. This may result in irrelevant ad suggestions.
Keyword advertising services, such as content-targeted advertising, have grown to become a primary revenue source for many web service providers and a significant part of the search engine market. A typical content-based advertising service extracts a few representative keywords from a given web page, and then uses these keywords to search a large repository of ads for relevant advertisements. The selected ads are then displayed together with the web page and made visible to the user. If a user clicks on the link of an ad, the advertiser is charged a fee that is shared by the web page owner and the advertising service provider. Accurate keyword extraction from the web page is critical to ensuring the delivery of relevant ads to the right users, and therefore to enabling higher income for both the web page owner and the advertising service. It has been shown that there exists a strong correlation between the accuracy of keyword extraction and the click-through rate of delivered ads.
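The matching step of such a service can be illustrated with a minimal sketch. It assumes that each ad in the repository is tagged with its own keywords and that relevance is measured by simple keyword overlap; the function name `select_ads`, the repository layout, and the ad identifiers are hypothetical and are not taken from any particular advertising service.

```python
def select_ads(page_keywords, ad_repository, max_ads=2):
    """Rank ads by how many keywords they share with the page.

    ad_repository maps a (hypothetical) ad id to that ad's keywords.
    Ads with no shared keyword are never selected.
    """
    page_set = {k.lower() for k in page_keywords}
    scored = []
    for ad_id, ad_keywords in ad_repository.items():
        overlap = len(page_set & {k.lower() for k in ad_keywords})
        if overlap > 0:
            scored.append((overlap, ad_id))
    # Highest overlap first; ties are broken alphabetically by ad id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [ad_id for _, ad_id in scored[:max_ads]]


# Illustrative data only: a page about baking, and three made-up ads.
ads = {
    "ad_bakery": ["pie", "bakery"],
    "ad_phone": ["apple", "phone"],
    "ad_car": ["car"],
}
selected = select_ads(["apple", "pie", "recipe"], ads)
```

In this made-up example the phone ad is selected alongside the bakery ad because the keyword "apple" matches it literally, which shows how the quality of the extracted keywords directly determines the relevance of the delivered ads.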
Current keyword extraction algorithms may be based on either heuristic rules or supervised learning. These methods may use only term frequency and document structural information, and may not leverage the semantics of words. As a result, they may produce inconsistent outputs and degrade a system's accuracy.
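A heuristic extractor of the kind described above might look like the following sketch. It is an assumed baseline for illustration, not a specific published algorithm: terms are scored by raw frequency in the body, with a fixed boost for terms that appear in the title as a stand-in for structural information. The stopword list, the `title_boost` value, and the function name `extract_keywords` are all hypothetical choices.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def extract_keywords(title, body, top_k=3, title_boost=2.0):
    """Score terms by body frequency plus a boost for title occurrences."""
    scores = Counter()
    for word in tokenize(body):
        scores[word] += 1.0
    for word in tokenize(title):
        scores[word] += title_boost
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_k]]


keywords = extract_keywords(
    "Apple pie recipe",
    "This apple pie uses fresh apple slices. Bake the pie for an hour.",
)
```

The scores depend only on how often and where a term occurs. Nothing in the sketch distinguishes "apple" the fruit from "Apple" the company, which is the kind of semantic gap the paragraph above refers to.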