Providing targeted content can benefit both the provider and the recipient. For example, in an advertising context, both the advertiser and the consumer benefit from targeted ads: the consumer receives ads that are relevant to his or her interests, and the advertiser gets an improved response to those targeted ads. In order to provide targeted content, the provider must possess and effectively utilize information about the recipient, and must also possess and effectively utilize information about the pool of content from which the targeted content will be selected.
Accordingly, it may be beneficial to provide targeted content, such as targeted advertisements on a web page. However, there are known problems in such scenarios, both in acquiring information about the recipient of the advertisements and in effectively utilizing that information to provide relevant targeted advertisements.
The problem of acquiring information about a recipient, and specifically a recipient of advertisements on a web page, is known as a classification problem. A significant portion of this classification problem is in classifying the current context of the recipient. There are two common approaches to the context classification problem typically associated with providing targeted content, particularly in providing targeted advertising on a web page: the bucket of words approach and natural language processing.
The bucket of words approach utilizes a context independent analysis of text to determine which words are being used more often than statistically expected in order to determine the subject matter of the text. This approach can be applied to both the web page content and the advertisement content. For example, through analysis of a web page it may be determined that the words “allergy” and “pollen” appear more often than statistically expected. The bucket of words approach interprets this occurrence as demonstrating that the web page content is directed to seasonal allergies. The content provider may then use the results of that analysis to determine that visitors to this web page are more likely than the general population to be interested in advertisements regarding seasonal allergy medication and provide an appropriately targeted advertisement. The bucket of words solution is fairly inaccurate because the words are analyzed without regard to context or to their relationship to other words on the web page. As a result, this solution often does not provide strong contextual relationships, its results can be skewed heavily by inadequate and/or false information, and the content it selects is therefore not optimally targeted.
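The word-frequency analysis described above can be illustrated with a minimal Python sketch. It is not taken from any particular system. The function name `overrepresented_words`, the baseline frequency table, and the `min_ratio`, `min_count`, and `default_freq` thresholds are all hypothetical values chosen only for illustration.

```python
import re
from collections import Counter


def overrepresented_words(text, baseline_freq, min_ratio=5.0,
                          min_count=2, default_freq=1e-4):
    """Return {word: ratio} for words used more often than expected.

    baseline_freq maps a word to its assumed relative frequency in
    general text (hypothetical numbers, not measured data). Words
    missing from the table fall back to default_freq.
    """
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {}
    flagged = {}
    for word, count in Counter(words).items():
        if count < min_count:
            continue  # ignore words seen too rarely to judge
        observed = count / total
        expected = baseline_freq.get(word, default_freq)
        ratio = observed / expected
        if ratio >= min_ratio:
            flagged[word] = ratio
    return flagged


# Hypothetical baseline frequencies for a handful of words.
baseline = {"pollen": 0.0005, "allergy": 0.0005, "to": 0.02, "and": 0.03}

page = ("Pollen counts are high this week. Allergy sufferers should "
        "limit exposure to pollen, and allergy medication may help "
        "with pollen symptoms.")

print(sorted(overrepresented_words(page, baseline)))
```

Because each word is counted in isolation, a page stating that a product does not treat pollen allergy would be flagged in the same way, which is the context-independence weakness noted above.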
The natural language processing approach utilizes the basic concepts of the bucket of words approach, but uses contextual extraction (e.g., identifying nouns, verbs, etc.) to improve the accuracy of the results. Although this approach improves the accuracy of the results, it is also a much slower process, particularly because the content of the web page must be prefiltered in order for the analysis to be effective. Because certain contextual clues are dependent on the vertical market addressed by the web page (that is, the subject matter, such as trade based content or content based on specialized needs, for example, medical or mechanical engineering), different filters must be used for each vertical market. Prefiltering requires human involvement and therefore decreases the efficiency of the process by requiring important steps to be performed offline. As a result, natural language processing cannot be used to run an online real-time analysis of web pages to provide targeted content.
While it is possible to apply the bucket of words approach and the natural language processing approach to classify the targeted content, in many cases related web pages and advertisements are difficult to match together because the classification trees for each are not congruous, even though the subject matter may be. These problems can be dealt with by adding another layer of human involvement in the process, further decreasing efficiency, or by accepting further limitations on optimizing the targeted content.
The bucket of words approach and the natural language processing approach are therefore not complete solutions to the problems associated with providing targeted content. The results provided by these approaches are simply groups of words, such as grammar graphs, that may be used to identify the context of the group of words analyzed. However, these sets of words do not provide any map or instructions to link the words/context to targeted content. Moreover, neither solution is capable of analyzing large numbers of words with respect to each of the other words in the set. For example, a naïve Bayes classifier, or similar independent feature model, is only capable of computing pairs or tuples at best, before the model becomes too complex and computationally intractable.
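The scaling difficulty noted above can be made concrete with a short Python sketch that counts how many distinct word groups would have to be considered as the group size grows. The vocabulary size of 10,000 and the function name `word_group_count` are assumptions chosen only for illustration. The count shown is simple combinatorics, not a model of any specific classifier.

```python
from math import comb


def word_group_count(vocabulary_size, group_size):
    """Number of distinct unordered groups of group_size words that
    can be drawn from a vocabulary of vocabulary_size words."""
    return comb(vocabulary_size, group_size)


# Hypothetical vocabulary of 10,000 distinct words.
VOCABULARY_SIZE = 10_000

for group_size in (1, 2, 3, 4):
    groups = word_group_count(VOCABULARY_SIZE, group_size)
    print(f"groups of {group_size}: {groups:,}")
```

Under these assumptions, each additional word in a group multiplies the number of candidate groups by several thousand, which suggests why relating every word to every other word in a large set quickly becomes impractical.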
A typical solution for online processing problems is to add more processing power. However, the challenges presented by the classification problem cannot be addressed simply by increasing the processing power of the system. Accordingly, an entirely new approach must be developed in order to provide an improved solution to the classification problem for providing targeted content.
Therefore, a need exists for a system and method wherein targeted content can efficiently be provided while also providing a strong contextual relationship.