§1.1 Field of the Invention
The present invention concerns advertising, such as online advertising for example. In particular, the present invention concerns selecting and/or scoring content-relevant advertisements (“ads”).
§1.2 Background Information
Advertising using traditional media, such as television, radio, newspapers and magazines, is well known. Unfortunately, even when armed with demographic studies and entirely reasonable assumptions about the typical audience of various media outlets, advertisers recognize that much of their advertising budget is simply wasted. Moreover, it is very difficult to identify and eliminate such waste.
Recently, advertising over more interactive media has become popular. For example, as the number of people using the Internet has exploded, advertisers have come to appreciate media and services offered over the Internet as a potentially powerful way to advertise.
Interactive advertising provides opportunities for advertisers to target their ads to a receptive audience. That is, targeted ads are more likely to be useful to end users since the ads may be relevant to a need inferred from some user activity (e.g., relevant to a user's search query to a search engine, relevant to content in a document requested by the user, etc.). Query keyword targeting has been used by search engines to deliver relevant ads. For example, the AdWords advertising system by Google Inc. of Mountain View, Calif. (referred to as “Google”), delivers ads targeted to keywords from search queries. Similarly, content targeted ad delivery systems have been proposed. For example, U.S. patent application Ser. No. 10/314,427 (incorporated herein by reference and referred to as “the '427 application”), titled “METHODS AND APPARATUS FOR SERVING RELEVANT ADVERTISEMENTS”, filed on Dec. 6, 2002 and listing Jeffrey A. Dean, Georges R. Harik and Paul Buchheit as inventors; and Ser. No. 10/375,900 (incorporated by reference and referred to as “the '900 application”), titled “SERVING ADVERTISEMENTS BASED ON CONTENT,” filed on Feb. 26, 2003 and listing Darrell Anderson, Paul Buchheit, Alex Carobus, Claire Cui, Jeffrey A. Dean, Georges R. Harik, Deepak Jindal and Narayanan Shivakumar as inventors, describe methods and apparatus for serving ads relevant to the content of a document, such as a Web page for example. Content targeted ad delivery systems, such as the AdSense advertising system by Google for example, have been used to serve ads on Web pages.
As can be appreciated from the foregoing, serving ads relevant to concepts of text in a text document is useful because such ads presumably concern a current user interest. Consequently, such online advertising has become increasingly popular. However, such content-targeted ad delivery systems can be improved. For example, the '900 application describes how so-called “targeting criteria” (or simply “criteria”, whether used in the singular or plural form), used to look up relevant ads, may be determined in an exemplary embodiment. Specifically, the '900 application describes that an off-line (perhaps nightly) dump of a complete ads database is used to generate an index that maps topics (e.g., PHIL cluster identifiers) to a set of matching ad groups. This may be done using one or more of (i) a set of serving constraints (targeting criteria) within the ad group, (ii) text of the ads within the ad group, (iii) content on the advertiser's Web site, etc. U.S. Provisional Application Ser. No. 60/416,144 (incorporated herein by reference and referred to as “the '144 application”), titled “Methods and Apparatus for Probabilistic Hierarchical Inferential Learner,” filed on Oct. 3, 2002, and U.S. patent application Ser. No. 10/676,571 (referred to as “the '571 application” and incorporated herein by reference), titled “Methods and Apparatus for Characterizing Documents Based on Cluster Related Words,” filed on Sep. 30, 2003 and listing Georges Harik and Noam Shazeer as inventors, describe exemplary ways, that may be used in a manner consistent with the principles of the present invention, to determine one or more concepts or topics of information.
The '900 application further describes that a document identifier (e.g., a URL) may be used to associate a document with one or more ads. For example, the document information may have been processed to generate relevance information, such as a cluster (e.g., a PHIL cluster), a topic, etc. The document clusters may then be used as query terms in a large OR query to the index that maps topics (e.g., PHIL cluster identifiers) to a set of matching ad groups.
The results of this query may then be used as a first-cut set of candidate criteria. More specifically, the candidate ad groups may then be used to determine an actual information retrieval (IR) score for each ad group summarizing how well the criteria information plus the ad text itself matches the document relevance information. Estimated or known performance parameters (e.g., selection rates, conversion rates, etc.) for the ad group may be considered in helping determine the best scoring ad group(s). Targeting criteria associated with the best scoring ad group(s) can be used as “criteria” to determine a final set of ads.
A content-relevant ad server can use the set of one or more “criteria” to request ads. The provided ads may participate in an arbitration (e.g., an auction) to place the ads in available ad spots, to provide the ads with enhanced features or treatments (e.g., enhanced colors, enhanced fonts, images, animation, etc.), etc.
There are many ways of selecting a set of ads given a set of one or more “criteria.” For example, a requestor may request that an ad be sent back if K of the M criteria sent match a single ad group. One version of AdSense from Google determined a set of ads given a set of one or more criteria as follows. Suppose a list of the 60 best criteria is provided. Such criteria could be grouped into a sequence of queries, for example:
raw_query0=“crit0_OR_crit1_OR_crit2 . . . _OR_crit15”;
raw_query1=“crit16_OR_crit17_OR_crit18 . . . _OR_crit30”;
raw_query2=“crit31_OR_crit32_OR_crit33 . . . _OR_crit45”; and
raw_query3=“crit46_OR_crit47_OR_crit48 . . . _OR_crit60”.  [1]
Each of the queries could be processed in sequence until any ad, without regard to the number of ads, is returned. For example, raw_query0 could be processed. If any ads matched any of crit0 through crit15, these ads would be returned (for subsequent processing) and the other raw queries would not be processed. If no ads were returned for raw_query0, raw_query1 would be processed, and so on.
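The sequential processing just described can be sketched as follows. This is a minimal illustration only, not the actual implementation: the function names (chunk_criteria, first_nonempty_batch), the group size of 15, and the in-memory ad_index mapping from a criterion to the ads it matches are all assumptions made for the example.

```python
from typing import Dict, List, Sequence


def chunk_criteria(criteria: Sequence[str], size: int) -> List[List[str]]:
    """Split relevance-ordered criteria into consecutive OR-query groups."""
    return [list(criteria[i:i + size]) for i in range(0, len(criteria), size)]


def first_nonempty_batch(
    groups: Sequence[Sequence[str]],
    ad_index: Dict[str, List[str]],
) -> List[str]:
    """Process groups in order and stop at the first group that yields any ad.

    ad_index is a hypothetical lookup table: criterion -> matching ads.
    """
    for group in groups:
        matched: List[str] = []
        for crit in group:  # an OR query: any matching criterion suffices
            for ad in ad_index.get(crit, []):
                if ad not in matched:
                    matched.append(ad)
        if matched:
            # Later groups are never processed, however few ads were found.
            return matched
    return []


# Hypothetical data: 60 ranked criteria, split into four groups of 15.
criteria = [f"crit{i}" for i in range(60)]
groups = chunk_criteria(criteria, 15)
ad_index = {"crit20": ["ad_x"], "crit40": ["ad_y", "ad_z"]}
ads = first_nonempty_batch(groups, ad_index)
```

In this hypothetical data the first group matches nothing, the second group matches a single ad, and processing stops there, so the ads reachable only through the third group are never retrieved.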
Although this approach has worked well, it has room for improvement. More specifically, opportunities to fill ad spots may be lost, criteria-to-document relevancy information can be ignored (or at least diluted), and multiple requests can lead to complexity and wasted resources. Each of these limitations is addressed below.
With regard to lost opportunities to fill ad spots, by stopping the process once one of the raw_queries returns at least one ad, there is a potential for unfilled ad spots. For example, suppose that a document has six (6) ad spots (and a requestor wants six (or more) ads). Suppose further that raw_query0 (ultimately) produces no ads, but raw_query1 (ultimately) produces one ad. The process is stopped at this point and five ad spots are left unfilled. Suppose that raw_query2 would have produced 20 ads. Since raw_query2 is never processed, only one ad is shown. This is a wasted opportunity to show relevant ads, which is a lost opportunity to generate revenue for the ad serving system.
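The arithmetic of this example can be checked with a short sketch. The per-query result lists and the ad names below are hypothetical stand-ins for the ads that each raw query would (ultimately) produce; only the counts (zero, one, and twenty ads, and six ad spots) come from the example above.

```python
from typing import List


def ads_from_first_hit(results_per_query: List[List[str]]) -> List[str]:
    """Return the ads of the first query that produced any ad at all."""
    for ads in results_per_query:
        if ads:
            return ads
    return []


AD_SPOTS = 6

# Hypothetical results: raw_query0 -> 0 ads, raw_query1 -> 1 ad,
# raw_query2 -> 20 ads (never reached under the stop-at-first-hit rule).
results = [[], ["ad_1"], [f"ad_{n}" for n in range(2, 22)]]

returned = ads_from_first_hit(results)
shown = returned[:AD_SPOTS]
unfilled = AD_SPOTS - len(shown)
never_considered = sum(len(r) for r in results) - len(returned)
```

Under these assumptions one ad is shown, five spots remain unfilled, and twenty candidate ads are never considered.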
With regard to ignoring or diluting criteria-to-document relevancy information, although the groups of queries are ordered such that an earlier processed group (e.g., raw_query0) has more relevant criteria than a potentially later processed group (e.g., raw_query1, raw_query2, etc.), within a group, all criteria are treated equally. Consider, for example, raw_query0. Since the criteria have been ranked by relevancy, crit1 may be much more relevant to the content of the document than crit9. However, suppose that raw_query0 returns two ads—ad A and ad B. Suppose further that ad A was returned because it had a targeting criteria that matched crit1, while ad B was returned because it had a targeting criteria that matched crit9. Thus, ad A is more relevant to the document content than ad B. Unfortunately, however, this fact is ignored in an arbitration in which ad A and ad B compete.
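The loss of relevancy information can be illustrated with a small sketch. The data layout (a mapping from each ad to its targeting criteria) and the helper names or_query and best_rank are assumptions made for illustration; best_rank is included only to show what information exists before the OR query discards it, and is not a method described in the referenced applications.

```python
from typing import Dict, List, Set


def or_query(group: List[str], ad_criteria: Dict[str, Set[str]]) -> List[str]:
    """Return every ad that matches at least one criterion in the group.

    The result is a flat list: which criterion matched is not recorded.
    """
    wanted = set(group)
    return sorted(ad for ad, crits in ad_criteria.items() if crits & wanted)


def best_rank(ad: str, ranked: List[str], ad_criteria: Dict[str, Set[str]]) -> int:
    """Rank (0 = most relevant) of the best criterion the ad matched."""
    return min(ranked.index(c) for c in ad_criteria[ad] if c in ranked)


# Hypothetical raw_query0: crit0 through crit15, ordered by relevancy.
ranked = [f"crit{i}" for i in range(16)]
ad_criteria = {"ad_A": {"crit1"}, "ad_B": {"crit9"}}

returned = or_query(ranked, ad_criteria)
ranks = {ad: best_rank(ad, ranked, ad_criteria) for ad in returned}
```

Here the OR query returns ad A and ad B as equals, although the ranks show that ad A matched a far more relevant criterion (rank 1) than ad B (rank 9).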
Finally, with regard to complexity, processing multiple requests (e.g., a second and perhaps even a third request) leads to complexity and extra load on processing and communications resources, particularly in a distributed environment.
Accordingly, given a set of ordered criteria, it would be useful to improve an ad server to generate more ads, to generate more relevant ads, and/or to reduce the load on processing, communication and/or storage resources.