An advertiser, such as Ford® or McDonald's®, generally contracts a creative agency for ads to be placed in various media for the advertiser's products. Such media may include TV, radio, Internet ads (e.g., banner display ads, textual ads, streaming ads, mobile phone ads) or print media ads (e.g., ads in newspapers, magazines and posters). The advertiser may engage one or more creative agencies that specialize in creating ads for one or more of the above media. An advertiser wants to show the most relevant ads to end users in order to get the most value from its ad campaign.
A company like Yahoo!® gathers enormous amounts of data related to IP (Internet Protocol) addresses of end user computers. For example, the company may gather event data, including data related to end user behavior on the Internet. Such behavior may include, for example, clicks on ads. The company sees IP addresses from which the company can usually infer zip codes and even street-level data. The company sees login information and sees the pages that end users visit. The company may infer age, gender, income and other demographic information from analyzing the pages an end user visits even if the end user never does a search. The company may also gather valuable search data when end users perform search queries. All of this data is highly valuable to any company that advertises because the data may help the company advertise in the most effective way.
The search advertising marketplace generates billions of dollars in revenue each year for a search engine company like Yahoo!®. The search advertising marketplace works on a cost-per-click (CPC) model. When an end user performs a search query online and clicks on a sponsored search text ad, a company like Yahoo!® is paid by the respective consumer (e.g., advertiser). End users tend to click on more relevant ads.
A consumer (e.g., advertiser) that utilizes data from a search engine wants to show the most relevant ads to end users in order to get more clicks on the consumer's ads. In order to do this, the consumer needs to gather end users' events, such as search behavior, click behavior and other browsing behavior. The consumer may then use these events to target relevant ads to different end users.
In the CPC model, there are two important events that go through a data pipeline—search events and click events. Search events occur when an end user performs a search query. Click events occur when an end user clicks on a sponsored text ad. Web servers of a company like Yahoo!® collect search events when an end user performs a query on the company's search page. URLs of the ads on the search result webpage may contain the click event information. A consumer (e.g., advertiser) may want to collect and analyze the search and click events in order to build a model for query-to-text ad relevance. If the consumer can learn which ads are more relevant, then the consumer can target these ads to end users and get a higher click-through rate (CTR).
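As a minimal sketch of how search and click events might be combined into a click-through rate, the following example counts, for each query and ad pair, how often the ad was shown and how often it was clicked. The record types `SearchEvent` and `ClickEvent`, their fields, and the function `click_through_rates` are hypothetical names introduced only for illustration; the actual event formats and any relevance model are not specified here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchEvent:
    # Hypothetical record: one query plus the ads shown on its result page.
    search_id: str
    query: str
    ads_shown: tuple


@dataclass(frozen=True)
class ClickEvent:
    # Hypothetical record: a click on one ad, tied back to its search.
    search_id: str
    ad_id: str


def click_through_rates(searches, clicks):
    """Return {(query, ad_id): clicks / impressions} for every ad shown."""
    query_by_search = {s.search_id: s.query for s in searches}
    impressions = defaultdict(int)
    click_counts = defaultdict(int)

    # Each ad listed on a result page counts as one impression for that query.
    for s in searches:
        for ad_id in s.ads_shown:
            impressions[(s.query, ad_id)] += 1

    # Attribute each click to the query of the search it came from.
    for c in clicks:
        query = query_by_search.get(c.search_id)
        if query is None:
            continue  # click with no matching search event; skip it
        click_counts[(query, c.ad_id)] += 1

    return {key: click_counts[key] / n for key, n in impressions.items()}


searches = [
    SearchEvent("s1", "pickup truck", ("ad_a", "ad_b")),
    SearchEvent("s2", "pickup truck", ("ad_a",)),
]
clicks = [ClickEvent("s1", "ad_a")]
rates = click_through_rates(searches, clicks)
```

In this toy input, the ad `ad_a` is shown twice for the query and clicked once, while `ad_b` is shown once and never clicked, so a consumer comparing the two rates would favor `ad_a` for that query.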
The amount of data gathered by a search engine company, such as Yahoo!®, is tremendous. The amount of data is typically on the order of petabytes per day. Unfortunately, conventional systems for providing events to consumers (e.g., advertisers) are inefficient.