This specification relates to data processing and generating targeting data.
The Internet provides access to a wide variety of resources. For example, video and/or audio files, as well as web pages for particular subjects or particular news articles, are accessible over the Internet. Access to these resources presents opportunities for distribution of targeted content items (e.g., targeted advertisements) with the resources. For example, a web page can include content item slots in which content items, such as advertisements, can be presented. A content item slot is a portion of a resource in which content items are selectively presented. For example, a resource can include a code snippet that includes instructions for obtaining a content item for presentation in a specified portion of the resource. Content item slots can be defined in the resource itself or defined for presentation with a resource, for example, in a separate browser window.
The content items that are selected for presentation in the content item slots can be provided by content providers (e.g., advertising providers) and/or selected through an auction. For example, content providers can provide bids specifying amounts that the content providers are respectively willing to pay for presentation of their content. In turn, an auction can be performed, and content items can be selected for presentation based, at least in part, on the bids, the relevance of the respective content items to resource content that is presented on the resource in which the content item slot is defined, and/or the relevance of the respective content items to information that is included in a request for content items.
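As a non-limiting illustration, an auction-based selection of this kind can be sketched as follows. The sketch assumes that each candidate content item carries a bid and a relevance value between 0 and 1, and that candidates are ranked by the product of the two. The product rule and the names Candidate and select_content_item are hypothetical and are used only for illustration; other combinations of bid and relevance can be used.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Candidate:
    """Hypothetical record for one content item entered in the auction."""
    item_id: str
    bid: float        # amount the content provider is willing to pay
    relevance: float  # assumed relevance score in the range [0, 1]


def select_content_item(candidates: Sequence[Candidate]) -> Optional[Candidate]:
    """Return the candidate with the highest bid * relevance score.

    The scoring rule is an assumption made for this sketch; it is only one
    way of selecting based at least in part on bids and relevance.
    """
    if not candidates:
        return None
    return max(candidates, key=lambda c: c.bid * c.relevance)


candidates = [
    Candidate("item_a", bid=2.0, relevance=0.3),  # score 0.6
    Candidate("item_b", bid=1.0, relevance=0.9),  # score 0.9
]
winner = select_content_item(candidates)
```

In this example the lower bid is selected because its higher relevance outweighs the difference in bid amounts.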
The relevance of the respective content items relative to resource content and/or the information that is included in the request can be based, for example, on a measure of similarity (e.g., a cosine similarity measure or an outcome of a clustering technique) between the content item and the resource content or request information. Some content providers target distribution of their content items using targeting criteria, such as targeting keywords, to increase the likelihood that the content item will be presented with content to which the content item is relevant. For example, a search query that is received from a user device can be required to have at least a minimum specified similarity to the targeting keyword for a content item in order for the content item to be eligible for distribution with a responsive search results page. However, it can be difficult for a content provider to anticipate every search query for which the content provider wants its content item to be eligible for distribution.
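As a non-limiting illustration, the minimum-similarity eligibility check can be sketched as follows. The sketch assumes that the search query and the targeting keyword are each represented as a bag-of-words term-count vector and compared using cosine similarity. The function names and the threshold value of 0.5 are hypothetical and are chosen only for illustration.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between bag-of-words term-count vectors."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    # Dot product over the terms that the two texts share.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def is_eligible(query: str, keyword: str, min_similarity: float = 0.5) -> bool:
    """True if the query meets the assumed minimum similarity to the keyword."""
    return cosine_similarity(query, keyword) >= min_similarity


keyword = "running shoes"
eligible_close = is_eligible("cheap running shoes", keyword)
eligible_synonym = is_eligible("trail sneakers", keyword)
```

Under these assumptions, the query "cheap running shoes" shares two terms with the targeting keyword and meets the threshold, whereas "trail sneakers" shares no terms and does not, even though a content provider may want its content item to be eligible for both queries.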