This specification describes technologies relating to data processing and model development.
The Internet provides access to a wide variety of resources. For example, video and/or audio files, as well as search results pages and web pages for particular subjects or particular news articles, are accessible over the Internet. Access to these resources presents opportunities for advertisements to be provided with the resources. For example, a web page can include advertisement slots in which advertisements can be presented. These advertisement slots can be defined in the web page or defined for presentation with a web page, for example, in a pop-up window.
When a web page (or another resource) is requested by a user, an advertisement request is generated and transmitted to an advertisement management system that selects advertisements for presentation in the advertisement slots. The advertisement management system selects advertisements, for example, based on characteristics of the web page with which the advertisements will be presented, demographic information about the user to whom the advertisements will be presented, and/or other information about the environment in which the advertisements will be presented.
The advertisements that are provided in response to an advertisement request can be required (e.g., according to terms of use) to comply with a set of advertising guidelines. These advertising guidelines may specify, for example, content that can be included in advertisements and/or content that cannot be included in the advertisements. An example advertising guideline may specify that an advertisement cannot include misleading or inaccurate claims. For example, an advertisement that claims that a user can make $500,000 a year by simply sending the advertiser $50 is likely to be in violation of the advertising guidelines.
Generally, advertising effectiveness and/or user satisfaction increases when the quantity of violating advertisements (i.e., advertisements that violate one or more of the advertising guidelines) is limited. Classification models can be used to identify violating advertisements, for example, based on characteristics of the advertisement and/or the resource (e.g., web page) to which users are redirected following interaction with the advertisement. Manually creating these models can be difficult and time-consuming.
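By way of illustration only, a manually created classification model of the kind referred to above might resemble the following minimal sketch. The `Advertisement` structure, the two regular-expression rules, and the `is_violating` function are hypothetical names introduced for this example and are not taken from this specification; an actual model may rely on different characteristics entirely.

```python
import re
from dataclasses import dataclass


@dataclass
class Advertisement:
    """Hypothetical container for the characteristics a model might inspect."""
    text: str               # text of the advertisement itself
    landing_page_text: str  # text of the resource reached after interaction


# Hand-written rules (assumptions for illustration, not an actual guideline set).
# Matches an amount of money tied to a time period, e.g. "$500,000 a year".
INCOME_CLAIM = re.compile(
    r"\$\s?\d[\d,]*\s*(?:a|per)\s+(?:year|month|week|day)", re.IGNORECASE
)
# Matches a request to send money, e.g. "sending the advertiser $50".
PAYMENT_REQUEST = re.compile(r"\bsend(?:ing)?\b.*\$\s?\d", re.IGNORECASE)


def is_violating(ad: Advertisement) -> bool:
    """Flag an ad only when both hand-written rules match its combined text."""
    combined = f"{ad.text} {ad.landing_page_text}"
    return bool(INCOME_CLAIM.search(combined) and PAYMENT_REQUEST.search(combined))


suspicious = Advertisement(
    text="Make $500,000 a year by simply sending us $50!",
    landing_page_text="",
)
ordinary = Advertisement(
    text="Fresh coffee delivered weekly for $12 a month.",
    landing_page_text="",
)
print(is_violating(suspicious), is_violating(ordinary))
```

Each rule in such a sketch must be written and maintained by hand, and a small change in wording can evade it, which is one reason manual creation of these models can be difficult and time-consuming.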