1. Technical Field
The present disclosure generally relates to the field of automated monitoring technologies and techniques. More particularly, and without limitation, the disclosure relates to computer-implemented methods and systems for scanning web sites and/or parsing web content, including for testing online opt-out systems and/or online content.
2. Background
As greater numbers of people use the World Wide Web for communication, commerce, and other daily activities, they generate larger and larger volumes of traffic over the Internet. Because the benefits of commercializing the Internet can be tremendous, businesses increasingly take advantage of this traffic by advertising their products or services online. These advertisements may appear in the form of leased advertising space (e.g., “banners”) on content websites, which are operated by “publishers” or “advertising networks” who control the website content and the availability and cost of the advertising space or “ad inventory.”
In most cases, there is a need to keep track of the number of ad impressions, and/or the quantity of click and conversion events related to advertisements. A tracking system is, therefore, necessary to keep track of ad-related events. An advertiser may also have a marketing plan that identifies certain types of people as being target audience members for a given product or service. For example, the advertiser may wish to spend money only on users having certain demographics or personal interests. Alternatively, advertisers may be unsure of which people are most likely to respond to a given product, service, or advertisement. Therefore, advertisers may wish to obtain very specific information about the types of consumers viewing various types of web sites and responding to their advertisements. In some cases, advertisers may be willing to spend more money per impression, click, or conversion based on known information about those users interacting with the advertisements. As a result, publishers of content websites and/or facilitators of third party advertising networks (i.e., owners or operators of “advertising systems”) may wish to obtain as much information as possible about consumers and other users browsing between web pages associated with an advertising network.
In order to implement tracking systems and/or obtain information for providing targeted ads, advertising systems may utilize tracking or browser cookies. In general, cookies comprise small sets of data that can be stored on users' computers by web browsers. Cookies may be implemented using so-called HTTP or web cookies, as well as other known types of cookies such as Flash cookies, “Evercookies,” “Browser Fingerprinting,” etc. When a user's computer visits a website or sends a request for a file such as a banner ad, an advertising system may transmit instructions to the user's web browser to store a cookie. The cookie is stored for a specified time and returned whenever the user's computer makes a subsequent visit to the website or another website on the same advertising network or system. The cookie may include one or more name-value pairs that specify information, such as a user's preferences or browsing history. This information may allow the advertising system to track ad-related events and/or display targeted ads to the user based on collected information.
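By way of illustration only, the storing and returning of a cookie described above may be sketched as follows. This is a minimal sketch using Python's standard `http.cookies` module; the cookie name `uid`, the value `abc123`, the domain `.ads.example`, and both function names are hypothetical and are not taken from any particular advertising system.

```python
from http.cookies import SimpleCookie


def build_set_cookie_header(name, value, max_age_seconds, domain):
    """Build the value of a Set-Cookie header that asks a browser to store
    one name-value pair for a specified time (hypothetical helper)."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["max-age"] = max_age_seconds  # lifetime of the cookie in seconds
    cookie[name]["domain"] = domain            # hosts the cookie is returned to
    cookie[name]["path"] = "/"
    return cookie[name].OutputString()


def parse_cookie_header(header_value):
    """Parse the Cookie header a browser returns on a subsequent request
    into a dictionary of name-value pairs (hypothetical helper)."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {key: morsel.value for key, morsel in cookie.items()}


if __name__ == "__main__":
    # Instruction sent with the first response (30-day lifetime, assumed value).
    print(build_set_cookie_header("uid", "abc123", 30 * 24 * 3600, ".ads.example"))
    # Name-value pairs returned by the browser on a later visit.
    print(parse_cookie_header("uid=abc123; interest=sports"))
```

In this sketch, the first function corresponds to the instruction transmitted to the web browser, and the second corresponds to reading the name-value pairs that the browser returns on a subsequent visit.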
Cookies may also be used by online merchants and web service providers to support complex interactions and provide a better browsing experience for Internet users. For example, cookies may be used to facilitate a user log-in or provide information on the contents of an electronic shopping cart. Cookies can also be used to store registration information or secure information. As with cookies used by advertising systems, cookies used by online merchants and web service providers can be stored for a specified time and returned during subsequent requests by a user's web browser.
For privacy or other reasons, an individual who does not want to receive targeted or behavioral advertising may elect to “opt out.” In general, opting out is a process by which a user may avoid receiving, for example, targeted advertisements, including those initiated by information collected from cookies or other data sources. Users may enroll in an opt-out system through various means, including a privacy policy page or an opt-out page of a website. Some websites, such as http://www.networkadvertising.org, enable users to selectively opt out of targeted ad programs for one or more advertising networks or systems.
In some cases, opt-out system failures may occur that improperly prevent users from being opted out. The owners or operators of the system that provides cookies may not be aware of opt-out system failures until they are alerted by a user or a third-party group. Efforts to manually ensure that an opt-out system operates properly may be unrealistic due to the large number of contingencies and possible failures that need to be checked on a continuous basis. Such efforts may require an enormous amount of resources. Furthermore, such efforts may not allow the owners or operators of an advertising system to respond effectively if a failure were to be discovered.
In view of the foregoing, there is a need for automated methods and systems that can accurately test online opt-out systems. There is also a need for such methods and systems, where the testing of an online opt-out system can be conducted on a continuous basis for one or more systems and for a wide variety of contingencies and possible failures. Moreover, there is a need for improved methods and systems for testing online opt-out systems that provide reporting features that enable owners or operators of advertising systems to respond effectively when a failure is detected.
The herein disclosed embodiments are directed to achieving one or more of the above-referenced goals, by providing methods and systems for analyzing and testing online opt-out systems.