This disclosure relates to processing network traffic statistics.
A publisher is an entity that owns and/or manages a web site. Using analytical services offered by third parties, the publisher can monitor analytical data related to user visits and links to the web site. Example analytical data includes data related to domains and/or web sites from which visitors arrived and to which the visitors departed; traffic patterns, e.g., navigation clicks of visitors within the publisher's web site; visitor actions, e.g., purchases and filling out of forms; and other actions that a visitor may take in relation to the publisher's web site. The analysis of such analytical data can inform the publisher of how the visitors were referred to the publisher's web site, whether an advertising campaign resulted in the referral, and how the visitors interacted with the publisher's web site. With this understanding, the publisher can implement changes to increase revenue generation and/or improve the visitor experience. For example, a publisher can focus marketing resources on advertising campaigns, review referrals from other web sites, identify other publishers as potential partners for cross-linking, and so on.
One example of an analytical system that provides tools for collecting and analyzing such analytical data is Google™ Analytics, available from Google, Inc., in Mountain View, Calif. To use such systems, a publisher typically provides tracking request code embedded in its web pages. Typically, the tracking request code is a snippet of JavaScript™ code that the publisher adds to every page of its web site for which traffic is to be tracked. When the page is requested by a user device, the tracking request code determines if the tracking code is stored in a browser cache on the user device. If the tracking code is not stored in the browser cache, the tracking request code requests and downloads the tracking code from an analytics server. The tracking code is then stored in the browser cache on the user device and executed.
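The cache check described above can be sketched as follows. This is a minimal illustration rather than the actual snippet used by any analytics service: the browser cache is modeled as a plain dictionary, and the names `load_tracking_code`, `fake_fetch`, and `browser_cache` are hypothetical.

```python
from typing import Callable, Dict, List

def load_tracking_code(cache: Dict[str, str],
                       fetch: Callable[[], str],
                       key: str = "tracking_code") -> str:
    """Return the tracking code, downloading it only when it is not cached.

    `cache` stands in for the browser cache and `fetch` for the request to
    the analytics server; both are assumptions made for this sketch.
    """
    if key not in cache:
        # Not cached yet: request and download the code, then store it.
        cache[key] = fetch()
    return cache[key]

# Hypothetical stand-in for the analytics server download.
downloads: List[int] = []

def fake_fetch() -> str:
    downloads.append(1)  # record that a download happened
    return "/* tracking code */"

browser_cache: Dict[str, str] = {}
first = load_tracking_code(browser_cache, fake_fetch)   # triggers a download
second = load_tracking_code(browser_cache, fake_fetch)  # served from the cache
```

In this sketch the second page request reuses the cached copy, so the server is contacted only once.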
The tracking code collects visitor data and sends it back to the analytics server in the form of a tracking data communication for processing. The tracking data communication includes an account identifier that identifies an analytics account of the publisher, a visitor identifier that identifies the visitor, and event statistics, such as whether the visitor has been to the web site before, the timestamp of the current visit, referrer data identifying the referrer site, campaign data identifying the advertising campaign the visitor came from, and other event statistics.
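As an illustration of the fields listed above, a tracking data communication might be modeled as shown below. The field names, the example values, and the query-string encoding are assumptions made for this sketch; the text does not specify a wire format.

```python
from dataclasses import dataclass, asdict
from urllib.parse import urlencode

@dataclass
class TrackingDataCommunication:
    # All field names are hypothetical; they mirror the items named in the text.
    account_id: str          # identifies the publisher's analytics account
    visitor_id: str          # identifies the visitor
    returning_visitor: bool  # whether the visitor has been to the site before
    visit_timestamp: int     # timestamp of the current visit (epoch seconds)
    referrer: str            # referrer data identifying the referrer site
    campaign: str            # campaign data identifying the ad campaign

    def to_query_string(self) -> str:
        """Encode the communication as URL query parameters (an assumed format)."""
        return urlencode(asdict(self))

hit = TrackingDataCommunication(
    account_id="ACCT-1",
    visitor_id="v-42",
    returning_visitor=False,
    visit_timestamp=1700000000,
    referrer="example.org",
    campaign="spring_sale",
)
encoded = hit.to_query_string()
```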
In addition to providing tracking data communications to the analytics server, the tracking code sets one or more corresponding cookies in the visitor's browser. The cookies are used to store information related to the tracking data communications, such as the number of times the visitor has been to the web site, the time of the current visit, referrer data, and campaign data.
While the use of cookies works well for tracking information for a property such as a web site, reliance on cookies can, in some situations, result in ambiguous event statistics. Ambiguous event statistics are event statistics that do not quantify the actual states of events, and are caused by the coupling of event statistics for sub-properties within a property, or by stateless event statistics.
In the context of a web site, a property and a sub-property are any two resource environments that share a common cookie. For example, a property can be a web site, and a sub-property can be one or more web pages within the web site. Many properties include sub-properties that are sponsored, maintained or hosted in part by entities other than the web site publisher. For example, a video sharing web site may include sub-properties that are sponsored by third parties. Examples of such sub-properties include pages for particular networks, pages for news organizations, and pages for particular companies, to name just a few. These sub-properties are usually located on web pages hosted by a web server of the property. For example, the web site YouTube includes brand channels for many networks, film distributors, and news organizations, and each brand channel is maintained, in part, by its corresponding sponsor.
Often these third parties desire to monitor the analytical data for their respective pages. However, as cookies are persisted on a domain basis, event statistics for several sub-properties can be coupled. Coupled event statistics occur when event statistics for a sub-property hosted at a web site property are coupled, e.g., aggregated at the client side, with event statistics for another sub-property hosted at that web site. For example, a user device may request, in succession, pages for three different sub-properties hosted at a particular property. With each page request, the cookies that are persisted for the property domain are updated relative to previous page requests. As a result, for the third page request, the event statistics indicate that the visitor has visited the page three times, as three page requests for that domain have been generated. However, with respect to each sub-property, the visitor has visited each page only once.
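The three-request example above can be reproduced with a small simulation. This is a sketch under stated assumptions: the domain-level cookie is modeled as a dictionary holding a single counter, and the sub-property names and the `simulate` function are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

def simulate(requests: List[str]) -> Tuple[List[Tuple[str, int]], Counter]:
    """Replay page requests against one cookie shared by the whole domain.

    Returns the visit count reported with each request (read from the shared
    cookie) and the actual number of visits to each sub-property.
    """
    domain_cookie: Dict[str, int] = {"visit_count": 0}  # one cookie per domain
    actual_visits: Counter = Counter()
    reported: List[Tuple[str, int]] = []
    for sub_property in requests:
        # The shared cookie is updated on every request to the domain,
        # regardless of which sub-property was requested.
        domain_cookie["visit_count"] += 1
        actual_visits[sub_property] += 1
        reported.append((sub_property, domain_cookie["visit_count"]))
    return reported, actual_visits

reported, actual = simulate(["channel_a", "channel_b", "channel_c"])
```

Here the third request is reported with a visit count of 3, while the per-sub-property tally shows a single visit to each of the three pages.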
In the context of a web page, a property can be the web page and the sub-property can be an application embedded within the web page. For example, a web page may include embedded applications, or “gadgets,” for which an operating environment is rendered in the web page. Such gadgets can include stock reporting applications, weather reporting applications, and e-mail reporting applications. These gadgets may be hosted by other web sites. However, most web browsers prohibit cross-domain cookies due to security and privacy concerns. Thus, embedded content such as gadgets and rich media advertisements may also have inaccurate tracking data due to the coupling of event statistics.
Another example of ambiguous event statistics is stateless event statistics. Stateless event statistics are event statistics that are independent of prior event statistics in prior tracking data communications. Stateless event statistics often occur in cookieless environments. A cookieless environment is an environment in which cookies cannot be persisted or used, such as when a user disables the use of cookies in a browser, or as may be the case on many mobile devices. In such environments, the event statistics included in the tracking data do not reflect the occurrence of earlier events. For example, the event statistics may specify that each page request for a visitor is the first request for that visitor, as there is no client-side storage of prior event statistics.
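The effect of a cookieless environment can be illustrated with a short simulation. It is only a sketch: client-side storage is modeled as a dictionary, and the function `report_visits` and the report fields `visit_count` and `first_visit` are hypothetical names.

```python
from typing import Dict, List

def report_visits(page_requests: int, cookies_enabled: bool) -> List[Dict[str, object]]:
    """Return the event statistics reported for each page request.

    With cookies enabled, the prior count is read from client-side storage.
    Without cookies, no prior state is available, so every request looks
    like the visitor's first.
    """
    stored: Dict[str, int] = {}  # stands in for client-side cookie storage
    reports: List[Dict[str, object]] = []
    for _ in range(page_requests):
        prior = stored.get("visit_count", 0) if cookies_enabled else 0
        current = prior + 1
        if cookies_enabled:
            stored["visit_count"] = current  # persist state for the next request
        reports.append({"visit_count": current, "first_visit": prior == 0})
    return reports

with_cookies = report_visits(3, cookies_enabled=True)
cookieless = report_visits(3, cookies_enabled=False)
```

With cookies enabled, the reported counts increase from 1 to 3. In the cookieless case, each of the three requests is reported as a first visit with a count of 1.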