1. Technical Field
The present invention relates generally to data collection in distributed networks.
2. Brief Description of the Related Art
Distributed computer systems are well known in the prior art. One such distributed computer system is a “content delivery network” or “CDN” that is operated and managed by a service provider. The service provider typically provides the service on behalf of third parties. A “distributed system” of this type typically refers to a collection of autonomous computers linked by a network or networks, together with the software, systems, protocols and techniques designed to facilitate various services, such as content delivery or the support of outsourced site infrastructure. Typically, “content delivery” means the storage, caching, or transmission of content, streaming media and applications on behalf of content providers, together with ancillary technologies used therewith including, without limitation, DNS request handling, provisioning, data monitoring and reporting, content targeting, personalization, and business intelligence. The term “outsourced site infrastructure” means the distributed systems and associated technologies that enable an entity to operate and/or manage a third party's Web site infrastructure, in whole or in part, on the third party's behalf.
Web servers deliver web-based content to Web browsers over the protocol known as HTTP. Because HTTP is a stateless protocol, a known HTTP protocol extension enables a Web server to provide state information to a requesting end user Web browser. In particular, a Web server may include in its reply a header that instructs the client to remember a small piece of state information (a “cookie”), and to include a copy of that information in future requests to the server. In this way, the Web server can track whether it has seen the client browser previously, and this tracking information can be used to build a browser-specific profile that may then be used to inform some other control function, e.g., what type of advertisement to serve within a web page that will be delivered to the browser. According to convention and practice, Web servers set cookies with values only within their own domain, which ensures that cookies are sent back only to the web domain from which they came. This convention notwithstanding, there have been efforts to share cookies across content domains so that content preferences and interests associated with the individual using the Web browser can be identified. Thus, for example, in U.S. Pat. No. 6,073,241, a set of cooperating servers share cookie information via a shared database. In U.S. Patent Application No. 20020007317, client state information is placed in one or more cookies that are then shared across disjoint domains in a virtual shopping mall environment. In that approach, the servers are non-cooperating, and an intermediary application is used to add state information to client requests and responses.
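By way of illustration only, the cookie exchange and same-domain convention described above can be sketched as follows. This is a minimal sketch in Python under stated assumptions, not any particular server's or browser's implementation: the cookie name visitor_id, the helper functions, and the BrowserJar class are hypothetical, and the domain match is deliberately simplified relative to real browser behavior.

```python
from http.cookies import SimpleCookie


def build_set_cookie(visitor_id: str, domain: str) -> str:
    """Server side: build the value of a Set-Cookie reply header.

    The cookie name 'visitor_id' is an assumption made for this sketch.
    """
    cookie = SimpleCookie()
    cookie["visitor_id"] = visitor_id
    cookie["visitor_id"]["domain"] = domain
    cookie["visitor_id"]["path"] = "/"
    return cookie["visitor_id"].OutputString()


def domain_matches(request_host: str, cookie_domain: str) -> bool:
    """Simplified domain match: exact host or a subdomain of the cookie domain."""
    host = request_host.lower()
    dom = cookie_domain.lower().lstrip(".")
    return host == dom or host.endswith("." + dom)


class BrowserJar:
    """Hypothetical, greatly simplified stand-in for a browser's cookie store."""

    def __init__(self) -> None:
        self._cookies = []  # list of (domain, name, value) tuples

    def store(self, set_cookie_value: str) -> None:
        """Remember the state information the server asked the client to keep."""
        parsed = SimpleCookie()
        parsed.load(set_cookie_value)
        for name, morsel in parsed.items():
            self._cookies.append((morsel["domain"], name, morsel.value))

    def cookie_header_for(self, host: str) -> str:
        """Return the Cookie request header value for a host ('' if none apply)."""
        pairs = [
            f"{name}={value}"
            for domain, name, value in self._cookies
            if domain_matches(host, domain)
        ]
        return "; ".join(pairs)


def has_seen_before(cookie_header: str) -> bool:
    """Server side: a returning browser presents the cookie it was given."""
    parsed = SimpleCookie()
    parsed.load(cookie_header)
    return "visitor_id" in parsed


if __name__ == "__main__":
    jar = BrowserJar()
    jar.store(build_set_cookie("abc123", "example.com"))
    # The cookie is returned to its own domain but not to an unrelated one.
    print(has_seen_before(jar.cookie_header_for("www.example.com")))
    print(has_seen_before(jar.cookie_header_for("other.test")))
```

In this sketch the cookie set for example.com is presented on later requests to www.example.com, while a request to an unrelated domain carries no cookie, which is the behavior the convention described above is intended to ensure.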
It is also known that ad serving companies can and do collect and correlate cookie data reflecting that a given Web browser has visited unaffiliated sites on which the company's ads have been served. The ad serving company can then use this data to build an end user profile.
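The correlation described above can be illustrated with a short sketch. It assumes, purely for illustration, that each ad request reaches the ad server with the ad server's own cookie identifier and a Referer value naming the page on which the ad appeared; the source does not state how the visited site is identified, and the function name and sample data below are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def build_profiles(ad_requests):
    """Group the sites on which ads were served by the browser's cookie id.

    ad_requests: iterable of (cookie_id, referer_url) pairs. Both the pair
    layout and the use of the Referer value are assumptions of this sketch.
    """
    profiles = defaultdict(set)
    for cookie_id, referer_url in ad_requests:
        site = urlparse(referer_url).hostname
        if site:  # skip requests whose referring page cannot be determined
            profiles[cookie_id].add(site)
    return dict(profiles)


if __name__ == "__main__":
    # Hypothetical ad requests observed by a single ad server.
    requests_seen = [
        ("abc123", "https://news.example/article/1"),
        ("abc123", "https://shop.example/cart"),
        ("xyz789", "https://news.example/article/2"),
    ]
    for cookie_id, sites in build_profiles(requests_seen).items():
        print(cookie_id, sorted(sites))
```

Because the same cookie identifier accompanies ad requests that originate from pages on otherwise unaffiliated sites, the ad server can associate those sites with a single browser.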