The World Wide Web (WWW) of computers is a large collection of computers operated under a client-server computer network model. In a client-server computer network, a client computer requests information from a server computer. In response to the request, the server computer passes the requested information to the client computer. Server computers are typically operated by large information providers, such as commercial organizations, governmental units, and universities. Client computers are typically operated by individuals.
To ensure interoperability in a client-server computer network, various protocols are observed. For example, a protocol known as the Hypertext Transfer Protocol (HTTP) is used to move information across the WWW. In addition, the WWW observes a standard, known as Hypertext Markup Language (HTML), for organizing and presenting information.
The HTTP and HTML are heavily relied upon to distribute information such as news, product reviews, and literature. This information is typically free to the user, while its cost is underwritten by advertising. As with all advertiser-supported media, it is important to learn as much as possible about the customer. For example, advertisers like to know audience demographics (e.g., average age, gender distribution, etc.) to more accurately choose appropriate advertisements, while editors like to know audience preferences (e.g., favorite kinds of stories, most-read sections) to create more appealing content. This type of information is widely used in today's mass media, including TV, radio, magazines, and newspapers. However, the WWW has the potential for a finer-grained collection of information about customers because point-to-point connections are established between a client computer and a server computer.
There are several existing mechanisms in the HTTP to support the collection of customer information. One mechanism is the voluntary registration process, which involves a customer providing personal information in exchange for access to otherwise restricted media content. Another mechanism is passive tracking, which provides information on the requests made by a customer accessing a web site. This information includes such things as pages visited, data entered, and links clicked. After a voluntary registration process has been completed, HTTP authentication is used for subsequent visits to the same site. In a subsequent visit, the customer avoids the registration process by entering a user name and password. Once the user enters this information, the server computer recognizes the customer. This mechanism supports the collection of both active demographics and passive tracking.
Although authentication provides rich functionality, its use has fallen into great disfavor since users do not want to remember a user name and password for each site requiring authentication. Site managers recognize that the use of authentication discourages users from visiting a site. Thus, it would be highly desirable to provide a mechanism for avoiding authentication operations, while still preserving access to registration data.
A shortcoming also exists in the operation of passive tracking. Passive tracking is typically performed through a mechanism formally referred to as persistent client-side state, and informally referred to as "cookies". Persistent client-side state allows a site (server computer) to store and retrieve information within the web browser that a client computer uses to access the site. The information is effectively un-interpreted by the browser and thus can be used by the server computer for any purpose.
Appropriate use of cookies allows the collection of passive tracking information. In particular, a server stores a unique value in each browser's cookie and makes a corresponding entry in a database for that value. This allows the site, via the embedded cookie value, to associate persistent information with that person. In addition, the server may log the cookie associated with each request. This will allow the association of requests with a person.
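The bookkeeping described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of any particular server: the class name TrackingServer, the cookie name "visitor_id", and the in-memory dictionaries standing in for a database are all hypothetical.

```python
import uuid

COOKIE_NAME = "visitor_id"  # hypothetical cookie name, not from the source


class TrackingServer:
    """Toy server that tags each browser with a unique cookie value."""

    def __init__(self):
        self.profiles = {}     # cookie value -> persistent information (stand-in for a database)
        self.request_log = []  # (cookie value, requested path), one entry per request

    def handle(self, path, cookies):
        """Handle one request and return (visitor_id, response_headers)."""
        headers = {}
        visitor_id = cookies.get(COOKIE_NAME)
        if visitor_id is None or visitor_id not in self.profiles:
            # First visit: mint a unique value, record it, and ask the
            # browser to store it.
            visitor_id = uuid.uuid4().hex
            self.profiles[visitor_id] = {"pages": []}
            headers["Set-Cookie"] = f"{COOKIE_NAME}={visitor_id}"
        # Associate this request with the person behind the cookie value.
        self.profiles[visitor_id]["pages"].append(path)
        self.request_log.append((visitor_id, path))
        return visitor_id, headers
```

On the first request the browser presents no cookie, so the server issues one; on later requests the browser returns the same value and the server appends to the existing record instead of creating a new one.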
Cookies are a general, powerful mechanism for tracking user activity. It is rather straightforward to use cookies to track a user within a single site. Such tracking is useful. However, it would be even more useful to be able to track a user across multiple web sites. This would allow the collection and correlation of additional customer information. Unfortunately, a security feature associated with cookies prevents the tracking of a user across web sites with distinct domain names.
The full cookie specification is described on the WWW at "http://home.netscape.com/newsref/std/cookie_spec.html". The relevant aspects of the cookie specification as they relate to the present invention are described below.
When a server computer returns an HTTP object to a client it may include a piece of state information that the client computer can store. Included in that state object is a description of the range of URLs for which that state is valid. A URL, or Uniform Resource Locator, specifies a computer and a file. A typical URL is http://SU/123. This URL is an instruction to retrieve the file 123 from the State University computer "SU" using the HTTP. A URL may also be used to invoke a specified function on a remote computer, with the remote computer returning the results of the invoked and executed function.
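As a small illustration of how the example URL breaks down into a protocol, a computer, and a file, the sketch below uses the Python standard library function urllib.parse.urlsplit. The helper name describe_url is hypothetical.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Return the (protocol, computer, file) parts of a URL."""
    parts = urlsplit(url)
    # netloc names the computer; path names the file on that computer.
    return parts.scheme, parts.netloc, parts.path.lstrip("/")


protocol, computer, filename = describe_url("http://SU/123")
```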
The state information that is passed to the client computer typically observes the following syntax: Set-Cookie: NAME=VALUE; domain=DOMAIN_NAME. The term VALUE is a sequence of characters that is typically used to specify a user identification value. The DOMAIN_NAME specifies the set of domains over which the state information can be accessed. When the browser searches the cookie list for valid cookies, a comparison of the domain attributes of the cookie is made with the internet domain name of the host from which the URL will be fetched. If there is a tail match, then the cookie will go through path matching to see if it should be sent. "Tail matching" means that a domain attribute is matched against the tail of the fully qualified domain name of the host. For example, a domain attribute of "acme.com" would match host names "anvil.acme.com" as well as "shipping.crate.acme.com".
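The Set-Cookie syntax and the tail-matching rule described above can be sketched as follows. The function names parse_set_cookie and tail_match and the sample cookie "uid=abc123" are hypothetical. One assumption goes beyond the text: the match is required to fall on a label boundary, so that "acme.com" does not match "notacme.com"; a plain suffix comparison would accept that host.

```python
def parse_set_cookie(header):
    """Split 'Set-Cookie: NAME=VALUE; domain=DOMAIN_NAME' into its parts."""
    _, _, body = header.partition(":")
    parts = [p.strip() for p in body.split(";") if p.strip()]
    name, _, value = parts[0].partition("=")
    attrs = {}
    for part in parts[1:]:
        key, _, val = part.partition("=")
        attrs[key.strip().lower()] = val.strip()
    return name.strip(), value.strip(), attrs


def tail_match(domain_attr, host):
    """True if domain_attr matches the tail of the host's domain name.

    Assumption: the match must start at a label boundary (a dot), which
    is stricter than a plain string-suffix comparison.
    """
    domain = domain_attr.lower().lstrip(".")
    host = host.lower()
    return host == domain or host.endswith("." + domain)


name, value, attrs = parse_set_cookie("Set-Cookie: uid=abc123; domain=acme.com")
sendable = tail_match(attrs["domain"], "anvil.acme.com")
```

Path matching, which the text says follows a successful tail match, is left out of this sketch.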
Only hosts (server computers) within the specified domain can set a cookie for a domain. It is this security feature that prevents the tracking of a user across web sites. While site A would like to see the cookie set by site B so that site A can access information about the user's behavior on site B, the domain specification security feature of cookies prevents site A from seeing or manipulating a cookie set by site B, if site A and site B have distinct domain names.
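The effect of this restriction can be illustrated with a toy browser-side cookie store. This is a sketch under stated assumptions rather than the behavior of any specific browser: the class name CookieJar, its methods, and the host names "www.site-a.com" and "www.site-b.com" are hypothetical, and path matching is omitted.

```python
class CookieJar:
    """Toy browser cookie store that enforces the domain restriction."""

    def __init__(self):
        self._cookies = []  # list of (domain, name, value)

    @staticmethod
    def _in_domain(domain, host):
        # Tail match on a label boundary (an assumption of this sketch).
        domain, host = domain.lower().lstrip("."), host.lower()
        return host == domain or host.endswith("." + domain)

    def store(self, setting_host, name, value, domain=None):
        """Store a cookie only if the setting host lies within the domain."""
        domain = domain or setting_host
        if not self._in_domain(domain, setting_host):
            return False  # a host outside the domain may not set the cookie
        self._cookies.append((domain.lower(), name, value))
        return True

    def cookies_for(self, host):
        """Return the cookies the browser would send to the given host."""
        return {n: v for d, n, v in self._cookies if self._in_domain(d, host)}


jar = CookieJar()
jar.store("www.site-b.com", "uid", "42", domain="site-b.com")
seen_by_a = jar.cookies_for("www.site-a.com")   # site A receives nothing
seen_by_b = jar.cookies_for("news.site-b.com")  # site B hosts receive the cookie
```

Because the cookie set by site B is never sent to site A, and site A cannot set a cookie for site B's domain, neither site can use the other's cookie value to correlate the user's requests.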
In view of the foregoing, it would be highly desirable to perform passive tracking of a web browser as it makes requests to distinct domain names of the WWW. As indicated above, such information would allow editors and advertisers to tailor their content to users.