1. Field of the Invention
The present invention relates to the field of computer systems, and more particularly to a method and system for tracking customer usage of content found on computer systems.
2. Brief Description of Prior Developments
There has recently been tremendous growth in the number of computers connected to the Internet. A client computer connected to the Internet can download digital information from server computers. Client application software typically accepts commands from a user and obtains data and services by sending requests to server applications running on the server computers. A number of protocols are used to exchange commands and data between computers connected to the Internet. These protocols include the File Transfer Protocol (FTP), the Hypertext Transfer Protocol (HTTP), the Simple Mail Transfer Protocol (SMTP), and the Gopher document protocol. HTTP is used to access data on the World Wide Web, often referred to as “the Web.” The Web is an information service on the Internet providing documents and links between documents. It is made up of numerous Web sites located around the world that maintain and distribute electronic documents. A Web site may use one or more Web server computers that store and distribute documents in a number of formats, including the Hypertext Markup Language (HTML). An HTML document includes text and metadata (commands providing formatting information), as well as embedded links that reference other data or documents. The referenced documents may represent text, graphics, or video. In addition, HTML documents may contain client scripts (e.g., JavaScript or Visual Basic Script) that are executed by the browser. The browser executes these scripts in a scripting space. A script is a set of instructions that is executed at certain times, e.g., when a Web page is loading, when a Web page is done loading, when the user has clicked on a link, or when some other event has occurred.
An intranet is a local area network containing Web servers and client computers operating in a manner similar to the World Wide Web described above. Typically, all of the computers on an intranet are contained within a company or organization.
A client computer connected to a network, such as a local area network, a wide area network, an intranet, or the Internet, can download digital information from server computers. This digital information can be presented to a user by, and executed by, a Web browser.
Generally, HTTP is considered a “stateless” protocol; that is, the protocol is structured so that it does not require the cooperating web browser to maintain state information about the data that is communicated. Because HTTP is “stateless” (non-persistent), it is impossible to differentiate between visits to a web site among a group of visitors unless the server can somehow “mark” a visitor.
However, this shortcoming of the “stateless” HTTP protocol is overcome by today's Web browser applications through the use of “cookie” technology. In computer science terms, a cookie is an opaque piece of data held by an intermediary. In today's Web browser applications, Persistent Client State HTTP Cookies, more commonly known simply as cookies, add persistence (i.e., state information) to Web networks by letting Web application developers store information on the client, such as user names and preferences, so that this information is available from one Web browsing session to the next. Cookies are thus a very useful tool for maintaining state variables on the Web. With cookies, Web developers (and operators) are afforded the ability to identify Web users that navigate to and through Web sites during a Web browsing session.
In operation, cookies are a general mechanism that server-side connections (such as CGI scripts) can use to both store and retrieve information on the client side of the connection. The addition of a simple, persistent, client-side state significantly extends the capabilities of Web-based client/server applications. A server, when returning an HTTP object to a client, may also send a piece of state information that the client will store. Included in that state object is a description of the range of URLs for which that state is valid. Any future HTTP request made by the client that falls in that range will include a transmittal of the current value of the state object (i.e., the “cookie”) from the client back to the server.
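The "range of URLs" check described above can be illustrated with a minimal sketch. The function name `state_applies` and its parameters are hypothetical, and the matching rules are deliberately simplified (a domain suffix test and a plain path prefix test); real browsers apply additional rules.

```python
def state_applies(cookie_domain: str, cookie_path: str,
                  host: str, path: str) -> bool:
    """Return True if a request to host + path falls inside the
    URL range for which the stored state object is valid.

    Simplified illustration only: the domain must equal the host or be
    a parent domain of it, and the cookie path must be a prefix of the
    request path.
    """
    host = host.lower()
    domain = cookie_domain.lower().lstrip(".")
    # Domain test: exact match, or the host is a subdomain of the domain.
    domain_ok = host == domain or host.endswith("." + domain)
    # Path test: the request path must begin with the cookie's path.
    path_ok = path.startswith(cookie_path)
    return domain_ok and path_ok
```

Under these assumptions, a state object scoped to `example.com` and `/shop` would accompany a request for `www.example.com/shop/cart` but not one for `www.example.com/news` or for any host outside `example.com`.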
As a source of state information, there are many reasons why a given Web site operator would wish to use cookies. These reasons range from allowing Web site users to personalize information on visited Web sites, to assisting Web site operators in processing on-line sales/services, to simply tracking popular links or demographics associated with a Web user. Cookies also provide programmers with a quick and convenient means for keeping site content fresh and relevant to the user's interests. Additionally, a new application of cookie technology is to assist with back-end processing performed by Web sites. In this context, cookies may be used by Web sites to securely store data that a user may have shared with a visited Web site. Such information may be used to reduce the amount of processing performed by a Web site when a user subsequently visits the Web site.
Specifically, cookies generally consist of text-only strings that a web browser on a client computer can store until they expire. Cookies that have an expiration date are called persistent cookies, and survive browser sessions. Cookies without an expiration date are called session cookies, and they are only valid for the current browser session. A cookie is typically introduced to a client computer in the form of an HTTP response header that is created and subsequently transferred by a server computer; it can also be introduced by a client-side script on the web browser. The header typically contains the domain name, path, lifetime (in the form of an expiration date/time), and additional space for operator-defined variables that are set by the visited sites. If the lifetime variable of a given cookie is longer than the time the user spends at the site, then this string is saved to a file for future reference.
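As an illustration of the header fields listed above, the following sketch uses Python's standard `http.cookies` module to assemble a cookie with a domain, a path, and an optional expiration date, and to distinguish a persistent cookie from a session cookie. The helper names `make_cookie` and `is_persistent`, the cookie name `visitor`, and the sample values are assumptions chosen for the example.

```python
from http.cookies import Morsel, SimpleCookie
from typing import Optional


def make_cookie(name: str, value: str, domain: str, path: str,
                expires: Optional[str] = None) -> Morsel:
    """Build one cookie carrying the fields a server would place in a
    Set-Cookie response header."""
    jar = SimpleCookie()
    jar[name] = value                 # operator-defined variable
    jar[name]["domain"] = domain      # domain name
    jar[name]["path"] = path          # path
    if expires is not None:
        jar[name]["expires"] = expires  # lifetime (expiration date/time)
    return jar[name]


def is_persistent(cookie: Morsel) -> bool:
    """A cookie with an expiration date is persistent; one without an
    expiration date is a session cookie."""
    return bool(cookie["expires"])


if __name__ == "__main__":
    persistent = make_cookie("visitor", "abc123", "example.com", "/",
                             expires="Wed, 09 Jun 2030 10:18:14 GMT")
    session = make_cookie("visitor", "abc123", "example.com", "/")
    # Print the text that would follow "Set-Cookie: " in a response.
    print(persistent.OutputString())
    print(session.OutputString())
```

Running the module prints the two header values; only the first includes an `expires` attribute, which is what would cause a browser to keep it beyond the current session.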
For example, in a typical Web site navigation scenario, a user may request information from a particular Web site that maintains a unique URL address. If the user is visiting the Web site for the first time, the Web site may request that the user's browser create a new cookie that is associated with the requested URL. In the alternative, if the user has already visited the targeted Web site, the Web site sends a request to process and update the already created cookie associated with the Web site URL. In this way, the server knows whether the user has visited the targeted Web site before and can coordinate the user's preferences for different Web pages on the targeted Web site. Armed with this information, Internet and Web site content providers can utilize cookies in a variety of ways, including customizing an Internet or Web user's surfing experience.
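The first-visit and return-visit cases in this scenario can be sketched as a small server-side routine. This is an illustrative sketch only: the function name `handle_visit` and the cookie names `visitor_id` and `visits` are hypothetical and do not come from any particular Web site.

```python
import uuid
from http.cookies import SimpleCookie
from typing import List, Tuple


def handle_visit(cookie_header: str) -> Tuple[bool, List[str]]:
    """Inspect the Cookie request header and decide whether to create
    a new cookie (first visit) or update the existing one (return visit).

    Returns (first_visit, values), where each value is the text that
    would follow "Set-Cookie: " in the response.
    """
    received = SimpleCookie()
    received.load(cookie_header)

    first_visit = "visitor_id" not in received
    if first_visit:
        # No cookie came back: ask the browser to create a new one.
        visitor_id = uuid.uuid4().hex
        visits = 1
    else:
        # Cookie present: reuse the identifier and update the count.
        visitor_id = received["visitor_id"].value
        visits = int(received["visits"].value) + 1 if "visits" in received else 1

    response = SimpleCookie()
    response["visitor_id"] = visitor_id
    response["visits"] = str(visits)
    return first_visit, [m.OutputString() for m in response.values()]
```

For instance, `handle_visit("")` reports a first visit and issues a fresh identifier, whereas `handle_visit("visitor_id=abc; visits=3")` reports a return visit and sends back the same identifier with the count raised to 4.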
Given an increase in the number of Internet and Web site content providers, there is increasing competition among these providers to offer distinguishing features in their content that enhance a user's experience. It is hoped that the development of such features will translate into an increase in user traffic. In that light, Internet and Web site content providers are constantly looking for ways to track the behavior of computer users that visit their content. Having such information, a content provider could customize information and product/service offerings based upon the computer user's surfing behavior. This information may serve as a basis to create the distinguishing features that content providers seek in today's competitive marketplace. For example, the information sought to be tracked may include a computer user's point of entry to a given Internet site or, more importantly, the Internet page(s) and/or Internet sites that are visited by a computer user during an Internet surfing session.
On the Internet or in Web networks, a Web site is generally hosted on a domain. The domain provides a computer user with an address (e.g., a URL or portion thereof) by which he/she may gain access to a Web site or a group of Web sites. Accordingly, a domain may host a single Web site or provide the infrastructure to host multiple Web sites. In the latter case, a content provider supporting multiple Web sites on a single domain may want to track the usage of hosted sites by a given client computer. Having such usage information, the content provider may generate specific affinities among the various sites visited by a client computer to customize content provided to a user during a domain Web surfing session.
However, content providers trying to track users through the use of traditional cookies are severely hampered. The use of traditional cookies is processing-intensive, placing a burden on the computer system hosting the content. In addition, conventional cookie technology dictates that cookies created by a given domain are only sent back to the domain that originally sent them to the client computer, thereby making it difficult to track users who travel across several domains.
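The domain restriction described above can be demonstrated with a toy simulation. The `ToyBrowser` class, the `make_server` helper, and the domain names `site-a.example` and `site-b.example` are all hypothetical; the cookie store is reduced to a dictionary keyed by exact domain, which is a simplification of real browser behavior.

```python
import uuid
from typing import Callable, Dict, Optional


class ToyBrowser:
    """Keeps one cookie value per domain and returns a cookie only to
    the domain that originally set it."""

    def __init__(self) -> None:
        self._jar: Dict[str, str] = {}

    def visit(self, domain: str,
              server: Callable[[Optional[str]], str]) -> str:
        # Only the cookie stored for this exact domain is sent.
        sent = self._jar.get(domain)
        returned = server(sent)
        self._jar[domain] = returned
        return returned


def make_server() -> Callable[[Optional[str]], str]:
    """Return a server that issues a fresh identifier when no cookie
    arrives and otherwise echoes the identifier it received."""
    def server(cookie: Optional[str]) -> str:
        return cookie if cookie is not None else uuid.uuid4().hex
    return server


if __name__ == "__main__":
    browser = ToyBrowser()
    site_a, site_b = make_server(), make_server()
    id_a_first = browser.visit("site-a.example", site_a)
    id_a_again = browser.visit("site-a.example", site_a)
    id_b = browser.visit("site-b.example", site_b)
    print(id_a_first == id_a_again)  # same identifier on the same domain
    print(id_a_first == id_b)        # unrelated identifier on another domain
```

In this sketch the same user is recognized on repeat visits to one domain, but the second domain never receives the first domain's identifier, so the two visits cannot be linked by the cookie alone.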
Stated differently, conventional methods to track computer users visiting Internet and Web sites have fallen short of content providers' needs, as they may require additional resources and add to the processing burden. For example, certain content providers utilize and store computer user profiles created from computer user input that indicate preferences for a variety of information, products, and services. The use of user profiles, however, has several disadvantages, including the need for abundant storage resources, as large amounts of information are required to be stored for each user, and a lack of user privacy (i.e., the content provider maintains information that users may regard as confidential). Further, conventional methods for computer user tracking are domain specific. That is, the information gathered by existing tracking systems about a computer user's usage of Internet or Web site content is limited to navigation within one given domain during a computer user's Internet or Web site session.
It is thus desired to implement a tracking system for computer users that does not burden computer processing, guarantees anonymity, and is capable of tracking computer users navigating to various content sites hosted on various domains. At the core of an infrastructure that could achieve these advantages are an apparatus and methods that would capitalize on existing Internet or Web browser application technologies.