The invention relates generally to electronic commerce and metering. More particularly, the invention relates to a method and apparatus for automatically, securely and accurately metering the number and duration of visits to a website.
There has recently been an increase in the popularity of the Internet, and in particular, the World Wide Web (WWW). This popularity is based in part upon the rich amount and diversity of information available through the Internet and WWW. Some experts estimate that 83 million people currently have access to the WWW, and that 199 million will have access by the year 1999.
The growing popularity of the Internet and WWW is driving various applications, several of which are commercially oriented. One such commercial application of the WWW is advertising. The WWW is particularly well-suited for advertising since it offers a relatively fast and effective means of mass distributing information. The difficulty of using the WWW for advertising, however, is the need to securely meter circulation so that advertisements can be priced accurately.
To illustrate this difficulty, a short description of how information is organized on the WWW may prove helpful. A fundamental logical grouping of information used in the WWW is a hypertext file. A hypertext file is typically written in a markup language such as Hypertext Markup Language (HTML). An individual uses a browser on a computer to display a hypertext file in the form of a page. A page comprises information in various forms, such as text, graphics, images, video or sound.
One or more links may be embedded in each page. A link is a logical connection to information or programs located at a predetermined address on the Internet. This predetermined address is referred to as a Uniform Resource Locator (URL). By selecting a link, a user can cause the browser to display a different portion of the same page, display a different page (known as the linked page), expand an image, execute a computer program, and so forth.
A logical grouping of hypertext files (i.e., pages) is called a site. Sites may reside on different computers. A set of sites that are interconnected by links is referred to as a web. A site on a first computer may be effectively linked to a site on a second computer by connecting the first and second computers through a network. The WWW is an example of a set of sites residing on different computers interconnected by a network.
The WWW can therefore be loosely defined as a set of sites storing hypertext files written in HTML on computers interconnected by the Internet. Each site on the WWW is known as a website. A website resides on a computer known as a server, which is accessed through a network by a user utilizing a client computer and a browser located at the client computer.
The term client computer as used herein refers to a system with a microprocessor and means for storing data and/or software such as random access memory and/or a hard disk drive, and which is capable of communicating with a network. The client computer is capable of providing output for display to a user, for example through a video display. Such output may take the form of at least one of textual, graphic, animation, video, audio, or virtual object media. The client computer is also capable of accepting input from a user. Such input may be provided by means such as a keyboard, a mouse, a telephone touch pad, a television remote control, and so on.
Similarly, the term server computer as used herein refers to a system with a microprocessor and means for storing data and/or software such as random access memory and/or a hard disk drive, and which is capable of communicating with a network. Typically, a server is more powerful than a client in that it has greater processing power or larger amounts of static or dynamic memory.
Much effort is spent on understanding Internet usage, and determining what units of measure are appropriate for metering client visits. A problem of equal, if not greater, concern, however, is that of ensuring the security of the process by which access data is collected and transferred.
The absence of secure metering greatly impacts advertising revenues. Access data is usually collected at the server site, which has control over the collecting process as well as over stored data. Since the owner of the server can charge higher rates for advertisements by showing a higher number of visits, the owner has a strong economic incentive to inflate the number of visits. The owner could accomplish this by manipulating any unsecured metered data stored on the server.
Alternatively, any individual could fraudulently increase the number of visits to a web site using a "robot." As used in reference to the Internet, a "robot" is a computer program which is configured to generate visits to a web site. The number of visits is theoretically limitless. Further, these visits could be made untraceable. A robot's creator could even make it appear that the visits are from a diverse group of clients.
In view of the above, it is clear that a need exists for accurately and securely metering visits to a web site. Conventional schemes, however, are unsatisfactory in many ways.
For example, one possible solution is to employ metering methods typically used for traditional mass-distribution media, such as radio broadcasts, television and newspapers. These conventional metering methods, however, do not translate easily to the WWW. For example, customer surveys such as the Nielsen ratings have served the television industry for decades. Polling customers about web site visits, however, poses severe difficulties given the vast amount and fast-changing nature of information offered on the WWW. As another example, daily newspapers effectively track circulation by charging customers for each newspaper. In the Internet realm, however, this is far less effective since customers have historically expected free access to information on the Internet.
Another possible solution is to employ standard cryptographic methods to keep self-authenticating records of interactions on the WWW secure. This could be accomplished using existing extensions to the WWW protocols, such as the Secure Hypertext Transfer Protocol (S-HTTP) and the Secure Sockets Layer (SSL) protocol. Secure HTTP is an extension of HTTP providing security services for transaction authenticity, integrity and non-repudiability of origin. The SSL protocol is a security protocol for the Internet that mandates server authentication, allows optional client authentication, and provides services for private communications between clients and servers. The problem with these methods, however, is that they require authentication of all clients. Every client must register to obtain authentication keys. Not only is this a heavy administrative burden, but it leads to solutions that threaten the client's privacy.
A third possible solution is to meter visits to websites using an online third party census. Audit Bureau of Circulations (ABC) offers such a service. A third party census can independently provide measurements on web activity. Using this scheme, an objective authority can monitor metering activity. This authority can then certify the measured data from such activity. This certification minimizes the possibility of manipulation by a self-serving web publisher. A problem with this method, however, is the dependence on a central authority. If the central authority is compromised, the results of the audit will be suspect. Moreover, an online third party census incorporates the inaccuracy inherent in any census method, thereby making it difficult to determine the deviation of census activity from real activity, i.e., how much fraud has actually occurred.
Another deficiency of conventional metering schemes is that they fail to solve the proxy problem. The proxy problem results from the failure to accurately meter the number of visits to web pages from a particular web site which have been temporarily stored in a cache or on proxy servers.
A cache is used to temporarily store hypertext files to minimize connection time to a web server. In a typical WWW transaction, the client computer connects to a server computer and requests certain information stored on the server. The server in turn downloads the information to the client computer's local memory or a cache. Once the client receives the requested information, the client disconnects from the server. The customer using the client computer may then review the information at his or her leisure.
A similar problem occurs with proxy servers. A company may want to isolate its local network from external computers and networks for security purposes or to prevent the infection of the company network by viruses. The company may accomplish this by building a "firewall." The company would route all connections to computers outside the company's network through a central site. Any information downloaded from outside the company network would be stored at a central server, sometimes referred to as a proxy server. In turn, various client computers located on the company's network can access the information from the proxy server, without having to actually connect to the external server which originally provided the information.
Hypertext files stored in a cache or proxy server can be visited any number of times by any number of client computers without the server's knowledge. Conventional metering schemes are incapable of metering these visits, except through extrapolating an estimated amount from metered visits. The number of proxy visits, however, can account for a significant percentage of the overall number of visits to the content from a particular web site.
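The undercounting described above can be illustrated with a minimal simulation. The following Python sketch is illustrative only and is not part of any metering scheme described herein; the class names OriginServer and ProxyServer, the function simulate, and the page address are hypothetical. It assumes a single proxy that never expires its cached copy, so that only the first request for a page reaches the server.

```python
from collections import Counter


class OriginServer:
    """Hypothetical web server that can meter only the requests reaching it."""

    def __init__(self, pages):
        self.pages = dict(pages)
        self.metered_visits = Counter()

    def fetch(self, url):
        # The server's own count increases only on a direct request.
        self.metered_visits[url] += 1
        return self.pages[url]


class ProxyServer:
    """Hypothetical proxy that stores each page after the first download."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}
        self.actual_visits = Counter()  # visible to the proxy, not the origin

    def fetch(self, url):
        self.actual_visits[url] += 1
        if url not in self.cache:
            # Cache miss: the only time the origin server sees a request.
            self.cache[url] = self.origin.fetch(url)
        return self.cache[url]


def simulate(num_clients, views_per_client, url="/index.html"):
    """Return (visits metered at the origin, visits actually served)."""
    origin = OriginServer({url: "<html>page with advertisement</html>"})
    proxy = ProxyServer(origin)
    for _ in range(num_clients):
        for _ in range(views_per_client):
            proxy.fetch(url)
    return origin.metered_visits[url], proxy.actual_visits[url]


metered, actual = simulate(num_clients=50, views_per_client=3)
print(f"visits metered at origin server: {metered}")
print(f"visits actually served to clients: {actual}")
```

Under these assumptions the server meters a single visit while the clients behind the proxy view the page 150 times, which is the gap a conventional scheme can only estimate by extrapolation.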
Metering visits to websites is also necessary for commercial applications other than advertising. For example, customers may choose to connect to the WWW through an Internet Service Provider (ISP). ISPs typically charge their customers for connections to the WWW by the hour. An ISP may enhance service to customers by having sites referred to as partners that provide information on the WWW and assume the charges of clients' connections to such sites. This would be similar to companies which assume the phone charges of calls to "800 numbers" in the United States. A secure metering scheme would enable ISPs to accurately record the number and duration of client visits that pass through the ISP to the partner site, and use this record to reverse-charge the partner site.
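The reverse-charging arithmetic can be sketched as follows. This Python example is a hedged illustration only: the record type Visit, the function reverse_charges, the site names and the hourly rate are all hypothetical, and the sketch assumes that the visit records are already trustworthy, which is precisely what a secure metering scheme would have to guarantee.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Visit:
    """Hypothetical metered record of one client visit routed through the ISP."""
    client_id: str
    partner_site: str
    duration_seconds: int


def reverse_charges(visits, hourly_rate):
    """Sum metered connection time per partner site and price it by the hour."""
    seconds_by_site = {}
    for visit in visits:
        seconds_by_site[visit.partner_site] = (
            seconds_by_site.get(visit.partner_site, 0) + visit.duration_seconds
        )
    return {
        site: (Decimal(seconds) / Decimal(3600) * hourly_rate).quantize(Decimal("0.01"))
        for site, seconds in seconds_by_site.items()
    }


# Assumed sample data: three visits to two partner sites at 2.00 per hour.
visits = [
    Visit("client-a", "partner.example", 1800),
    Visit("client-b", "partner.example", 5400),
    Visit("client-a", "other.example", 900),
]
charges = reverse_charges(visits, hourly_rate=Decimal("2.00"))
for site, amount in sorted(charges.items()):
    print(site, amount)
```

With the assumed data, the two visits to partner.example total two hours and are charged 4.00, while the fifteen-minute visit to other.example is charged 0.50. Any inflation of the recorded number or duration of visits would translate directly into an overcharge, which is why the record must be secure.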
Another example of the need for a secure metering scheme can be illustrated with respect to copyrighted material stored on the Internet. Some web sites post copyrighted material, such as art work or music excerpts, and are therefore required to pay royalties to the copyright owner(s). A secure metering scheme would prevent copyright owners from fraudulently inflating visits to the web site in an attempt to increase their own royalty payments.
A secure and accurate metering scheme is also important for "popularity" polls. There are a number of web sites which rank other web sites according to the number of visits to each site. These popularity polls are important for marketing a web site, and also for pricing advertisements for a site. A secure and accurate metering scheme would ensure proper rankings. Further, the rankings would be far more accurate if they could reflect not only the number of visits, but the duration of each visit as well. Current metering schemes fail to accurately and securely accomplish either function.
In view of the foregoing, it can be appreciated that there exists a substantial need for a secure and accurate metering apparatus and method to solve the above-discussed problems.