With the growth of the World Wide Web, an increasingly large fraction of available bandwidth on the Internet is used to transfer Web documents. Often, a Web document is formed of a number of files, such as text files, image files, audio files and video files. When an end-user at a requesting device, such as a personal computer, designates a particular Web document, a request may be made to an originating server to download the corresponding file. The total latency in downloading the requested file depends upon a number of factors, including the transmission speeds of communication links between the requesting device and the originating server in which the requested file is stored, delays that are incurred at the originating server in accessing the file, and delays incurred at any device located between the requesting device and the originating server.
One approach to reducing the total latency in downloading the requested file is the use of proxy servers. Proxy servers function as intermediaries between browsers at the end-user side of an Internet connection and the originating servers at the opposite side. An important benefit provided by the proxy server is its ability to cache frequently requested files, so that the same files need not be retrieved repeatedly from the originating server.
While caching is beneficial to the end-users, a concern is that it impairs the ability of a Web site administrator of the originating server to accurately count the number of hits for the requested file, since at least some of the requests may be intercepted and serviced by the proxy server. A “hit” is an instance of accessing a network file, which may be temporary in nature, such as a “visit” to a Web site, or which may be more permanent in nature, such as a download of an executable file. There are advantages to enabling a Web site administrator to accurately count the number of hits for a particular file. For example, an accurate count may indicate a popularity level of the Web site, so that the Web site administrator can determine how much to charge advertisers to present commercial banners that are displayed with each visit to the site.
U.S. Pat. No. 5,935,207 to Logue et al. describes a method for counting a number of hits made to a proxy server. According to the method, a hit is recorded for every request that is satisfied by a transfer of one or more cached files from an accessible proxy server (i.e., a proxy server for which the Web site administrator has access to a hit report for the number of hits made for the requested file), if the requested file was pre-selected for tracking. There may be many of these accessible proxy servers. The total number of hits for the requested file is reported to the administrator when the administrator makes a request to report to the accessible proxy server. While the Logue et al. method works well for its intended purpose, a concern is that the method cannot accurately count the total number of hits for the requested file when some requests are satisfied by non-accessible proxy servers, since the administrator does not have access to a hit report for those requests. As an example, the non-accessible proxy server may be located in a local-area network between multiple end-users and the accessible proxy server. The end-users may make multiple requests for the same file, which is downloaded from the accessible proxy server to the non-accessible proxy server, and finally, to the end-users. Although the file is requested multiple times, only the hit at the accessible proxy server is recorded, since the administrator does not have access to a hit report for requests satisfied by the non-accessible proxy server. Consequently, the number of hits for the requested file is inaccurate.
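The undercounting scenario described above can be illustrated with a short sketch. It is a simplified toy model, not the Logue et al. implementation: the `Proxy` class, its `request` method, the single per-proxy hit counter, and the file name are all hypothetical, and every proxy is assumed to cache a file after the first request for it.

```python
class Proxy:
    """Toy caching proxy (hypothetical model, one tracked file only)."""

    def __init__(self, upstream=None, accessible=False):
        self.upstream = upstream      # next proxy toward the originating server
        self.accessible = accessible  # True if the administrator can read its hit report
        self.cache = set()
        self.hits = 0                 # requests this proxy has handled

    def request(self, url):
        self.hits += 1
        if url not in self.cache:
            # Cache miss: forward upstream (None stands for the originating server)
            if self.upstream is not None:
                self.upstream.request(url)
            self.cache.add(url)


accessible = Proxy(accessible=True)
lan_proxy = Proxy(upstream=accessible, accessible=False)  # non-accessible LAN proxy

actual_requests = 5
for _ in range(actual_requests):
    lan_proxy.request("/index.html")

# The administrator can only sum the reports of accessible proxies.
reported_hits = sum(p.hits for p in (accessible, lan_proxy) if p.accessible)
print(actual_requests, reported_hits)
```

In this model, only the first request reaches the accessible proxy server; the remaining requests are served from the cache of the non-accessible proxy server and never appear in any report available to the administrator.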
Another concern is that the same request made by one end-user may be counted more than once if the request is reported by more than one accessible proxy server. As an example, an end-user may make a single request for a file that is downloaded from a first accessible proxy server to a second accessible proxy server, and finally, to the end-user. During the hit reporting process, the same request may be reported twice if a hit is reported by the first accessible proxy server and another hit is reported by the second accessible proxy server.
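The double-counting scenario can be sketched in the same spirit. Again, this is an illustrative toy model under stated assumptions: the `serve` function, the proxy names, and the file name are hypothetical, and each proxy is assumed to log a hit for every request it handles.

```python
from collections import Counter


def serve(chain, url, caches, reports):
    """Pass one request along `chain` (proxy nearest the end-user first).

    Every proxy that handles the request logs a hit in its own report.
    Forwarding stops at the first proxy that already has the file cached.
    """
    for proxy in chain:
        reports[proxy][url] += 1
        if url in caches[proxy]:
            break                # served from this proxy's cache
        caches[proxy].add(url)   # filled as the file passes back downstream


chain = ["second_accessible_proxy", "first_accessible_proxy"]
caches = {p: set() for p in chain}
reports = {p: Counter() for p in chain}

actual_requests = 1
serve(chain, "/index.html", caches, reports)  # a single end-user request

# Naively summing the reports of both accessible proxies counts the request twice.
reported_total = sum(r["/index.html"] for r in reports.values())
print(actual_requests, reported_total)
```

Here a single request yields a reported total of two, because both accessible proxy servers handled, and therefore reported, the same request.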
What is needed is a method and system to accurately count the total number of hits for requested files made over a network.