Despite the phenomenal growth of the Internet, the advances in the means and speed of access to the Internet has not caught up. In particular, the last hop between (a) the Internet site to which the user is connected, such as the user's Internet Service Provider (ISP) and (b) the computer or workstation on is which the user is running a browser application, is still mainly implemented as a connection over a telephone line using a traditional modem, with communication speeds up to only 56 kbps. The emerging cable modem and ADSL technology promise to change that, but their widespread deployment and adoption are still years away.
Separately, the use of wireless communications links in the last hop is gaining popularity. Its growth is fueled by the confluence of 3 factors: (1) the development of digital air interface protocols that support data (e.g., CDPD, IS-95, IS-136, GSM/GPRS); (2) the availability of new classes of portable Internet-capable end devices (e.g., Palm Pilot, Handheld PC, Nokia 9000) and wireless modems (e.g., Novatel Wireless); and (3) the falling usage cost for wireless communications. Again, the raw bandwidth available on most wireless channels is low (e.g., 19.2 kbps for CDPD), which can be further impaired by their multiple-access contention nature and protocol overhead. For example, the effective application layer throughput of CDPD is about 8 kbps without contention.
In a nutshell, Web browsing behind slow (wireline or wireless) access links will persist for years to come. Wireless Internet access, which is emerging only now, will present an even more severe problem.
A number of previous approaches have been suggested in order to reduce the delay incurred in the last hop. Most of these approaches involve increased usage of storage or computation to make up for limitations in the communication medium. This typically amounts to a trade-off, since storage and computational complexity each add overhead and cost to the system. The key to the success of any of these processing techniques is that the increase in processing delay should be more than offset by the decrease in transport delay, thus resulting in a decrease in the overall latency.
One technique, known as "caching", stores earlier responses, and reuses them to satisfy a repeated request. Traditionally, caching algorithms search for identical objects and the search is limited to objects with the same URL. These known compression algorithms do not take into account the presence of other objects in the cache. This topic has been studied extensively in the literature. See for example, Adam Dingle and Tomas Partl, "Web Cache Coherence", Fifth International World Wide Web Conference, May 1997; Bradley M. Duska, David Marwood, and Michael J. Feeley, "The Measured Access Characteristics of World-Wide-Web Client Proxy Caches", USENIX Symposium on Internet Technologies and Systems, Dec 1997; James Gwertzman and Margo Seltzer, "Worldwide Web Cache Consistency", Proceedings of the USENIX Technical Conference, 1996; A. Lutonen, H. F. Nielsen, and T. Berners-Lee, "Cern httpd", In http://www.w3.org/pub/WWW/Daemon/Status.html, July 1996; and Duane Wessels and K. Claffy, "Icp and the Squid Web Cache", IEEE Journal on Selected Areas in Communication, 16(3):345-357, April 1998.
Another technique, known as "prefetching", tries to predict, fetch and store a response before it is needed. In both caching and prefetching, the stored objects must be transferred at least once. In some cases, this transfer can be optimized by first transforming or compressing the objects to a smaller size. Prefetching, like caching, is based on fetching in advance the exact object that will be needed in the future. The utility of prefetching was studied by Venkata N. Padmanabhan and Jeffery C. Mogul, "Using Predictive Prefetching to Improve World Wide Web Latency", Computer Communication Review, ACM, July 1996, using a statistical algorithm described by James Griggioen and Randy Appleton in "The Design, Implementation, and Evaluation of a Predictive Caching File System", Technical Report CS-264-96, Department of Computer Science, University of Kentucky, Lexington, Ky., June 1996.
Compression can be achieved by the use of differential transfer to transmit only changes between current and past information. Usually, only two objects of the same URL are considered at a time. Some of the differencing algorithms used are UNIX diff and vdelta, as described by J. Hunt, K. P. Vo, and W. Tichy, "An Empirical Study of Delta Algorithm", IEEE Software Config. and Maint. Workshop, March 1996; and J. Hunt, K. P. Vo, and W. Tichy, "An Empirical Study of Delta Algorithms", IEEE Software Config. and Maint. Workshop, March 1996; In a paper by Gaurav Banga, Fred Douglis, and Michael Rabinovich, "Optimistic Deltas for WWW Latency Reduction", USENIX, 1997, the difference between two versions of the same page was computed in order to reduce transfer size. Dynamic pages including output of CGI scripts with different parameters, were also considered for differencing. The issue of what objects should be used in differencing has been mentioned in the literature, but is at present considered to be an open question. The benefits of delta coding was also studied by Jeffery C. Mogul, Fred Douglis, Anja Feldmann, and Balachander Krishnamurthy, "Potential Benefits of Delta Encoding and Data Compression for Http", Proceedings of the ACM SIGCOMM, pages 181-194, 1997.
The limits of latency reduction obtainable from caching and prefetching, based on search for objects with same the URL, was studied by Thomas M. Kroeger, Darrell D. E. Long, and Jeffrey C. Mogul, "Exploring the Bounds of Web Latency Reduction from Caching and Prefetching", USENIX Symposium on Internet Technologies and Systems, December 1997. They have found that caching and prefetching can reduce latency by at best 26% and 57%, respectively. Accordingly, there is a significant need for an improved latency reduction technique.