FIELD OF THE INVENTION
The present invention relates to transferring information over a network, and in particular to ways of transferring resources over a network that reduce the latency and bandwidth requirements of the network.
There are several standard methods of data transfer that are used to decrease latency and bandwidth requirements on a network. Three examples are caching, compression, and delta encoding. Caching is a way of temporarily storing a resource, and plays a crucial role in a wide-area distributed system such as the World Wide Web. For example, requested web pages can be quickly retrieved from a web page cache rather than from the original server, saving time and network traffic. Caching significantly reduces response time for accessing cached resources by eliminating long-haul transmission delays. Furthermore, caching reduces backbone traffic and the load on content providers. However, caching is of limited value when much of the data on a network is dynamic, that is, when the information is likely to have changed since it was last cached. Under these circumstances, the cached information is no longer reliable. Dynamic resources, which can make up a significant portion of network resources, are therefore not cacheable, because each such resource is freshly generated upon every access. In addition, the content provider might explicitly prohibit caching.
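The caching behavior described above can be illustrated with a minimal sketch. The class name `SimpleCache`, the `max_age` and `cacheable` parameters, and the `fetch` callback are all hypothetical names chosen for illustration; they do not correspond to any particular cache implementation or protocol.

```python
import time


class SimpleCache:
    """Toy cache: keeps a resource body until its freshness lifetime expires."""

    def __init__(self, clock=time.monotonic):
        self._entries = {}   # url -> (body, expiry time)
        self._clock = clock  # injectable clock, so behavior can be tested

    def get(self, url, fetch, max_age=60.0, cacheable=True):
        """Return (body, was_hit). `fetch` stands in for the origin server."""
        now = self._clock()
        entry = self._entries.get(url)
        if cacheable and entry is not None and entry[1] > now:
            return entry[0], True          # served locally, no network trip
        body = fetch(url)                  # long-haul request to the origin
        if cacheable:
            self._entries[url] = (body, now + max_age)
        return body, False                 # uncacheable resources always miss
```

In this sketch a resource marked uncacheable, for example one that is freshly generated on every access, is fetched from the origin each time, which is the limitation noted above.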
Compression is a way of reducing data size to save space or transmission time by eliminating redundancy. For data transmission, compression can be performed on just the data content or on the entire transmission (data content plus header data). However, the amount by which a file can be compressed is limited by the redundancy in the file content; for text, the reduction is typically about 50%.
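The dependence of compression on redundancy can be illustrated with a short sketch using Python's standard `zlib` module. The helper name `compression_ratio` and the sample inputs are assumptions made for illustration only.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller means better)."""
    return len(zlib.compress(data, 9)) / len(data)


# Highly redundant markup compresses very well.
redundant = b"<tr><td>result</td></tr>\n" * 200

# Random bytes contain almost no redundancy and barely compress at all.
random_bytes = os.urandom(len(redundant))

if __name__ == "__main__":
    print(f"redundant markup: {compression_ratio(redundant):.3f}")
    print(f"random bytes:     {compression_ratio(random_bytes):.3f}")
```

The two ratios differ widely, which reflects the point made above: the achievable reduction is bounded by the redundancy actually present in the content.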
Delta encoding is a way of transmitting encodings of the changes between successive versions of a resource. This avoids having to resend the entire resource each time it is changed. Instead, only the changes are sent. This reduces bandwidth requirements and improves end-to-end performance. To use delta encoding, a complete version of a resource is cached in a first computer, the one sending the request. Before the first computer can display that cached resource, however, the cached version must be compared with another complete version of the resource that is cached or stored in a second computer, the one responding to the request. The difference between the two complete versions is computed, and the older cached version is then updated and displayed. With delta encoding, one might cache a resource even if it is considered uncacheable, but not present the cached data without first obtaining the changes that bring it up to its current version. One advantage of this method is that it applies uniformly to all uncacheable resources regardless of the reason why the resources are uncacheable. Another advantage is that delta encoding can be implemented transparently, via proxies, so that content providers need not be modified. However, the delta encoding proposals have disadvantages as well. Delta encoding requires both computers to hold a common base version, which the sending computer uses to compute the delta and the receiving computer uses as the basis against which to apply the delta. If the content provider must compute the delta encodings on the fly, it suffers overhead and must store a potentially large number of past versions; if the delta computation is performed by an intermediary, then the entirety of each version of the resource must be sent from the content provider to the intermediary, and the encoding must still be performed on the "critical path."
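The exchange described above can be sketched as follows, using Python's standard `difflib.SequenceMatcher` as a stand-in differencing algorithm. The function names `make_delta` and `apply_delta` and the copy/insert operation format are hypothetical; the delta encoding proposals referred to above do not prescribe this format.

```python
from difflib import SequenceMatcher


def make_delta(base: str, new: str) -> list:
    """Sender side: describe `new` as copies from `base` plus literal inserts."""
    ops = []
    matcher = SequenceMatcher(None, base, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))        # receiver already has this text
        elif tag in ("replace", "insert"):
            ops.append(("insert", new[j1:j2]))  # only changed text is sent
        # "delete": nothing is emitted, so the base text is simply skipped
    return ops


def apply_delta(base: str, ops: list) -> str:
    """Receiver side: rebuild the current version from the cached base."""
    parts = []
    for op in ops:
        if op[0] == "copy":
            parts.append(base[op[1]:op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)
```

Both functions take the same `base` argument, which mirrors the requirement that the two computers share a common base version: if the receiver applies the delta to a different base, the reconstructed resource is wrong.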
In a common class of resources, such as those provided by search engines, a significant part of the resource is essentially static. Portions of the resource vary to different extents from one query response to another (the difference between two pages returned in response to a single query is usually smaller than the difference between pages returned for different queries). Also, the location of the dynamic portions relative to the rest of the resource does not change. These resources have traditionally been handled in a calculation-intensive and memory-wasteful manner.
For example, a technique called Server-Side Includes (SSI) allows one to specify a placeholder on a page where dynamic content can be inserted. However, this technique is performed on the server in a calculation-intensive way. Another example is Microsoft's Active Server Pages (ASP). This technique includes instructions to pre-process a page. However, this pre-processing is again performed by the server, so neither bandwidth nor server load is reduced. The present invention seeks to overcome these shortcomings.
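The shortcoming of server-side placeholder expansion can be illustrated with a minimal sketch. The `{{results}}` marker, the `TEMPLATE` page, and the function names are hypothetical and are not actual SSI or ASP syntax; they only model the general idea of a server filling a placeholder before sending the page.

```python
# Hypothetical page with one placeholder for dynamic content.
PLACEHOLDER = "{{results}}"
TEMPLATE = (
    "<html><head><title>Search</title></head><body>"
    "<h1>Results</h1>" + PLACEHOLDER +
    "<p>Footer text that never changes</p></body></html>"
)


def render_on_server(dynamic_html: str) -> str:
    """The server expands the placeholder and sends the complete page."""
    return TEMPLATE.replace(PLACEHOLDER, dynamic_html)


def static_chars_per_response() -> int:
    """Characters of unchanging markup retransmitted with every response."""
    return len(TEMPLATE) - len(PLACEHOLDER)
```

In this sketch every response carries the full static markup in addition to the dynamic portion, and the substitution runs on the server for every request, so neither the bandwidth nor the server work is reduced.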