Today, improving the performance of Internet communication is of major technological and commercial concern. Investment in improving the Internet network infrastructure is estimated to become a $1.3 trillion industry by 2003 (Source: Nortel Industries press release, Jan. 31, 2000). It has been estimated that many web users will not tolerate a delay of more than about 8 seconds in downloading a web page, and that the value of e-commerce sales currently at risk because of slow download speeds is $4.35 billion per year (Source: Zona Research report “The need for speed”, abstract available at http://www.zonaresearch.com/info/press/99-jun30.htm). In this climate there is pressing demand for ways to improve web performance, and even simple or seemingly obvious techniques are not overlooked.
The invention presented here is a technique that offers significant Internet performance gains, can be deployed at relatively modest cost in less than a year, and needs neither major infrastructure changes nor changes to end-user software.
Content Distribution and Caching
A Content Distribution (CD) network is a collection of specialized nodes or devices placed at chosen locations in a larger network such as the Internet, for example in the offices of Internet Service Providers (ISPs). These nodes store certain web content on behalf of the content distribution network's customers. Such stores are sometimes called caches, mirrors, or repeaters; we use the first of these terms throughout the following.
The second technological aspect of a Content Distribution service is a redirection or interception service. When a web user (using a client such as a browser) asks to be sent content from a site, and the content is known or suspected to be cached at one or more CD nodes, the request is directed (or comes to be redirected) to some such CD node that is “close” to the user. The notion of closeness varies, and in particular can be based on such metrics as bandwidth capacity, bandwidth cost, latency, security, administrative boundaries, administrative convenience, and current congestion on various network paths. The technologies for choosing a close CD node and then directing requests to that node vary as well; the field is still new and there is considerable ongoing innovation. Redirection thus comprises both the process of determining a close node that has (or is likely to have) the cached content and the process of ensuring that the request is actually directed to the chosen node.
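As an illustration only, the following Python sketch shows the simplest conceivable closeness policy: among a set of candidate CD nodes, choose the one with the lowest measured round-trip latency. The node names and latency figures are invented for the example; a real redirection service combines many of the metrics listed above and is far more elaborate.

    # Hypothetical illustration: pick the "closest" CD node by one metric only.
    # Node names and round-trip times (seconds) are invented for this sketch.
    candidate_nodes = {
        "cd-node-a.example.net": 0.012,
        "cd-node-b.example.net": 0.085,
        "cd-node-c.example.net": 0.140,
    }

    def choose_node(nodes):
        # A real redirection service would also weigh bandwidth cost, congestion,
        # administrative boundaries, and so on; here latency alone decides.
        return min(nodes, key=nodes.get)

    print(choose_node(candidate_nodes))   # -> "cd-node-a.example.net"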
An alternative to redirection is interception, in which a node is placed in the network path from the client in such a way that it sees all web traffic from the client. A web proxy or other specialized device such as a router, located for instance at the client's ISP, can be used for this purpose. The node intercepts all traffic; if it sees a request for content it has cached (or can readily fetch from a nearby cache), it returns the content immediately, and otherwise it relays the traffic to its destination unchanged.
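A minimal sketch of that interception decision, in Python and under simplifying assumptions (an in-memory dictionary standing in for the cache, and a plain GET to the origin standing in for relaying the traffic unchanged), might look like the following. It is meant only to illustrate the branch between serving from cache and passing the request through.

    import urllib.request

    cache = {}   # maps URL -> previously fetched response body (assumption: in-memory)

    def forward_to_origin(url):
        # Simplified stand-in for relaying the request to its destination.
        with urllib.request.urlopen(url) as resp:
            return resp.read()

    def handle_request(url):
        if url in cache:
            # The node already holds this content: return it immediately.
            return cache[url]
        # Otherwise pass the request through to the origin, and keep a copy.
        body = forward_to_origin(url)
        cache[url] = body
        return body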
The advantage of Content Distribution is the possibility of serving traffic to the user from a close CD node, thus getting the response to the user faster, more cheaply, with less bandwidth consumption, and perhaps more reliably. It is common to see reports of up to 10× improvement in the speed at which content is served to the end user.
The major disadvantage of Content Distribution is that it is inappropriate for dynamically generated or rapidly changing content. A CD node typically stores images, videos, and static text pages that do not change much from user to user. Such content is kept on the CD server in anticipation of requests for it (or perhaps, if there has already been one request, in anticipation of further requests from other users). However, much content on the Internet is generated on the fly in response to a customer's request, for instance by a server program using the Common Gateway Interface (i.e. a “cgi-bin” program). Since the output of such a program may never be the same twice, or at least is likely to differ from person to person, it is generally not feasible to prepare it in advance. (There are, after all, hundreds of millions of web users; one could not generate and store that many customized pages in advance.) As web content becomes more personal and more customized to each user, the importance of such pages will only increase. CD networks cannot anticipate such pages and so generally cannot improve the speed at which they are served.
A second disadvantage arises because even content usually described as “static”, such as images and fixed text, may be subject to occasional change. It is important to ensure that the caching node or Content Distribution node does not serve “stale” content (i.e. content that is no longer in agreement with the definitive copy on the origin server). A variety of schemes are used to ensure that content is fresh, or to lower the probability of delivering stale content; the mature field of caching technology addresses such issues. However, by the nature of the problem there is no perfect solution. To illustrate the issues, consider that the most recent version of the web protocol, HyperText Transfer Protocol Version 1.1, includes support for caching and Content Distribution that works as follows. A node with a cache can send a short message to the origin server asking, in effect, whether the copy of a web object held by the cache is still up to date. If so, a short acknowledgement is returned to the cache. If the cache node or CD node always makes such an inquiry before delivering content to a client, then there is no chance of delivering stale content, but there is a delay, possibly large, while the message is sent to the origin server and the response is received. This scheme may reduce the volume of traffic sent over the network (bandwidth consumption) but does not necessarily reduce the delay before the content is seen by the client (latency). Such tradeoffs are inherent to any caching or Content Distribution technology.
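The validation exchange just described can be illustrated with an HTTP/1.1 conditional request: the cache presents a validator (here an entity tag) and the origin server replies either with a short “304 Not Modified” acknowledgement or with a fresh copy of the object. The sketch below uses Python's standard http.client module; the host, path, and entity tag in the usage comment are placeholders.

    import http.client

    def revalidate(host, path, etag):
        # Ask the origin server whether our cached copy (identified by its
        # entity tag) is still current, using an HTTP/1.1 conditional request.
        conn = http.client.HTTPConnection(host)
        conn.request("GET", path, headers={"If-None-Match": etag})
        resp = conn.getresponse()
        if resp.status == 304:
            # Short acknowledgement: the cached copy may be served as-is.
            body = None
        else:
            # Cached copy is stale (or the validator was not recognized):
            # read the fresh body so it can replace the cached one.
            body = resp.read()
        conn.close()
        return resp.status, body

    # Example usage (placeholder host, path, and entity tag):
    # status, body = revalidate("www.example.com", "/logo.gif", '"abc123"')

Note that even when the answer is 304, the cache has paid a full round trip to the origin server before it can answer the client, which is precisely the latency cost described above.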
A third disadvantage of caching and Content Distribution technology is that it requires significant computer resources, since a cache keeps copies of web content just in case a client requests them. A cache may keep many objects that are not, in fact, requested by any client before they become stale, and these consume expensive resources such as memory or disk space. The problem is compounded because a typical Content Distribution network has numerous caching nodes. There are many techniques that reduce this problem, e.g. careful distribution of cached content across a network of multiple caching nodes. However, the resource expense is inherent to the technology and can be reduced, but not eliminated, by such techniques.
The term “caching” (or “proxy caching”) is sometimes used to refer to a technique related to Content Distribution. There are only slight technical differences. First, “caching” is more often heard when interception technology is used rather than redirection technology. Second, nodes are more likely to be called caches if they are operated on behalf of the clients rather than on behalf of the content originators. In European Patent Application WO9940514 the latter distinction, which is more a commercial distinction than a technical one, is regarded as definitive. A related technology is server-side caching (also known as “reverse proxying”), in which a cache node is located near the server rather than near clients. This technology delivers smaller performance gains than conventional caching or Content Distribution, but at a reduced resource cost because only one such node is needed.
All forms of caching share the first two of the disadvantages of Content Distribution described above, most critically the inability to handle dynamically generated content.
HTTP and TCP
HyperText Transfer Protocol (HTTP) is the application-level network protocol used when a client requests web content from a web server, and by the web server when it responds to such requests. Modern network communication is layered, which means that higher-level protocols build on top of lower-level protocols (which in turn may build on other protocols). HTTP is a high-level protocol, which includes commands to request content, respond with content, negotiate the form in which content is sent, and so forth. It is generally carried over a lower-level protocol, the Transmission Control Protocol (TCP). TCP provides reliable end-to-end connectivity between two locations in the Internet, but does not interpret the content sent between those two locations in any way: it just carries a stream of bytes. TCP in turn is generally carried over the Internet Protocol (IP), a packet-oriented protocol that does not guarantee reliable delivery.
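The layering can be made concrete with a few lines of Python: a TCP connection is opened (TCP neither knows nor cares what it will carry), and the HTTP request is then written onto that connection as plain text. The host name and path are placeholders; this is a sketch of the layering, not of any particular product.

    import socket

    # Open a reliable byte stream (TCP) to the web server; TCP itself does not
    # interpret the bytes that follow.
    sock = socket.create_connection(("www.example.com", 80))

    # The HTTP request is just structured text carried over that byte stream.
    request = (
        "GET /index.html HTTP/1.1\r\n"
        "Host: www.example.com\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    sock.sendall(request.encode("ascii"))

    # Read the response (status line, headers, body) until the server closes.
    response = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        response += chunk
    sock.close()
    print(response.split(b"\r\n", 1)[0])   # e.g. b"HTTP/1.1 200 OK"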
TCP was developed and deployed well before HTTP, and was designed for bulk bi-directional data transfer. HTTP, by contrast, is characterized by short request messages, moderate-length responses, and very bursty traffic. That TCP is not an optimal protocol for carrying HTTP is well known and extensively documented.
The paper “Modeling the Performance of HTTP Over Several Transport Protocols” in IEEE/ACM Transactions on Networking, vol. 5, number 5, October 1997, by Heidemann, Obraczka, and Touch, is representative of research addressing these issues. They claim, for instance:
These mismatches between the needs of HTTP and the services provided by TCP contribute to increased latency for most web users. Fundamentally, TCP is optimized for large-scale bulk data transport, while HTTP often needs a lightweight, request-response protocol.
The mismatches referred to here relate to a number of technical features of TCP, including those known as the “three-way handshake”, “slow-start congestion avoidance”, and “TIME_WAIT tear-down delays”. The Heidemann, Obraczka, and Touch paper discusses several alternative protocols better suited to HTTP, such as Transaction TCP (T/TCP) and the Asynchronous Reliable Delivery Protocol (ARDP). Other defects in TCP as it relates to HTTP include its flow-control algorithm, which can lead to unnecessary traffic and delays in the event of noise or errors on the network.
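One of these mismatches is easy to observe directly: every new TCP connection pays at least one full round trip for the three-way handshake before a single byte of HTTP can be sent. The rough measurement sketch below (in Python, with a placeholder host; actual timings will of course vary with network conditions) separates the handshake time from the time to the first byte of the response.

    import socket
    import time

    host = "www.example.com"   # placeholder host

    t0 = time.time()
    # create_connection() returns only after the TCP three-way handshake
    # completes, so this interval approximates one round trip plus overhead.
    sock = socket.create_connection((host, 80))
    t1 = time.time()

    sock.sendall(b"GET / HTTP/1.1\r\nHost: " + host.encode("ascii") +
                 b"\r\nConnection: close\r\n\r\n")
    sock.recv(1)               # block until the first byte of the response
    t2 = time.time()
    sock.close()

    print("handshake:        %.1f ms" % ((t1 - t0) * 1000))
    print("first byte after: %.1f ms" % ((t2 - t1) * 1000))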
HTTP itself is an evolving, improving protocol, but it has recognized performance deficiencies even aside from its interrelationship with TCP. The PhD dissertation “Addressing the Challenges of Web Data Transport” by V. N. Padmanabhan (Computer Science Division, University of California at Berkeley, USA; also published as Technical Report UCB/CSD-98-1016, September 1998) discusses some of these. As an example, it explains how HTTP Version 1.0 (still in wide use today) requires a client to send one request at a time over a given connection, waiting for the response to arrive completely before continuing, at considerable performance cost (as the dissertation demonstrates).
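The difference between this serial behaviour and the pipelining remedy later defined in HTTP/1.1 can be sketched as follows: in the first case each request waits for the preceding response to finish (here, each even uses its own connection, as HTTP/1.0 clients commonly did); in the second, several requests are written back to back on one connection before any response is read. The host and paths are placeholders, response parsing is omitted, and, as discussed in the following paragraph, many deployed clients do not in fact pipeline.

    import socket

    host = "www.example.com"              # placeholder host
    paths = ["/a.html", "/b.html"]        # placeholder resources

    # HTTP/1.0 style: each request waits for its complete response (and here
    # each uses a fresh connection) before the next request is sent.
    for path in paths:
        s = socket.create_connection((host, 80))
        s.sendall(("GET %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (path, host)).encode())
        while s.recv(4096):
            pass                          # drain the whole response
        s.close()

    # HTTP/1.1 pipelining: both requests are written back to back on a single
    # connection before any response is read; the server returns the responses
    # in the same order. (Separating the replies would require parsing the
    # Content-Length or chunked framing, which is beside the point here.)
    s = socket.create_connection((host, 80))
    s.sendall(("GET %s HTTP/1.1\r\nHost: %s\r\n\r\n" % (paths[0], host)).encode())
    s.sendall(("GET %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n"
               % (paths[1], host)).encode())
    while s.recv(4096):
        pass                              # both responses arrive on this stream
    s.close()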
To appreciate the invention, it is important to see that even when problems are noticed and improved protocols are developed, it may take a long time before such improvements become widely deployed. The delays are particularly long before they reach the public Internet infrastructure. Of course, costs are always large when significant software upgrades are involved. But in the case of protocol upgrades, delays are even longer because no one party can upgrade unilaterally: both ends of a network conversation must be using the same version of the same protocol. In the case of the Internet, a protocol change involves some sort of community-wide coordinated update. For example, the problem with HTTP Version 1.0 cited above from Padmanabhan's dissertation was corrected in the next version of the protocol, where a feature known as “pipelining” was defined, yet even several years after this was first suggested very few web browsers have adopted it. Similarly, all proposals to replace TCP have languished, and today all major web browsers and web servers support HTTP over TCP only.