The Internet, and in particular, the World Wide Web (WWW or web), is becoming an integral part of modern life. Unfortunately, the growth of the web places ever-increasing demands on the network backbone and other facilities that form the web. Web traffic has been growing at a much faster pace than available bandwidth, often causing substantial latency between user request for content and user receipt of that content. In many cases, this latency results from network congestion caused by numerous requests for transmission of the same content. Such activities can overload (and in some cases, disable) web servers and other network facilities. At a minimum, multiple requests for the same material from a web server increase delays experienced by web users.
Web caching offers potential relief to overloaded networks. As is known in the art, web caching is a technique of storing popular web content at, and providing that stored content to end users from, locations in addition to the web servers that initially provide that content. By making copies of web pages and other content available from alternate locations, the load upon the origin servers that initially provide the content is reduced, substantially reducing latency. Web caching also helps transfer load from the Internet backbone to smaller networks. By storing frequently requested web content at one or more web cache servers located at network edge(s), future local requests for that content can be served from the web cache(s) instead of repeatedly obtaining content from the origin servers. This reduces Internet traffic, and may also reduce load upon Wide Area Networks (WANs) and other networks that are linked by (or to) the Internet. Load on origin web servers is reduced because those origin servers service fewer requests.
Web caches may be deployed in numerous and varied configurations. FIGS. 1 and 2 represent only a few examples. Both FIG. 1 and FIG. 2 illustrate deployment scenarios in which the existence of the web cache is not apparent to the end user/client. It is possible that no manual or automatic configuration of client web browser software is needed to access the web cache (although the web cache may only serve users within a specific network), and the user may perceive no difference between content requests served by a web cache and content requests served by an origin server. FIG. 1 illustrates a typical web cache deployed at a network edge. In this scenario, clients on a local network send HTTP (Hypertext Transfer Protocol) requests to origin servers on the Internet. These requests may be forwarded by a local network router within the local network to a switch. That switch may have Layer 4 (transport layer) or Layer 7 (application layer) capability, and thus be able to identify HTTP traffic.
For example, a Layer 4 switch might identify HTTP traffic by checking the TCP (Transmission Control Protocol) port number of incoming IP (Internet Protocol) packets. If the destination port number is 80 (default HTTP server port number), the packet is forwarded to the cache. Otherwise, the packet could be forwarded to the WAN Router. The cache then intercepts the TCP connection from the client and obtains the URL (Uniform Resource Locator) for the desired web pages or other content. A Layer 7 switch (also known as a content switch or web switch) may replace the Layer 4 switch to provide additional functionality. For example, TCP connections from clients may be intercepted by a Layer 7 switch instead of the cache, and the Layer 7 switch might make routing decisions based on the URL. In either event, a switch identifies HTTP traffic and forwards that traffic to the cache. If the content requested by the client is stored in the cache, that content is provided to the client from the cache. Otherwise, the cache fetches the content from an origin server or other location, and serves the content to the requesting client.
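The forwarding decision and the cache hit/miss behavior described above can be sketched as follows. This is a minimal illustration and not an implementation of any particular switch or cache: the names `Packet`, `layer4_forward`, `WebCache`, and `fetch_from_origin` are hypothetical, and the step of keeping a copy of fetched content after a miss is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

HTTP_PORT = 80  # default HTTP server port number


@dataclass
class Packet:
    """Hypothetical, simplified view of an incoming IP packet."""
    dst_port: int  # TCP destination port


def layer4_forward(packet: Packet) -> str:
    """Return the next hop: the cache for HTTP traffic, else the WAN router."""
    return "cache" if packet.dst_port == HTTP_PORT else "wan_router"


@dataclass
class WebCache:
    """Hypothetical cache keyed by URL."""
    fetch_from_origin: Callable[[str], bytes]  # assumed origin-fetch callback
    store: Dict[str, bytes] = field(default_factory=dict)

    def serve(self, url: str) -> bytes:
        # Hit: the content is provided from the cache.
        if url in self.store:
            return self.store[url]
        # Miss: fetch from the origin server and serve the result.
        body = self.fetch_from_origin(url)
        # Assumption for this sketch: keep a copy for future requests.
        self.store[url] = body
        return body
```

With this sketch, a second request for the same URL is answered from `store` without calling `fetch_from_origin` again.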
FIG. 2 illustrates a typical reverse proxy scenario where web caches are used to relieve the load upon web servers. Incoming requests are intercepted by a Layer 7 switch. Based on how the reverse proxy is configured, either a cache or server is selected to serve the request. For example, frequently changing content may generally be served by a web server, and relatively unchanging content served by a web cache. Because the cost of a web cache is typically much lower than the cost of a web server, deploying web caches to serve popular static content provides an economic and scalable server farm solution.
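One way a reverse proxy configuration might divide requests between a cache and a server is sketched below. The rule used here (file suffix and presence of a query string) is purely an assumption for illustration; the function name `select_backend` and the suffix list are hypothetical, and an actual Layer 7 switch may use entirely different criteria.

```python
# Assumed set of suffixes treated as relatively unchanging content.
STATIC_SUFFIXES = (".html", ".css", ".js", ".png", ".jpg", ".gif")


def select_backend(url_path: str) -> str:
    """Pick 'web_cache' for static-looking paths, else 'web_server'."""
    # Assumption: a query string indicates frequently changing content.
    if "?" in url_path:
        return "web_server"
    if url_path.lower().endswith(STATIC_SUFFIXES):
        return "web_cache"
    return "web_server"
```

For example, `select_backend("/images/logo.png")` selects the cache, while `select_backend("/cart?item=3")` selects the server.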
In both scenarios shown by FIGS. 1 and 2, as well as in other scenarios, web caching improves user experience and relieves load on origin servers. If deployed at a network edge, web caching can also provide substantial cost savings in terms of backbone bandwidth. Other aspects of web caching may undercut these benefits, however. In a steady state, a web cache optimally operates at full (or near-full) storage capacity. Accordingly, before a new object may be stored in the cache, one or more old objects must be evicted from the cache. Various cache replacement policies have been developed to optimize the eviction process with respect to objectives such as maximizing Hit Ratio (the ratio of requests served by the cache to all requests received by the cache) or minimizing user-perceived latency.
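The eviction step and the Hit Ratio measurement can be illustrated with a small sketch. Least-recently-used (LRU) eviction is chosen here only as one well-known replacement policy; the text above does not single out any policy. The class name `SizeAwareLRUCache` is hypothetical, and objects are tracked by size in bytes because web objects are not fixed-size blocks.

```python
from collections import OrderedDict


class SizeAwareLRUCache:
    """Illustrative LRU cache that tracks object sizes and Hit Ratio."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.objects = OrderedDict()  # url -> size in bytes, oldest first
        self.hits = 0
        self.requests = 0

    def request(self, url: str, size_bytes: int) -> bool:
        """Record a request; return True on a hit, False on a miss."""
        self.requests += 1
        if url in self.objects:
            self.hits += 1
            self.objects.move_to_end(url)  # mark as most recently used
            return True
        # Objects larger than the whole cache are simply not stored.
        if size_bytes <= self.capacity_bytes:
            # Evict old objects until the new object fits.
            while self.used_bytes + size_bytes > self.capacity_bytes:
                _, evicted_size = self.objects.popitem(last=False)
                self.used_bytes -= evicted_size
            self.objects[url] = size_bytes
            self.used_bytes += size_bytes
        return False

    @property
    def hit_ratio(self) -> float:
        """Requests served by the cache divided by all requests received."""
        return self.hits / self.requests if self.requests else 0.0
```

Note that storing one large object may require evicting several smaller ones, which is one reason replacement policies for web caches differ from those for fixed-size block caches.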
However, web caching has unique characteristics that must be addressed. Unlike caching in a memory hierarchy using fixed-size blocks, web caching must accommodate web objects of widely varying size. Moreover, an overloaded or improperly configured web cache may itself become a network bottleneck and increase latency rather than decrease latency. Typically, web caches store actual content in hard disk drives or in other storage devices that have relatively slow-moving mechanical parts. These devices support a relatively limited number of operations per second; these operations include storing new objects as well as accessing stored objects. In other words, time spent storing new objects is generally at the expense of time that might be used to access previously stored objects. Unless the number of disk (or other device) I/O operations is controlled in some manner, the throughput of the cache is not optimized.
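The trade-off between storing new objects and accessing stored objects can be made concrete with a rough calculation. The model below is a deliberate simplification and rests on assumptions not stated above: every hit costs exactly one disk operation, every stored miss costs exactly one disk operation, and misses that are not stored cost none. The function name `max_request_rate` and its parameters are hypothetical.

```python
def max_request_rate(disk_ops_per_sec: float,
                     hit_ratio: float,
                     store_fraction: float) -> float:
    """Estimate sustainable requests per second under a disk-operation budget.

    Assumed model: each request uses hit_ratio read operations plus
    (1 - hit_ratio) * store_fraction write operations on average.
    """
    ops_per_request = hit_ratio + (1.0 - hit_ratio) * store_fraction
    if ops_per_request == 0:
        return float("inf")  # no disk operations needed in this model
    return disk_ops_per_sec / ops_per_request
```

Under these assumptions, a device limited to 100 operations per second with a Hit Ratio of 0.5 sustains 100 requests per second if every missed object is stored, and 200 requests per second if none is stored, which shows how writes compete with reads for the same budget.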
To date, there have been limited solutions to these problems. As one example, a Layer 7 switch can be deployed as in FIG. 1, and configured to bypass the cache when the cache becomes overloaded. This approach increases traffic on the network backbone and does not address the underlying cause of cache overload. Multiple hard drives (or even multiple caches) can be deployed in parallel so as to improve total cache throughput, but this solution requires increased hardware investment.
Accordingly, there remains a need for improved methods and systems of managing web cache storage.