1. Field of the Invention
The present invention relates to services of communications networks and, more specifically, to a system and method for increasing the availability of services offered by a service provider of a communications network.
2. Background Information
It is increasingly common for users having standalone computers, or computers interconnected by an institutional intranet or local area network, to gain access to various remote sites (such as those on the “World Wide Web”) via the well-known Internet communications network. Using resident web browser applications executing on the computers, these clients may navigate among services (“pages”) stored on various servers of a service provider (“web site”) and may further request these services as desired. In a basic network communication arrangement, clients are free to access any remote web site for which uniform resource locators (URLs) are available.
It is also increasingly common in network applications to provide the web site servers with associated proxy cache servers that link (“front-end”) the servers with the Internet. A proxy cache server (“proxy”) may be used to accelerate client access to the Internet (“forward proxy”), to accelerate Internet access to a web server (“reverse proxy”), or to accelerate Internet access transparently to either client access or web server access (“transparent proxy”). As for the latter reverse proxy environment, the proxy may access frequently requested services from the web servers and store (“host”) them locally to effectively speed-up access to future requests for the services. For instance, a proxy may host frequently requested web pages of a web site. In response to a request from a browser executing on a client, the proxy attempts to fulfill that request from its local storage. If it cannot, the proxy forwards the request to a web site server that can satisfy the request. The web server then responds by transferring a stream of information to the is proxy, which stores and forwards the information over the Internet onto the client. The illustrative embodiment of the invention described herein is applicable to a proxy environment
As Internet traffic to the web site increases, the network infrastructure of the service provider may become strained attempting to keep up with the increased traffic. In order to satisfy such demand, the service provider may increase the number of network addresses for a particular service by providing additional web servers and/or associated proxies. These network addresses are typically Transmission Control Protocol/Internet Protocol (TCP/IP) addresses that are represented by URLs or wordtext (domain) names and that are published in a directory service, such as the well-known Domain Name System (DNS). Computers referred to as name servers implement DNS by mapping between the domain names and TCP/IP address(es).
In the case of a “reverse proxy,” the proxies “front-end” the web servers (and may, in fact, be resident on the web servers) and the network addresses of the proxies (rather than the actual web site) are generally mapped to the domain name of the service provider. As a result, communication exchanges with the proxies generally comprise IP packets or UDP/TCP-socketed traffic, such as socket requests and responses. A socket is essentially an interface between an application layer and transport layer of a protocol stack that enables the transport layer to identify which application it must communicate with in the application layer. For example, a socket interfaces to a TCP/IP protocol stack via a set of application programming interfaces (API) consisting of a plurality of entry points into that stack. Applications that require TCP/IP connectivity typically utilize the socket API to interface into the TCP/IP stack.
For a connection-oriented protocol such as TCP, the socket may be considered a session; however, for a connectionless protocol such as IP datagram using the User Datagram Protocol (UDP), the socket is an entity/handle that the networking software (protocol stack) uses to uniquely identify an application layer end point, typically through the use of port numbers. The software entity within the server that manages the communication exchanges is a TCP/IP process, which is schematically illustrated as layers of a typical Internet communications protocol stack. Protocol stacks and the TCP/IP reference model are well-known and are, for example, described in Computer Networks by Andrew S. Tanenbaum, printed by Prentice Hall PTR, Upper Saddle River, N.J., 1996.
The popularity of Internet caching, due to the increased efficiencies it offers, has led to the construction of caching architectures that link multiple proxy cache servers in a single site. The servers may service an even larger cluster of content servers. The proxy cache servers are sometimes accessed via a Layer 4 (LA) switch that is tied to the site's edge router. The L4 switch is generally responsible for load-balancing functions across the cache. By load-balancing it is meant that the switch assigns requests to various caches based upon a mechanism that attempts to balance the usage of the caches so that no single cache is over-utilized while taking into account any connection context associated with the client requesting the dataflow.
Requests by users to various sites for large files have grown significantly. One particular class of often-requested files is feature-length movies stored using MPEG-2 compression or another standard. These files may easily exceed 1 gigabyte of storage. Moreover, certain movies in a site (highly popular releases) may be accessed continuously in a given period. Problems specific to the vending of large files have been observed. Conventional caches have become severely congested because of the time required to vend a file to a remote user/client. Part of this problem results from the low bandwidth of many clients (often using slower connections) that tends to tie up the server for hours on the vending of a single large file, and the fact that certain files are being vended much more often than others. Specifically, when a large proportion of the cache is being devoted to large files (over 50 Kbytes) and many requests are being made for the same file, the cache structure will become quickly filled with these large files. The problem is compounded by the load-balancing function of an L4 switch because all caches tend to be populated with the same large files. This loading of all caches occurs, in part, because the L4 switch is designed to provide load-balancing across all caches in the cluster. Accordingly, the behavior of the switch causes further copies of the file across each of the existing proxies. Since a large number of requests for a given file may be present at any one time, the L4 switch will naturally fill all caches with copies of the file generally without regard to how large the cache cluster (number of servers) is made. While it may be possible to cache a single copy to a designated cache, and vend it to clients all at once (via multicast techniques, and the like) within a given time frame, this may involve unacceptable delays for certain clients. The result of the filling of all caches for a long period of time is that other content is delayed due to the resulting congestion.
One solution to the problem of congestion is to address-partition the storage of files in the cache. L4 switches typically include functions for selectively routing requests to a particular cache without regard to the load-balancing imperative based upon a specific cache location for the file. This would ensure that other caches remain free of the over-requested file. However, this may involve the use of internal hashing functions with respect to the requested file's URL, that ties up processor time, thus delaying the switch's load-balancing operations. As caching arrangements become increasingly efficient, this often undesirably slows the system. In addition, L4 switches often cannot distinguish between large and small files. Where files are small (example 50 Kbytes or less) it is generally more efficient to allow load-balancing unmodified by the invention to occur, regardless of the number of request for a given file.
It is an object of the present invention to provide a technique for address-partitioning a proxy cache cluster and associated proxy partition cache (PPC) that enables address partitioning at the proxy cache at the cache situs without an external load-balancing mechanism, thereby freeing the L4 switch from any additional address partitioning responsibilities. The PPC architecture should thereby relieve congestion, and overfilling of the caches with duplicate copies of large files.