The traffic on the World Wide Web is increasing exponentially, especially at popular (hot) sites. In addition to growing the capacity of a hot site by clustering nodes at that site, additional, geographically distributed (replicated) sites are often added. Adding geographically distributed sites can provide both added capacity and disaster recovery. The set of geographically distributed, replicated sites is made to appear as one entity to clients, so that the added capacity provided by the set of sites is transparent to clients. This can be provided by an arbiter that assigns clients to sites. In order to support a load that increases close to linearly with the total capacity of the set of sites, it is important that the client load be balanced among the sites. Thus there is a need for methods used by the arbiter for balancing the load among the sites.
One known method in the art that attempts to balance the load among such geographically distributed replicated sites is known as the Round-Robin Domain Name Server (RR-DNS) approach. The basic domain name server (DNS) method is described in the paper by Mockapetris, P., entitled "Domain Names--Implementation and Specification", RFC 1035, USC Information Sciences Institute, November 1987. In the paper by Katz, E., Butler, M., and McGrath, R., entitled "A Scalable HTTP Server: The NCSA Prototype", Computer Networks and ISDN Systems, Vol. 27, 1994, pp. 68-74, round-robin DNS (RR-DNS) is used to balance the load across a set of web server nodes. In this approach, the set of distributed sites is represented by one URL (e.g. www.hotsite.com); a cluster subdomain for this distributed site is defined with its subdomain name server. This subdomain name server maps client name resolution requests to different IP addresses in the distributed cluster. In this way, subsets of the clients will be pointed to each of the geographically distributed sites. Load balancing support using DNS is also described in the paper by Brisco, T., "DNS Support for Load Balancing", RFC 1794, Rutgers University, April 1995.
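The rotation performed by the subdomain name server can be illustrated with a minimal sketch. It is not taken from the cited papers: the class name `RoundRobinResolver` and the three IP addresses (drawn from the reserved documentation ranges) are assumptions made purely for illustration, and a real name server would answer DNS protocol queries rather than Python method calls.

```python
from itertools import cycle


class RoundRobinResolver:
    """Toy stand-in for a subdomain name server (hypothetical class name).

    Each resolution request for the cluster name is answered with the next
    site address in a fixed rotation.
    """

    def __init__(self, name, addresses):
        self.name = name
        self._rotation = cycle(addresses)  # endless round-robin iterator

    def resolve(self, query_name):
        if query_name != self.name:
            raise KeyError(query_name)
        return next(self._rotation)


# Illustrative addresses only; one per geographically distributed site.
resolver = RoundRobinResolver(
    "www.hotsite.com",
    ["192.0.2.10", "198.51.100.10", "203.0.113.10"],
)

# Six successive resolution requests cycle through the three sites twice.
answers = [resolver.resolve("www.hotsite.com") for _ in range(6)]
print(answers)
```

If every client resolved the name directly against this server, each site would receive an equal share of the resolution requests.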
A key problem with RR-DNS is that it may lead to poor load balance among the distributed sites. See, for example, Dias, D. M., Kish, W., Mukherjee, R., and Tewari, R., "A Scalable and Highly Available Web Server", Proc. 41st IEEE Computer Society Intl. Conf. (COMPCON) 1996, Technologies for the Information Superhighway, pp. 85-92, February 1996. The problem is due to caching of the association between name and IP address at various name servers in the network. Thus, for example, for a period of time (the time-to-live) all new clients behind an intermediate name server in the network will be pointed to just one of the sites.
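The effect of caching can be made concrete with a small, deterministic simulation. This is only a sketch under stated assumptions: the function `simulate`, the site labels, the arrival counts, and the notion of a discrete "tick" are all hypothetical and are not drawn from the cited study. Each intermediate name server caches the answer it receives for a fixed time-to-live, so every client arriving behind it during that period is sent to the same site.

```python
from collections import Counter
from itertools import cycle

SITES = ["site-A", "site-B", "site-C"]  # hypothetical site labels


def simulate(clients_per_gateway, ttl_ticks, total_ticks):
    """Count clients assigned to each site when answers are cached.

    clients_per_gateway[i] is the number of new clients arriving per tick
    behind intermediate name server i (an assumed, fixed arrival rate).
    """
    rotation = cycle(SITES)  # authoritative round-robin rotation
    cache = {}               # gateway index -> (site, expiry tick)
    load = Counter()
    for tick in range(total_ticks):
        for gw, arrivals in enumerate(clients_per_gateway):
            site, expires = cache.get(gw, (None, 0))
            if site is None or tick >= expires:
                # Cache miss or expired entry: ask the authoritative server.
                site = next(rotation)
                cache[gw] = (site, tick + ttl_ticks)
            load[site] += arrivals
    return load


# One large intermediate name server and two small ones, over one TTL period.
load = simulate(clients_per_gateway=[100, 5, 5], ttl_ticks=10, total_ticks=10)
print(load)
```

In this run the authoritative server hands out each site exactly once, yet site-A receives 1000 clients while the other two receive 50 each, because the rotation balances resolution requests rather than the clients behind each cache.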
One known method to solve this problem within a local cluster of nodes, i.e., at a single site, uses a so-called TCP router as described in: Attanasio, Clement R. and Smith, Stephen E., "A Virtual Multi-Processor Implemented by an Encapsulated Cluster of Loosely Coupled Computers", IBM Research Report RC 18442, 1992; see also U.S. Pat. No. 5,371,852, issued Dec. 6, 1994, by Attanasio et al., entitled "Method and Apparatus for Making a Cluster of Computers Appear as a Single Host," which are hereby incorporated by reference in their entirety. Here, only the address of the TCP router is given out to clients; the TCP router distributes incoming requests among the nodes in the cluster, either in a round-robin manner, or based on the load on the nodes. As noted, the TCP router method as described in these references only applies to a local cluster of nodes. More specifically, the TCP router can act as a proxy, where the requests are sent to a selected node, and the responses go back to the TCP router and then to the client. This proxy mode of operation can lead to the router becoming a bottleneck. Also, because of the extra network hops, both for incoming and response packets, it is not suitable for a geographically distributed environment. In another mode of operation, which we will refer to as the forwarding mode, client requests are sent to a selected node, and the responses are sent back to the client directly from the selected node, bypassing the router. In many environments, such as the World Wide Web (WWW), the response packets are typically much larger than the incoming packets from the client; bypassing the router on this response path is thus critical. However, the TCP router method in forwarding mode only applies to a cluster of nodes that are connected directly to the router by a LAN or a switch, i.e., the nodes in the multi-node cluster cannot be geographically remote, or even on a different sub-net. The reason is that the forwarding mode relies on lower-level physical routing methods to deliver requests to the selected node.
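The two distribution policies mentioned above, round-robin and load-based, can be sketched as follows. This is an illustrative model only, not the implementation in the cited report or patent: the class `Dispatcher`, the policy names, and the use of the count of active connections as the load measure are assumptions, and the sketch covers only node selection, not packet forwarding.

```python
from itertools import cycle


class Dispatcher:
    """Toy node selector for a router in front of a local cluster.

    policy "round_robin": rotate through the nodes in order.
    policy "least_loaded": pick the node with the fewest active connections
    (an assumed load measure; ties go to the earliest node in the list).
    """

    def __init__(self, nodes, policy="round_robin"):
        self.nodes = list(nodes)
        self.policy = policy
        self._rotation = cycle(self.nodes)
        self.active = {node: 0 for node in self.nodes}

    def select(self):
        """Choose a node for a new incoming request and record the connection."""
        if self.policy == "round_robin":
            node = next(self._rotation)
        elif self.policy == "least_loaded":
            node = min(self.nodes, key=lambda n: self.active[n])
        else:
            raise ValueError(f"unknown policy: {self.policy}")
        self.active[node] += 1
        return node

    def finish(self, node):
        """Record that a connection to the given node has closed."""
        self.active[node] -= 1


rr = Dispatcher(["n1", "n2", "n3"])
rr_choices = [rr.select() for _ in range(4)]

ll = Dispatcher(["n1", "n2", "n3"], policy="least_loaded")
first_three = [ll.select() for _ in range(3)]
ll.finish("n2")          # n2 becomes the least loaded node
next_choice = ll.select()
print(rr_choices, first_three, next_choice)
```

In either mode the selection step is the same; the modes differ in whether the response returns through the router (proxy) or goes directly from the selected node to the client (forwarding).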
Thus there is a need to provide a method for better load balancing among geographically distributed sites.