The traffic on the World Wide Web is increasing exponentially, especially at popular (hot) sites. In addition to growing the capacity of hot sites by clustering nodes at each site, additional nodes at geographically distributed sites are often added. Adding geographically distributed sites can provide both added capacity and disaster recovery.
The nodes are replicated and made to appear as one entity to clients, so that the added capacity provided by the set of sites is transparent to clients. These replicated sites often include heterogeneous servers with different capacities. An arbiter can be provided that assigns clients to sites.
In order to support a load that increases close to linearly with total capacity of the set of sites, it is important that the client load be balanced among the sites. Thus, there is a need for methods for balancing the load among the sites.
Previous work on load balancing in a multiprocessor or multiple node environment, such as the IBM S/390 Sysplex, primarily focuses on scheduling algorithms which select one of multiple generic resources for each incoming task or user session. The scheduler controls the scheduling of every incoming task or session and there is no caching of the resource selection.
One method in the art for balancing the load among geographically distributed replicated sites is known as the Round-Robin Domain Name Server (RR-DNS) approach. The basic domain name server method is described in the paper by Mockapetris, P., entitled "Domain Names--Implementation and Specification," RFC 1035, USC Information Sciences Institute, November 1987. Load balancing support using DNS is also described in the paper by Brisco, T., "DNS Support for Load Balancing," RFC 1794, Rutgers University, April 1995. In the paper by Katz, E., Butler, M., and McGrath, R., entitled "A Scalable HTTP Server: The NCSA Prototype", Computer Networks and ISDN Systems, Vol. 27, 1994, pp. 68-74, the RR-DNS method is used to balance the load across a set of web server nodes. Here, the set of distributed sites is represented by one URL (e.g., www.hotsite.com); a cluster sub-domain for this distributed site is defined with its sub-domain name server. The sub-domain name server maps client name resolution requests to different IP addresses in the distributed cluster. Thus, subsets of the clients will be assigned to each of the replicated sites.
In order to reduce network traffic, a mapping request is not issued for each service request. Instead, the result of the mapping request is saved for a "time-to-live" (TTL) interval. Subsequent requests issued during the TTL interval will follow the result of the previous mapping and hence be routed to the same server node.
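The interaction between round-robin name resolution and TTL caching can be illustrated with a minimal Python sketch. The class names, the addresses, and the 60-second TTL below are hypothetical and chosen only for illustration; the sketch models a single caching gateway and is not an implementation of any actual name server.

```python
from itertools import cycle


class RoundRobinDNS:
    """Hypothetical sub-domain name server that rotates through site addresses."""

    def __init__(self, addresses, ttl):
        self._rotation = cycle(addresses)
        self.ttl = ttl  # seconds for which a mapping may be cached

    def resolve(self):
        # Each mapping request receives the next address in the rotation.
        return next(self._rotation), self.ttl


class CachingGateway:
    """Hypothetical gateway that reuses a mapping until its TTL expires."""

    def __init__(self, dns):
        self.dns = dns
        self._address = None
        self._expires_at = 0.0

    def lookup(self, now):
        # Issue a new mapping request only when no unexpired mapping is cached.
        if self._address is None or now >= self._expires_at:
            self._address, ttl = self.dns.resolve()
            self._expires_at = now + ttl
        return self._address


dns = RoundRobinDNS(["10.0.0.1", "10.0.0.2", "10.0.0.3"], ttl=60)
gateway = CachingGateway(dns)

# Requests at times 0, 10 and 59 share one mapping; a new one is fetched at 60.
trace = [gateway.lookup(now) for now in (0, 10, 59, 60, 119, 120)]
print(trace)
```

In this sketch every request arriving within one TTL interval is routed to the same address, regardless of how many requests that is.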
A problem with the RR-DNS method is that poor load balance among the distributed sites may result, as described in the paper, Dias, D. M., Kish, W., Mukherjee, R., and Tewari, R., "A Scalable and Highly Available Web Server", Proc. 41st IEEE Computer Society Intl. Conf. (COMPCON) 1996, Technologies for the Information Superhighway, pp. 85-92, February 1996. The problem is caused by caching of the association between name and IP address at various gateways, fire-walls, and domain name-servers in the network. Thus, for the TTL period all new client requests routed through these gateways, fire-walls, and domain name-servers will be assigned to the single site stored in the cache. Those skilled in the art will realize that a simple reduction in the TTL value will not solve the problem. In fact, low TTL values are not accepted by many name servers. More importantly, a simple reduction of TTL value may not reduce a load skew caused by unevenly distributed client request rates.
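The effect of unevenly distributed request rates can be illustrated with a toy Python model. It assumes, purely for simplicity, that all gateways renew their cached mappings at the same instant every TTL interval and that mappings are handed out in strict rotation; the function name, the gateway rates, and the site labels are hypothetical.

```python
from itertools import cycle


def peak_site_share(gateway_rates, sites, ttl, duration):
    """Largest fraction of the total request rate that any one site
    receives during any TTL interval (toy model, see assumptions above)."""
    rotation = cycle(sites)
    total_rate = sum(gateway_rates)
    worst = 0.0
    for _ in range(0, duration, ttl):
        # At the start of each interval every gateway is mapped to the
        # next site in the rotation and keeps that mapping for the interval.
        load = dict.fromkeys(sites, 0)
        for rate in gateway_rates:
            load[next(rotation)] += rate
        worst = max(worst, max(load.values()) / total_rate)
    return worst


rates = [100, 1, 1, 1]  # one hot gateway and three quiet ones (requests/second)
sites = ["A", "B", "C"]

share_long_ttl = peak_site_share(rates, sites, ttl=60, duration=600)
share_short_ttl = peak_site_share(rates, sites, ttl=5, duration=600)
print(share_long_ttl, share_short_ttl)
```

Under these assumptions the site holding the hot gateway carries 101 of the 103 requests per second in every interval, whether the TTL is 60 seconds or 5 seconds, whereas an even split would give each site about one third. Shortening the TTL changes how often the hot gateway moves, not how much load it brings with it.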
One method of load balancing within a local cluster of nodes is to use a so-called TCP router as described in: "A Virtual Multi-Processor Implemented by an Encapsulated Cluster of Loosely Coupled Computers," by Attanasio, Clement R. and Smith, Stephen E., IBM Research Report RC 18442, 1992; and U.S. Pat. No. 5,371,852, entitled "Method and Apparatus for Making a Cluster of Computers Appear as a Single Host", issued Dec. 6, 1994, which is hereby incorporated by reference in its entirety. Here, only the address of the TCP router is given out to clients; the TCP router distributes incoming requests among the nodes in the cluster, either in a round-robin manner, or based on the load on the nodes. The TCP router method as described in these references only applies to a local cluster of nodes.
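The two dispatch policies mentioned above, round-robin and load-based, can be sketched in Python as follows. This is an illustrative sketch only and does not reproduce the method of the cited report or patent; the class name, the policy names, and the use of the active-connection count as the load measure are assumptions.

```python
from itertools import cycle


class TcpRouter:
    """Hypothetical front end that hides a local cluster behind one address."""

    def __init__(self, nodes, policy="round_robin"):
        self.active = {node: 0 for node in nodes}  # open connections per node
        self._rotation = cycle(nodes)
        self.policy = policy

    def dispatch(self):
        if self.policy == "round_robin":
            node = next(self._rotation)
        else:
            # Load-based: pick the node with the fewest open connections
            # (ties go to the node listed first).
            node = min(self.active, key=self.active.get)
        self.active[node] += 1
        return node

    def finish(self, node):
        # Called when a connection to the given node closes.
        self.active[node] -= 1


rr = TcpRouter(["n1", "n2", "n3"])
rr_order = [rr.dispatch() for _ in range(4)]

ll = TcpRouter(["n1", "n2", "n3"], policy="least_loaded")
first_three = [ll.dispatch() for _ in range(3)]
ll.finish("n2")          # n2 becomes the least loaded node
next_node = ll.dispatch()
print(rr_order, first_three, next_node)
```

Because every request passes through the router, no mapping is cached by intermediaries; the router must, however, sit in front of the nodes, which is why the approach is limited to a local cluster.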
More recently, in the paper by Colajanni, M., Yu, P., and Dias, D., "Scheduling Algorithms for Distributed Web Servers," IBM Research Report, RC 20680, January 1997, which is hereby incorporated by reference in its entirety, a multi-tier round robin method is proposed to divide the gateways into multiple tiers based on their request rates. Requests from each tier are scheduled separately using a round robin algorithm. This method can also handle a homogeneous distributed server architecture.
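The idea of scheduling each tier separately can be sketched in Python as follows. The sketch is a simplified illustration under stated assumptions and does not reproduce the algorithms of the cited report: the class name, the single request-rate threshold of 50, and the site labels are hypothetical, and each gateway's request rate is assumed to be known to the scheduler.

```python
from itertools import cycle


class MultiTierRoundRobin:
    """Hypothetical scheduler keeping an independent rotation per tier."""

    def __init__(self, sites, tier_thresholds):
        # tier_thresholds: request-rate boundaries separating the tiers.
        self.thresholds = sorted(tier_thresholds)
        self._rotations = [cycle(sites) for _ in range(len(self.thresholds) + 1)]

    def tier_of(self, request_rate):
        # Tier index equals the number of thresholds the rate meets or exceeds.
        return sum(request_rate >= t for t in self.thresholds)

    def assign(self, request_rate):
        # Advance only the rotation belonging to the gateway's tier.
        return next(self._rotations[self.tier_of(request_rate)])


scheduler = MultiTierRoundRobin(["A", "B", "C"], tier_thresholds=[50])

# Mapping requests arrive from gateways with these request rates, in order.
gateway_rates = [100, 1, 120, 2, 90, 3]
assignments = [scheduler.assign(rate) for rate in gateway_rates]
print(assignments)
```

With a single rotation, the three hot gateways in this sequence (rates 100, 120 and 90) would be mapped to sites A, C and B only by coincidence of arrival order; with per-tier rotations they are spread over A, B and C irrespective of how the quiet gateways are interleaved.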
In either case, the aforementioned load imbalance, which can result from the caching of the association between name and IP address at various gateways, fire-walls, and DNSs in the network, remains, since the number of data requests following an address request, independent of its origin, is proportional to the TTL value. Thus, there is a need for improved methods of load balancing among distributed or clustered sites which may include heterogeneous servers. The present invention addresses such a need.