Due to the rapid growth of the World Wide Web, many web sites are experiencing overload conditions, which lead to very slow response times for servicing incoming user requests. Increased network congestion amplifies this performance degradation as seen by the clients. To accommodate the increasing request load, many web sites have been implemented in the form of a cluster or "pool" of replicated web servers. The network congestion problem is often alleviated by geographically distributing the replicated servers of the server pool. FIG. 1 shows a distributed server computer system 10 in which a number of clients 12-i communicate with servers 14-j over a communication network 16 such as the Internet. The servers 14-j in the system 10 represent a pool of replicated servers corresponding to a particular web site. In this illustration, the server pool includes five distinct servers 14-1 through 14-5 arranged as shown. The servers 14-1, 14-2 and 14-3 form a geographically co-located group 18, while the servers 14-4 and 14-5 are geographically distributed relative to each other and to the group 18. This type of distributed server architecture can result in increased service availability in times of high network congestion, and can improve performance by taking advantage of the proximity of clients to particular servers in the pool.
A system with a pool of replicated servers such as that shown in FIG. 1 generally requires a mechanism for dispatching each incoming client request to an appropriate server in the pool. Many current web server systems implement a server-side dispatching mechanism that requires modification of the web server code or of the domain name system (DNS), or even specialized server-side hardware. FIG. 2 illustrates a typical server-side solution which may be implemented in the system 10. Incoming requests from the clients 12-i are supplied via the network 16 to a dispatching mechanism 20 that resides at the site of the servers 14-1, 14-2 and 14-3 in the geographically co-located group 18. The dispatching mechanism 20 routes each request to one of the servers in group 18 in accordance with a technique which attempts to provide an optimal distribution of the request load across the servers.
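As a rough illustration of the server-side dispatching of FIG. 2, the following Python sketch routes each incoming request to the server in group 18 that currently has the fewest outstanding requests. The least-outstanding-requests policy, the class name `Dispatcher`, and its methods are assumptions made for illustration only; the description above does not specify which distribution technique a conventional dispatching mechanism 20 uses.

```python
class Dispatcher:
    """Hypothetical stand-in for dispatching mechanism 20 of FIG. 2."""

    def __init__(self, servers):
        # Outstanding-request count per server, kept in the order given.
        self.active = {name: 0 for name in servers}

    def dispatch(self):
        """Pick the server with the fewest outstanding requests.

        Ties go to the server listed first, because min() returns the
        first minimum it encounters.
        """
        target = min(self.active, key=self.active.get)
        self.active[target] += 1
        return target

    def complete(self, server):
        """Record that a request on `server` has finished."""
        if self.active[server] > 0:
            self.active[server] -= 1


# Servers of the co-located group 18.
dispatcher = Dispatcher(["14-1", "14-2", "14-3"])
first_four = [dispatcher.dispatch() for _ in range(4)]
```

In this sketch every request, regardless of the number of servers in the pool, passes through a call to `dispatch` on the one `Dispatcher` object.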
A significant problem with server-side dispatching of the type illustrated in FIG. 2 is that the dispatching mechanism itself can become a performance bottleneck as the client request load increases. Such a bottleneck cannot be addressed by simply increasing the number of replicated servers in the pool. Another significant problem is that conventional server-side techniques do not provide optimal performance in applications in which the servers are geographically distributed. Unfortunately, conventional client-side techniques have also been unable to provide adequate solutions to these problems. For example, a client-side approach described in C. Yoshikawa et al., "Using Smart Clients to Build Scalable Services," USENIX 1997 Annual Technical Conference, Jan. 6-10, 1997, Anaheim, Calif., pp. 105-117, uses a modified web browser to make routing decisions at the client side. The browser downloads an applet that the service provider must implement in order to realize service-specific routing. Although this approach may alleviate the potential bottleneck of a server-side dispatching mechanism in certain applications, it can also create increased network congestion due to applet transmission and potential control messages between the applet and the servers. Another problem with conventional client-side techniques is that such techniques generally cannot achieve load balancing at the server site, i.e., cannot provide an optimal distribution of request load across the servers of a server pool.
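The load-balancing limitation of client-side selection can be illustrated with a small Python sketch. Here each client independently picks the server with the lowest latency it has measured, with no view of server load. The selection rule, the function names, and the latency figures are hypothetical and are not taken from the Yoshikawa et al. approach; the example only shows how purely local decisions can concentrate the request load on one server of the pool.

```python
from collections import Counter


def choose_server(latencies_ms):
    """Return the server with the lowest latency seen by one client."""
    return min(latencies_ms, key=latencies_ms.get)


def resulting_load(clients):
    """Count how many clients end up on each server."""
    return Counter(choose_server(latencies) for latencies in clients)


# Hypothetical latency measurements (milliseconds) for four clients.
clients = [
    {"14-1": 20, "14-4": 80, "14-5": 95},
    {"14-1": 25, "14-4": 70, "14-5": 90},
    {"14-1": 30, "14-4": 60, "14-5": 85},
    {"14-1": 90, "14-4": 40, "14-5": 75},
]
load = resulting_load(clients)
```

With these made-up figures, three of the four clients select server 14-1, one selects server 14-4, and server 14-5 receives no requests at all, even though each client made the best choice available from its own point of view.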