With the acceptance and growth in deployment of web technology, the overall complexity of managing content, networks, and applications is expanding rapidly. There is an ever-growing breadth of devices to manage and content/applications to deploy as businesses look to leverage the expanding web market. In addition, while business use of the Internet started out conservatively, it is rapidly growing into a sophisticated array of e-commerce and content-personalization applications for consumers and businesses alike. The Internet has created a new medium for commerce, one that allows a widespread group of customers to find products and services that are of interest to them. The medium has created a tremendous demand for creative web services to enable advertising, distribution of information, e-commerce, and online transactions of various kinds.
Businesses using the web are developing new models to handle the volume of web traffic that is created by these new services. These models are typically provided by web servers accessed via web browsers (e.g., Netscape, Explorer). Web switches are being used to help businesses and other content providers serve the needs of their clients. These switches delve deep into the network packets to determine not just what destination was intended, but also what application is being run, and what kind of transaction is being requested within the application. This information can then be used to make intelligent decisions about how to forward this traffic.
As Internet sites begin to handle more traffic and support more services, availability and fault tolerance become critical needs. Every transaction and user interaction must be reliable to maintain optimal server quality of service. To address these needs and prevent overload to one specific server, sites often replicate data across an array of servers, or a server farm. But as more servers are deployed, it becomes costly and difficult to manage them and to provide assurance that no server will become overloaded, provide incorrect responses, or fail outright. This has created the need for more intelligent systems that can manage incoming traffic, a function known as load balancing (see V. Cardellini, M. Colajanni, and P. S. Yu, “Dynamic Load Balancing on Web-Server Systems,” IEEE Internet Computing, pp. 28-39, May/June 1999; A. Iyengar, J. Challenger, D. Dias, and P. Dantzig, “High-Performance Web Site Design Techniques,” IEEE Internet Computing, pp. 17-26, March/April 2000; T. Schroeder, S. Goddard, and B. Ramamurthy, “Scalable Web Server Clustering Technologies,” IEEE Network, pp. 38-44, May/June 2000; and H. Bryhni, E. Klovning, and O. Kure, “A Comparison of Load Balancing Techniques for Scalable Web Servers,” IEEE Network, pp. 58-64, July/August 2000). In this type of scenario, traffic can be dynamically distributed across a group of servers running a common application, while making the group appear as one server to the network. This approach allows the traffic to be distributed more efficiently, offering greater economies of scale and providing significantly greater fault tolerance. A distributed web server system may also provide better reliability since appropriate load balancing algorithms can facilitate fault resilience with graceful degradation of performance as servers leave the system due to failure or preventive maintenance. A distributed web server system also makes it possible to add new machines without interrupting service.
Load balancing systems monitor the health of these servers and make decisions on where to route traffic to optimize performance and availability. This ensures users will be connected to the most available server, providing excellent and predictable quality of service to the end-user.
Service interruptions can be costly with today's web applications, and can occur in many ways. Hardware and software failures are common, and operating systems and applications may simply stop responding. Content failure (e.g., Object Not Found) or incorrect data can be infuriating to users. Finally, heavy traffic and network and/or server congestion or failure can easily limit site availability. Load balancing systems must be designed to guarantee availability despite these interruptions. Using a solution that is not geared toward providing high availability does not maximize the return on investment for Internet and Intranet connectivity and server system infrastructure.
The techniques traditionally used for load balancing of web servers are mainly round-robin based schemes, which share a shortcoming: they cannot adjust to actual resource usage at the web servers. A round-robin algorithm rotates through a list of several server addresses, any one of which could be mapped to a client request. Because such a round-robin algorithm distributes traffic to servers in a predetermined cyclical pattern, it treats all servers as equal, regardless of the number of connections or the response times of the servers. This method for load balancing has several limitations in a server farm made of multiple servers of different capacities. There is a level of system bias resulting from the rotation, which creates unequal and highly variable load distribution among individual servers. The result is that traffic is not being sent to the server that could most efficiently handle the load. A round-robin algorithm also presents an availability problem because this method has no knowledge of the status of the server, software, or application. It does not take into account the workload on the servers, resulting in hot spots. Also, it has no awareness of the availability of the servers. If a server crashes or is removed, a round-robin algorithm continues to send client requests to that server and clients receive a “server not available” message.
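The round-robin behavior described above can be illustrated with a minimal sketch. The server names, the `down` set, and the function name `round_robin_dispatch` are hypothetical and chosen only for illustration; the sketch is not taken from any particular load balancer.

```python
from itertools import cycle


def round_robin_dispatch(servers, num_requests):
    """Assign each request to the next server in a fixed cyclical order.

    The dispatcher consults no load or health information, which is the
    limitation discussed in the text.
    """
    rotation = cycle(servers)
    return [next(rotation) for _ in range(num_requests)]


# Hypothetical server farm; "s2" is assumed to have crashed.
servers = ["s1", "s2", "s3"]
down = {"s2"}

assignments = round_robin_dispatch(servers, 6)

# Requests that were still routed to the crashed server.
failed = [server for server in assignments if server in down]

print(assignments)
print(failed)
```

In this sketch, one third of the requests are still sent to the crashed server, because the rotation is fixed in advance and is never informed of server status.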
A weighted round-robin load balancing scheme is similar to the aforementioned round-robin scheme, but each server in the application group using a weighted round-robin algorithm is assigned a static weight based on some view of the capacity of each server. Servers are presented client requests in proportion to their weighting.
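A weighted round-robin scheme might be sketched as follows. This is one simple way to realize proportional dispatch, by repeating each server in the rotation according to its weight; the server names, weights, and the function name `weighted_round_robin` are assumptions made for illustration only.

```python
from itertools import cycle, islice


def weighted_round_robin(weights):
    """Return an endless iterator over server names.

    Each server appears `weight` times per cycle, so it receives requests
    in proportion to its statically assigned weight.
    """
    schedule = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(schedule)


# Hypothetical static weights reflecting an assumed view of server capacity.
weights = {"big": 3, "small": 1}

# Dispatch eight requests and count how many each server received.
picks = list(islice(weighted_round_robin(weights), 8))
counts = {name: picks.count(name) for name in weights}

print(picks)
print(counts)
```

Because the weights are fixed in advance, this sketch, like plain round-robin, does not react to the actual load or availability of a server at the time a request arrives.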
With an ineffective load balancing scheme, load imbalances among web servers can cause local overloads even when the system has available capacity. Lower performing servers receive excessive requests while higher performance servers are underutilized. The possibility of more frequent software and hardware upgrades in a distributed web server system implies that load control must function in a continuously changing environment. As discussed above, performance and high availability have become critical at web sites that receive a large number of client requests.
Because of the above limitations of the traditional load balancing methods, newer techniques need to be implemented to not only solve the load balancing issue associated with the round-robin schemes, but also to provide more scalable and higher availability solutions while providing mechanisms for server management. Thus, it would be desirable to provide a technique for adaptively distributing a web server request in a system having a plurality of web servers which overcomes the above-described inadequacies and shortcomings of the traditional load balancing methods.