Load balancing techniques exist to ensure that individual servers in multi-server systems do not become overloaded and that services retain high availability. Load balancing is especially important where it is difficult to predict the number and timing of requests that will require processing.
Most current load-balancing schemes employ simple parameters to distribute network traffic across a group of servers. These parameters are usually limited to load amount (measured by the number of received requests), server “health” or hardware status (measured by processor temperature or functioning random access memory), and server availability.
One common load-balancing architecture employs a supervisor/subordinate approach. In this architecture, a control hierarchy of devices is established in a load-balancing domain. Each server in the system is assigned to a load-balancing group that includes a central device, the supervisor, for monitoring the status of servers in its group. The supervisor acts as the gatekeeper for requests entering the group and delegates each request to an appropriate server based on that server's status relative to the other servers in the group.
One negative aspect of this approach is that it introduces a single point of failure into the load-balancing process. If the supervisor goes offline for any reason, incoming requests cannot be serviced. To ameliorate this problem, some load-balancing schemes employ a secondary supervisor to handle requests when the primary supervisor is unavailable. A secondary supervisor, however, introduces extra cost in terms of physical equipment and administration.
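The supervisor/subordinate arrangement and its single point of failure can be illustrated with a minimal sketch. All names here (`Server`, `Supervisor`, `dispatch`) are hypothetical, and the use of an active-request count as the "status" measure is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    active_requests: int = 0   # assumed load measure
    healthy: bool = True       # assumed hardware status flag


class Supervisor:
    """Gatekeeper that delegates each request to the least-loaded healthy server."""

    def __init__(self, servers):
        self.servers = servers
        self.online = True

    def delegate(self):
        candidates = [s for s in self.servers if s.healthy]
        if not candidates:
            raise RuntimeError("no healthy server in group")
        # Compare each server's status to that of the others in the group.
        target = min(candidates, key=lambda s: s.active_requests)
        target.active_requests += 1
        return target.name


def dispatch(primary, secondary=None):
    """Send a request through the primary supervisor, else the secondary."""
    for supervisor in (primary, secondary):
        if supervisor is not None and supervisor.online:
            return supervisor.delegate()
    # With no supervisor online, the request cannot be serviced.
    raise ConnectionError("no supervisor available")
```

With only a primary supervisor, setting `primary.online = False` causes every call to `dispatch(primary)` to fail, even though the servers themselves remain healthy. Passing a secondary supervisor avoids this at the cost of an additional device.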
One of the earliest forms of load balancing, popular in the early 1990s, is commonly referred to as Domain Name System (DNS) round robin. This load-balancing scheme, described in connection with FIG. 1, represents an extension of the standard domain name resolution technique and has been used primarily by Internet Web servers experiencing extremely high usage.
As shown in FIG. 1, in step 110, a client requests data from a DNS server. In step 120, the domain name server resolves the requested server name into a series of server addresses. Each address in the series corresponds to a server belonging to a single load-balancing group. Each server in the group is provided with a copy of all data to be served, so that each server replicates data stored by every other server in the group.
In step 130, the domain name server assigns new requests by stepping through the list of server addresses, resulting in a crude and unpredictable load distribution for servers in the load-balancing group. Moreover, if the number of requests overloads the domain name server or if the server selected to service the request is at capacity, the service is ungracefully denied. In addition, if the selected server is at capacity, the new request routed by the domain name server may bring the server down.
Another major problem with DNS round robin is that the domain name server has no knowledge of server availability within the load-balancing group. If a server in the group is down, DNS round robin will nevertheless direct traffic to it.
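The behavior described above can be sketched as follows. This is a simplified model rather than an actual DNS implementation; the class `RoundRobinResolver`, the function `count_misdirected`, and the addresses used are hypothetical.

```python
from itertools import cycle


class RoundRobinResolver:
    """Steps through a fixed list of server addresses, one per request."""

    def __init__(self, addresses):
        self._next_address = cycle(list(addresses))

    def resolve(self):
        # No load or availability information is consulted here.
        return next(self._next_address)


def count_misdirected(addresses, down, n_requests):
    """Count requests that the resolver sends to servers that are down."""
    resolver = RoundRobinResolver(addresses)
    return sum(1 for _ in range(n_requests) if resolver.resolve() in down)
```

For a group of three servers with one server down, `count_misdirected(["10.0.0.1", "10.0.0.2", "10.0.0.3"], {"10.0.0.2"}, 9)` returns 3: one request in three is still directed to the unavailable server.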
In the mid-1990s, second-generation load-balancing solutions were released. These solutions employed a dedicated load balance director (LBD), such as Cisco Systems' LocalDirector. The director improves on the DNS round robin load-balancing scheme by periodically testing the network port connections of each server in its group and directing requests only to responsive servers. One such second-generation solution is discussed in “Load Balancing: A Multifaceted Solution for Improving Server Availability” (1998 Cisco Systems, Inc.), which is hereby incorporated by reference.
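A director of this general kind might be sketched as shown below. This is not a description of how LocalDirector is implemented; the `Director` class, the `port_is_open` helper, and the choice of a plain TCP connection attempt as the port test are assumptions made for illustration.

```python
import socket


def port_is_open(host, port, timeout=0.5):
    """Assumed port test: report whether a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class Director:
    """Rotates requests across the servers that passed the last port test."""

    def __init__(self, servers, probe=port_is_open):
        self._servers = list(servers)        # (host, port) pairs
        self._probe = probe
        self._responsive = list(self._servers)
        self._index = 0

    def run_health_checks(self):
        # Intended to be called periodically, e.g. from a timer.
        self._responsive = [s for s in self._servers if self._probe(*s)]

    def pick(self):
        if not self._responsive:
            raise RuntimeError("no responsive servers")
        server = self._responsive[self._index % len(self._responsive)]
        self._index += 1
        return server
```

Unlike DNS round robin, a server that fails the port test is skipped until a later check finds it responsive again. The `probe` parameter allows the port test to be replaced, for example with a stub during testing.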
A third generation of load-balancing solutions included robust, dedicated load balancing and network management devices, such as the BIG-IP™ from F5 NETWORKS™. These devices improve server availability by monitoring server health via management protocols such as Simple Network Management Protocol (SNMP). Perhaps the biggest improvement of this generation is the ability to direct traffic based on requested content type instead of just load. For example, requests ending in “.http” are directed to Web servers, “.ftp” to file download servers, and “.ram” to REALNETWORKS'™ streaming servers. This feature enables network managers to create multiple load-balancing groups dedicated to specific content types.
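Content-based direction of this kind can be illustrated with a short sketch. The `ContentRouter` class and the server names are hypothetical, and the sketch does not represent the behavior of any particular commercial device; rotation within each group is an assumption added to keep the example complete.

```python
from itertools import cycle


class ContentRouter:
    """Maps a request suffix to a load-balancing group dedicated to that content."""

    def __init__(self, groups):
        # groups: {suffix: [server, ...]}; each group is rotated independently.
        self._groups = {suffix: cycle(list(servers)) for suffix, servers in groups.items()}

    def route(self, request):
        for suffix, servers in self._groups.items():
            if request.endswith(suffix):
                return next(servers)
        raise LookupError(f"no group handles request {request!r}")


router = ContentRouter({
    ".http": ["web1", "web2"],   # Web servers
    ".ftp": ["ftp1"],            # file download servers
    ".ram": ["stream1"],         # streaming servers
})
```

Here a request ending in “.ram” is always handled by the streaming group, regardless of how heavily the Web group is loaded.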
Although the aforementioned load-balancing techniques are often adequate for managing multi-server systems that serve Web pages, file downloads, databases, and email, they still leave room for significant improvement. Moreover, such load-balancing schemes do not perform well in systems that serve broadcast-quality digital content, which is both time sensitive and bandwidth intensive.