Load balancing is a computer networking method for distributing workloads across multiple computing resources, such as computers, a computer cluster, network links, central processing units, or disk drives. Load balancing aims to optimize resource use, maximize throughput, minimize response time, and avoid overloading any single resource. Using multiple components with load balancing instead of a single component may increase reliability through redundancy. Load balancing is therefore widely used to enhance the scalability and availability of telecommunication and information technology (IT) applications.
In a typical implementation, a load balancing system includes a load distributor (also referred to as a load balancer) that is implemented in a network element and distributes traffic. The load distributor is coupled to a number of servers (sometimes referred to as backend servers) in a cluster that processes packets transmitted from clients. The load distributor applies a load balancing policy to determine the server to which each packet is sent.
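As an illustration only, the role of a load balancing policy can be sketched in a few lines of Python. The policy shown (hashing a flow identifier and taking it modulo the number of servers) is one common choice and is an assumption made here, not a policy named in the text; the class name `LoadDistributor`, the `FlowKey` tuple layout, and the server names are likewise hypothetical.

```python
import hashlib
from typing import List, Tuple

# Hypothetical flow identifier: (source IP, source port, destination IP, destination port).
FlowKey = Tuple[str, int, str, int]


def flow_hash(flow: FlowKey) -> int:
    """Map a flow identifier to a stable 64-bit integer."""
    data = "|".join(str(field) for field in flow).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


class LoadDistributor:
    """Sketch of a load distributor using an assumed hash-modulo policy."""

    def __init__(self, servers: List[str]) -> None:
        if not servers:
            raise ValueError("at least one backend server is required")
        self.servers = list(servers)

    def select_server(self, flow: FlowKey) -> str:
        # Packets of the same flow always hash to the same backend server
        # as long as the server list does not change.
        return self.servers[flow_hash(flow) % len(self.servers)]


distributor = LoadDistributor(["server-a", "server-b", "server-c"])
flow = ("10.0.0.1", 40000, "192.0.2.10", 443)
chosen = distributor.select_server(flow)
print(f"flow {flow} -> {chosen}")
```

Hashing on the flow rather than on each packet keeps all packets of one flow on the same server, which is usually what a stateful backend needs. Other policies, such as round robin or least connections, would replace only the body of `select_server`.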
The server configuration in a cluster may change over time. Some servers may become unavailable due to maintenance activities; others may be added to improve load balancing performance. Reconfiguration of the cluster often happens while the servers in the cluster are carrying ongoing traffic.
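To show why reconfiguration under live traffic matters, the following sketch counts how many existing flows would be sent to a different server when one server is added. It assumes a simple hash-modulo policy, which the text does not specify; the function names `pick_server` and `count_remapped`, the flow identifiers, and the server names are all hypothetical.

```python
import hashlib
from typing import List


def pick_server(flow_id: str, servers: List[str]) -> str:
    """Assumed policy: hash the flow identifier modulo the server count."""
    digest = hashlib.sha256(flow_id.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def count_remapped(flow_ids: List[str], before: List[str], after: List[str]) -> int:
    """Count flows whose selected server differs between two configurations."""
    return sum(
        1 for flow_id in flow_ids
        if pick_server(flow_id, before) != pick_server(flow_id, after)
    )


# Hypothetical ongoing flows and cluster configurations.
flows = [f"flow-{i}" for i in range(1000)]
before = ["s1", "s2", "s3", "s4"]
after = before + ["s5"]  # one server added while traffic is in progress

remapped = count_remapped(flows, before, after)
print(f"{remapped} of {len(flows)} flows changed server")
```

Under this assumed policy and a uniform hash, a flow keeps its server only when its hash gives the same remainder modulo 4 and modulo 5, which holds for about one flow in five. Most ongoing flows would therefore move to a different server, which is the kind of disruption that reconfiguration during live traffic can cause.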