Load balancing is a computer networking method for distributing workloads across multiple computing resources, such as computers, a computer cluster, network links, central processing units or disk drives. Load balancing aims to optimize resource use, maximize throughput, minimize response time, and avoid overload of any one of the resources. Using multiple components with load balancing instead of a single component may increase reliability through redundancy. Thus, load balancing is widely used to enhance the scalability and availability of telecommunication and information technology (IT) applications.
In a typical load balancing implementation, a load balancer, generally implemented in a network device (thus referred to as a load balancing network device), is coupled to a number of servers (sometimes referred to as backend servers) that process packets transmitted from clients. The load balancer applies a load balancing policy to determine to which server the packets are to be sent.
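As an illustration only, the role of a load balancing policy can be sketched with a simple round-robin policy that assigns each incoming packet to the next backend server in turn. Round-robin is chosen here purely as an example and is not stated in the text above; the class name `RoundRobinBalancer`, the method `pick_server`, and the server labels are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical load balancer applying a round-robin policy per packet."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one backend server is required")
        # cycle() repeats the server list endlessly, in order.
        self._next_server = cycle(list(servers))

    def pick_server(self, packet):
        # The packet contents are ignored: this policy depends only on
        # arrival order, so consecutive packets go to different servers.
        return next(self._next_server)


balancer = RoundRobinBalancer(["server-a", "server-b", "server-c"])
picks = [balancer.pick_server(packet=None) for _ in range(4)]
```

With three servers, the fourth packet wraps around to the first server again, so the load is spread evenly regardless of which client sent each packet.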
Generally, the load balancer applies the load balancing policy on a per-packet basis. However, some telecommunication and IT applications offer session-based services, in which packets belonging to the same session cannot be handled by different servers. In some of these applications, the source and destination addresses of the packets cannot be altered as the packets are forwarded through the load balancer. It is challenging to accommodate these applications in load balancing.
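One common way to satisfy both constraints, shown here only as a minimal sketch and not as the approach of this document, is to derive the server choice from a hash of fields that identify the session, leaving the packet itself untouched. The sketch assumes that a session is identified by the protocol together with the source and destination addresses and ports; the `Packet` type, its field names, and the function `pick_server` are hypothetical.

```python
import hashlib
from typing import NamedTuple, Sequence


class Packet(NamedTuple):
    """Hypothetical packet header fields assumed to identify a session."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str


def pick_server(packet: Packet, servers: Sequence[str]) -> str:
    """Map every packet of the same session to the same backend server."""
    if not servers:
        raise ValueError("at least one backend server is required")
    # Build a stable key from the session-identifying fields.
    key = "|".join(str(field) for field in packet).encode("utf-8")
    # A cryptographic hash is used only because it is stable across runs,
    # unlike Python's built-in hash() for strings.
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    # The packet is only read, never rewritten, so its source and
    # destination addresses are forwarded unchanged.
    return servers[index]


servers = ["server-a", "server-b", "server-c"]
first = Packet("10.0.0.7", 40312, "192.0.2.10", 443, "tcp")
later = Packet("10.0.0.7", 40312, "192.0.2.10", 443, "tcp")
same_server = pick_server(first, servers) == pick_server(later, servers)
```

A limitation of this simple modulo scheme is that adding or removing a server changes the mapping for many existing sessions, which is one reason session-based services are hard to load balance.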