Generally, a load balancing system includes a set of clients and a set of servers with a unit, referred to as a “load balancer,” between them. The clients issue service or content requests to be serviced by any of the servers in the set of servers, commonly referred to as a “server farm.” The load balancer determines which server in the server farm will service each client request.
Since different servers in the server farm will have different capacities to handle requests (i.e., they differ in the number of connections they can service), the load balancer will attempt to optimally distribute the incoming client requests to the servers in the server farm so that no server becomes overloaded and the services remain highly available. Distributing workload, such as client requests, among servers in the server farm in a manner that attempts to optimize resource utilization, maximize throughput, minimize response time and avoid overload is referred to as “load balancing.”
Currently, one of the main techniques for implementing load balancing is the “weighted least connection scheduling” technique. In such a technique, each server in the server farm is assigned a weight based on its capacity to handle client requests. Servers with a higher weight value receive a larger percentage of connections at any one time. When a client request arrives, the load balancer uses these weights to determine the percentage of the current number of connections to give each server. As a result, more requests are distributed to those servers with fewer active connections relative to their capacities (assigned weights).
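By way of illustration only, the selection rule described above may be sketched as follows. The sketch assumes that each new request goes to the server with the smallest ratio of active connections to weight, with ties going to the first such server in the list. The names `Server`, `pick_wlc` and `assign`, the tie-breaking rule and the example weights are hypothetical and are not taken from any particular load balancer.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class Server:
    name: str
    weight: int      # assumed capacity weight (hypothetical values below)
    active: int = 0  # current number of active connections


def pick_wlc(servers):
    """Return the server with the fewest active connections per unit of weight."""
    candidates = [s for s in servers if s.weight > 0]
    if not candidates:
        raise ValueError("no server with a positive weight")
    # Fraction keeps the ratio exact; min() returns the first minimal element,
    # so ties go to the server listed first (an assumption of this sketch).
    return min(candidates, key=lambda s: Fraction(s.active, s.weight))


def assign(servers, request_count):
    """Assign request_count new connections and return the chosen server names."""
    order = []
    for _ in range(request_count):
        chosen = pick_wlc(servers)
        chosen.active += 1
        order.append(chosen.name)
    return order


farm = [Server("A", weight=3), Server("B", weight=1), Server("C", weight=1)]
print(assign(farm, 5))  # ['A', 'B', 'C', 'A', 'A']
```

In this sketch the first three connections go to servers A, B and C in turn, because every server starts with a ratio of zero; the weights influence the selection only from the fourth connection onward.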
However, the weighted least connection scheduling technique initially distributes one connection to each of the servers irrespective of their weights, which may not effectively achieve load balancing. Furthermore, the weighted least connection scheduling technique distributes each new client request in a manner that neutralizes the current state of instability, which may not turn out, in the future, to be the best possible distribution of requests among the servers for achieving effective load balancing.
Other load balancing techniques suffer from drawbacks as well. For example, the weighted round-robin scheduling approach can lead to the overloading of one server while underutilizing the other servers in the server farm.
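As a rough illustration of how such an imbalance may arise, the following sketch assumes a simple form of weighted round-robin in which each server appears in a repeating cycle as many times as its weight, and in which the current load of a server is never consulted. The function names, the equal weights and the request durations are hypothetical and chosen only to make the effect visible.

```python
from itertools import cycle


def weighted_round_robin(weights):
    """Return an endless iterator of server names, each repeated per its weight."""
    slots = [name for name, weight in weights.items() for _ in range(weight)]
    if not slots:
        raise ValueError("no server with a positive weight")
    return cycle(slots)


def total_work(weights, durations):
    """Sum the request durations that each server receives under the rotation."""
    load = {name: 0 for name in weights}
    rotation = weighted_round_robin(weights)
    for duration in durations:
        load[next(rotation)] += duration
    return load


# Hypothetical workload: long and short requests happen to alternate.
print(total_work({"A": 1, "B": 1}, [10, 1, 10, 1, 10, 1]))  # {'A': 30, 'B': 3}
```

Both servers receive the same number of requests, yet server A receives every long request, because the rotation depends only on the weights and not on the work each server is already carrying.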
As a result, the current load balancing techniques may not distribute the incoming client requests among the servers in the server farm in such a manner as to optimize load balancing (i.e., optimize resource utilization, maximize throughput, minimize response time and avoid overload).