This section introduces aspects that may be helpful in facilitating a better understanding of the inventions. Accordingly, the statements of this section are to be read in this light and are not to be understood as admissions about what is in the prior art or what is not in the prior art.
In some known load balancing systems, a service may be hosted on a number of servers in order to provide a higher quality of service, such as high throughput and low response time. A typical architecture may include a front-end load balancer and a fixed number of back-end servers. The front-end load balancer may be a dedicated hardware box which distributes incoming requests to the different servers using a conventional distribution policy such as random, round robin, or load-based distribution. In this model, all of the server nodes are active and ready to serve service requests.
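By way of illustration only, the three conventional distribution policies named above may be sketched as follows. The sketch is not taken from any particular known system: the Server class, the function names, and the active_requests counter are hypothetical, and load-based distribution is assumed here to mean selecting the server with the fewest active requests.

```python
import itertools
import random


class Server:
    """Hypothetical back-end server that only tracks its active request count."""

    def __init__(self, name):
        self.name = name
        self.active_requests = 0


def pick_random(servers, rng=random):
    """Random policy: every server is equally likely to be chosen."""
    return rng.choice(servers)


def make_round_robin(servers):
    """Round robin policy: return a function that cycles through the servers in order."""
    cycle = itertools.cycle(servers)
    return lambda: next(cycle)


def pick_least_loaded(servers):
    """Load-based policy (assumed variant): choose the server with the fewest active requests."""
    return min(servers, key=lambda s: s.active_requests)


servers = [Server("a"), Server("b"), Server("c")]

# Round robin visits the servers in a fixed rotation and wraps around.
next_server = make_round_robin(servers)
order = [next_server().name for _ in range(4)]

# Load-based selection needs a current load figure for every server.
servers[0].active_requests = 5
servers[1].active_requests = 2
servers[2].active_requests = 7
least = pick_least_loaded(servers).name
```

In this sketch only the load-based policy requires the front-end load balancer to know the state of each server; the random and round robin policies need nothing more than the list of servers.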
In other known load balancing systems, a front-end load balancer sends each new request to the most heavily loaded server that can still satisfy the performance constraints, which gives the remaining servers more opportunity to stay in low energy states. However, this solution is not scalable, because tracking the load of a large number of heterogeneous servers requires a sophisticated and expensive front-end load balancer.
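By way of illustration only, the energy-oriented policy described above may be sketched as follows. The sketch makes simplifying assumptions that are not drawn from any particular known system: the performance constraint is modeled as a hypothetical per-server capacity (a maximum number of concurrent requests), and the names Backend, pick_most_loaded, and dispatch are invented for the example.

```python
class Backend:
    """Hypothetical back-end server with a capacity standing in for its performance constraints."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity  # assumed limit on concurrent requests
        self.active_requests = 0

    def can_accept(self):
        """True while one more request would still fit within the assumed capacity."""
        return self.active_requests < self.capacity


def pick_most_loaded(backends):
    """Return the most loaded backend that can still accept a request, or None if all are full."""
    candidates = [b for b in backends if b.can_accept()]
    if not candidates:
        return None
    return max(candidates, key=lambda b: b.active_requests)


def dispatch(backends):
    """Assign one new request using the packing policy and return the chosen backend."""
    target = pick_most_loaded(backends)
    if target is not None:
        target.active_requests += 1
    return target


backends = [Backend("a", 2), Backend("b", 2), Backend("c", 2)]
placements = [dispatch(backends).name for _ in range(3)]

# Backends that received no requests are candidates for a low energy state.
idle = [b.name for b in backends if b.active_requests == 0]
```

Each call to pick_most_loaded scans the load of every backend, and with heterogeneous servers the capacity check differs per server. This per-request, per-server tracking is the cost that the scalability concern above refers to.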