Load balancing distributes load among compute servers so that no single server is overloaded. Load balancing techniques fall into two broad categories: static and dynamic.
In static load balancing, a fixed technique is used to distribute the load. An example is round robin, in which each new transaction is allocated to the next server in a fixed rotation. Such techniques do not take into account the variation among requests or the different loads they impose.
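As an illustration of the static approach, the following is a minimal sketch of round robin allocation. The server names and the `round_robin_dispatcher` helper are hypothetical and chosen only for this example.

```python
from itertools import cycle


def round_robin_dispatcher(servers):
    """Return a function that assigns each new transaction to the next server in turn."""
    rotation = cycle(servers)

    def assign(transaction):
        # The transaction itself is never inspected: its size or cost does not
        # influence the choice, which is what makes the scheme static.
        return next(rotation)

    return assign


# Hypothetical pool of three servers and four incoming transactions.
assign = round_robin_dispatcher(["server-a", "server-b", "server-c"])
assignments = [assign(t) for t in ["t1", "t2", "t3", "t4"]]
```

The fourth transaction wraps around to the first server regardless of how much work the first transaction imposed on it.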
In dynamic load balancing, the state of the system is taken into account when scheduling the next transaction. The state is typically captured by performance measures (such as CPU utilization) or by the number of requests currently being served by an application.
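One simple dynamic policy, sketched below under assumed data, sends the next transaction to the server with the fewest requests currently in progress. The `active_requests` mapping and the `pick_least_loaded` helper are hypothetical names used only for illustration.

```python
def pick_least_loaded(active_requests):
    """Return the server that is currently serving the fewest requests."""
    return min(active_requests, key=active_requests.get)


# Hypothetical snapshot of system state: requests in progress per server.
active_requests = {"server-a": 4, "server-b": 1, "server-c": 3}

chosen = pick_least_loaded(active_requests)
# Record the new transaction so the next decision sees the updated state.
active_requests[chosen] += 1
```

Unlike round robin, the choice here depends on the state observed at the moment of scheduling.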
An example of a previous load balancing technique is given in “The Case for SRPT Scheduling in Web Servers”, Mor Harchol-Balter, Mark Crovella, SungSim Park, MIT-LCS-TR-767, October 1998, available online at cs.cmu.edu/~harchol/Papers/papers.html
Previous load balancing techniques assume that all the servers in the load balanced pool are of the same type. If this is not the case, it is difficult to load balance using performance measures that are not normalized. For instance, a high end server at 95% CPU utilization may be able to serve a request faster than a low end server at 80% CPU utilization.
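The difficulty can be illustrated with a short sketch. It assumes, purely for the sake of example, that each server has a known relative capacity factor and that spare capacity is roughly capacity multiplied by the idle fraction of the CPU. Neither the capacity figures nor this simple model comes from the techniques discussed above.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    capacity: float     # hypothetical relative throughput (low end server = 1.0)
    utilization: float  # fraction of CPU in use, from 0.0 to 1.0


def spare_capacity(server):
    """Assumed normalization: capacity scaled by the idle fraction of the CPU."""
    return server.capacity * (1.0 - server.utilization)


servers = [
    Server("high-end", capacity=10.0, utilization=0.95),
    Server("low-end", capacity=1.0, utilization=0.80),
]

# Raw utilization favors the low end server, which merely looks less busy.
by_raw = min(servers, key=lambda s: s.utilization)

# The normalized measure favors the high end server, which has more headroom
# under the assumed capacity factors.
by_normalized = max(servers, key=spare_capacity)
```

Under these assumed figures the two measures select different servers, which is why an unnormalized measure can mislead a load balancer in a mixed pool.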
Service Level Agreements (SLAs), which specify a minimum level of service to be provided by a computer system, are commonly used. Each SLA includes the maximum response time permitted under that SLA.
An object of the invention is to provide an alternative load balancing technique which takes resource response times into account in order to comply with SLAs.