Numerous web-based applications offer e-commerce services that must be reliable and time-bound. Application service providers strive to offer the best quality of service to their customers. One metric for quality of service (commonly referred to as “QoS”) from the client perspective is the time taken by the web server to service a request. For a server, however, providing a uniform delay guarantee for all customer requests is neither practical nor efficient. Clearly, different requests will consume different amounts of server resources and will presumably require differing amounts of time for completion. In addition, certain types of requests may be “time critical” while others are tolerant of longer delays. Accordingly, service providers offer different time or delay guarantees based on the QoS or service speed, with clients paying higher rates for faster service, often referred to as “gold service.”
The offered service time differentiation needs to be based on various attributes of the request type, the identity of the requesting client, and the type or class of service to which the client subscribes (i.e., their level of account guarantees). The load on the server also factors into the delay incurred by a request, with the load being a function of the number of requests and the resource requirements of each request. In order to provide service time differentiation, the server has to control the multiple resources used, isolate the requests of each class from those of other classes, and control the number and type of incoming requests. Since the incoming workload is not fixed, a simple static allocation of resources does not achieve the desired control.
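By way of illustration only, the relationship between load, resource allocation, and delay described above may be sketched as follows. The sketch is not drawn from the present disclosure; the names `ServiceClass`, `estimated_delay`, and `admit`, the single aggregate `capacity` figure, and the model of delay as outstanding work divided by allocated capacity are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ServiceClass:
    """Hypothetical per-class bookkeeping (all fields are assumptions)."""
    name: str
    delay_target: float   # promised service time, in seconds
    share: float          # fraction of server capacity assigned to the class
    backlog: float = 0.0  # admitted but unfinished work, in resource units


def estimated_delay(cls: ServiceClass, capacity: float) -> float:
    """Load is the sum of per-request costs; delay is load over allocated capacity."""
    return cls.backlog / (cls.share * capacity)


def admit(cls: ServiceClass, cost: float, capacity: float) -> bool:
    """Admit a request only if the class would still meet its delay target."""
    projected = (cls.backlog + cost) / (cls.share * capacity)
    if projected <= cls.delay_target:
        cls.backlog += cost
        return True
    return False


# Example: capacity of 10 resource units per second, gold class holds half.
capacity = 10.0
gold = ServiceClass(name="gold", delay_target=1.0, share=0.5)

first = admit(gold, cost=3.0, capacity=capacity)   # projected delay 0.6 s
second = admit(gold, cost=3.0, capacity=capacity)  # projected delay 1.2 s

# With a fixed share the second request is refused; enlarging the share
# changes the outcome for an identical request.
gold.share = 0.8
third = admit(gold, cost=3.0, capacity=capacity)   # projected delay 0.75 s
```

In this simplified model, the second request is refused under the fixed share and an identical request is admitted once the share is enlarged, which is one way to see why a static allocation cannot track a varying workload.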
It is, therefore, an objective of the present invention to provide a system and method for adaptive admission control of client requests in order to meet service time guarantees.
Another objective of the present invention is to provide dynamic partitioning of resources in order to meet service time guarantees.