Field of the Invention
The present invention relates to the field of service level agreement enforcement and more particularly to service level agreement enforcement in a distributed caching infrastructure.
Description of the Related Art
In an efficient admission control and capacity planning policy, minimal resources can be allocated automatically to satisfy the requirements of a specified service level agreement (SLA), leaving the remaining resources for later use. An SLA is an agreement between a computing service provider and a computing service consumer that specifies a minimum level of service to be provided by the service provider on behalf of the consumer. The typical SLA includes one or more network traffic terms that either limit the amount and type of resources that the subscribing customer can consume for a given rate, or guarantee the amount and quality of service (QoS) of the resources that the provider will provide to the subscribing customer for a given rate.
For example, a subscribing consumer can agree to an SLA in which the consumer agrees to consume only a particular quantity of network bandwidth offered by the provider. Conversely, the SLA can require the provider to guarantee the subscribing consumer access to at least a minimum amount of bandwidth. Also, the SLA can require the provider to provide a certain QoS over the provided minimum amount of bandwidth.
When considering the terms of an SLA, content and application hosts provision server resources for their subscribing customers, namely co-hosted server applications or services, according to the resource demands of the customers at their expected loads. Since outsourced hosting can be viewed as a competitive industry sector, content and application hosts must manage their resources efficiently. Logically, to ensure that the customers receive the level of service promised in the SLA, content and application hosts can be configured to survive a worst-case load. Yet, the worst-case approach can unnecessarily tax the resources of the content host or the application host, as the case may be, even when those resources are not required to service a given load. Hence, rather than over-provisioning resources, efficient admission control and capacity planning policies can be designed merely to limit, rather than eliminate, the risk of failing to meet the worst-case demand.
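By way of illustration only, the contrast between provisioning for the worst case and provisioning to a bounded risk can be sketched as follows. The function names, the sample loads, and the use of an empirical quantile over observed load samples are assumptions made for this sketch and are not drawn from any particular host's policy.

```python
import math

def worst_case_capacity(load_samples):
    """Capacity needed to survive the largest observed load."""
    return max(load_samples)

def capacity_for_risk(load_samples, max_violation_fraction):
    """Smallest observed load level that at most the given fraction
    of samples exceed (hypothetical risk-bounded provisioning)."""
    if not load_samples:
        raise ValueError("load_samples must not be empty")
    if not 0 <= max_violation_fraction < 1:
        raise ValueError("max_violation_fraction must be in [0, 1)")
    ordered = sorted(load_samples)
    n = len(ordered)
    # Number of samples that are allowed to exceed the chosen capacity.
    allowed = math.floor(max_violation_fraction * n)
    return ordered[n - 1 - allowed]

def admit(request_load, current_load, capacity):
    """Admit a request only if it fits within the provisioned capacity."""
    return current_load + request_load <= capacity

# Hypothetical load samples (requests per second) with one rare spike.
loads = [40, 42, 45, 47, 50, 52, 55, 60, 70, 200]
full = worst_case_capacity(loads)          # sized for the spike
bounded = capacity_for_risk(loads, 0.1)    # tolerates 10% of samples exceeding
```

In this sketch, sizing for the worst case reserves capacity for the single spike, while the risk-bounded figure is far lower at the cost of a limited chance that demand exceeds it.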
While SLA management and enforcement have become part and parcel of ordinary application hosting relationships between consumer and host, Extreme Transaction Processing (XTP) presents new challenges in the use and enforcement of the SLA. XTP is a technology used by application hosts to handle exceptionally large numbers of concurrent requests. Serving such a large volume of concurrent requests can be made possible in XTP by distributing the load resulting from the concurrent requests across computer clusters or whole grid computing networks. Further, general XTP supporting architectures often rely upon aggressive caching across an n-Tier caching infrastructure (a multi-tiered cache structure), affinity routing (the intelligent routing of a request to business logic executing nearest to the requisite data consumed by the business logic), and the reduction of data-access latency via the “MapReduce” framework commonly used to support distributed computing on large data sets on clusters of computers. Thus, in an XTP supporting architecture it can be critical to monitor the performance of the n-Tier cache and to adjust the configuration of the n-Tier cache in order to meet the terms of a corresponding SLA for each customer.
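As a minimal sketch of what monitoring an n-Tier cache against an SLA term might involve, the following estimates the mean data-access latency from per-tier hit statistics and compares it with a latency ceiling. The TierStats record, the per-tier latencies, the backend latency, and the notion of an SLA expressed as a maximum mean latency are all hypothetical and serve only to make the idea concrete.

```python
from dataclasses import dataclass

@dataclass
class TierStats:
    name: str
    lookups: int       # requests that reached this tier
    hits: int          # requests satisfied by this tier
    latency_ms: float  # assumed cost of consulting this tier

def mean_access_latency(tiers, backend_latency_ms):
    """Expected latency per request when tiers are consulted in order
    and a miss in every tier falls through to the backend data store."""
    expected = 0.0
    reach = 1.0  # fraction of requests that reach the current tier
    for tier in tiers:
        expected += reach * tier.latency_ms
        hit_ratio = tier.hits / tier.lookups if tier.lookups else 0.0
        reach *= 1.0 - hit_ratio
    return expected + reach * backend_latency_ms

def meets_sla(tiers, backend_latency_ms, sla_max_latency_ms):
    """True if the estimated mean latency is within the SLA ceiling."""
    return mean_access_latency(tiers, backend_latency_ms) <= sla_max_latency_ms

# Hypothetical two-tier cache in front of a 20 ms backend.
tiers = [
    TierStats("near", lookups=1000, hits=500, latency_ms=1.0),
    TierStats("far", lookups=500, hits=250, latency_ms=4.0),
]
estimate = mean_access_latency(tiers, backend_latency_ms=20.0)
```

A monitor built along these lines could flag a customer whose estimate exceeds the ceiling, at which point the cache configuration (for example, the size of a tier) would be a candidate for adjustment.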