The present invention relates generally to the field of operation of aggregate computing resources, and more particularly to dynamic resource sharing and service level agreement (SLA) enforcement.
Environments in which computing resources are shared by multiple entities, such as an enterprise environment, typically have multiple applications that consume services from grid resources. Each application can use more than one type of service, and may be associated with one or multiple consuming entities, for example, different business units within an enterprise organization.
Management of shared computing resources can involve providing an agreed-to level of resources, referred to as a service level agreement (SLA), for each entity sharing resources within an interconnected consolidation of computing resources known as a cluster or grid. Management of shared computing resources also requires efficient utilization of resource assets, minimizing idle time while enabling dynamic sharing of resources as the demands of consuming entities vary over time.
In a distributed computing grid environment, client applications submit workload requests to a workload manager. The workload requests generate sessions that include tasks, which the workload manager schedules on available resources that are either “deserved” or shared. A “deserved” level of resources aligns with the agreed-to resource level of an SLA for an application associated with a consuming entity of a distributed computing grid. Shared resources may be allowed under SLA management when resources are idle, such that a client application may consume resources above the agreed-to level of its SLA as long as resource demand is below capacity. However, shared resources may be reclaimed by another client application when its workload demand increases and the previously idle resources are required.
When a client application associated with an entity of the distributed computing grid submits a large workload to a workload manager, the workload manager requests additional resources from a resource manager to execute the workload. The resource manager consults a resource plan to determine the agreed-to resources and allocates the agreed-to level of resources to the workload manager for the client application. If the submitted workload requires additional resources and resources are idle, the resource manager may allocate additional resources to the requesting workload manager to meet workload demand in excess of the agreed-to resource level.
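The allocation behavior described above can be illustrated with a minimal sketch. It is not the implementation of any particular resource manager; the names `ResourcePlan`, `ResourceManager`, `request`, and `_reclaim`, and the use of a simple slot count as the unit of resource, are assumptions made for illustration only. The sketch grants a consumer its deserved level first, lends idle slots beyond that level, and reclaims lent slots when another consumer asks for its own deserved share.

```python
from dataclasses import dataclass, field


@dataclass
class ResourcePlan:
    """Hypothetical resource plan: grid capacity and each consumer's deserved share."""
    total_slots: int
    deserved: dict  # consumer name -> agreed-to (SLA) slot count


@dataclass
class ResourceManager:
    """Hypothetical resource manager tracking how many slots each consumer holds."""
    plan: ResourcePlan
    allocated: dict = field(default_factory=dict)  # consumer name -> slots held

    def idle_slots(self) -> int:
        return self.plan.total_slots - sum(self.allocated.values())

    def request(self, consumer: str, demand: int) -> int:
        """Raise the consumer's holding toward `demand`; return the slots granted."""
        held = self.allocated.get(consumer, 0)
        deserved = self.plan.deserved[consumer]
        want = max(demand - held, 0)
        # Part of the request that is still covered by the consumer's SLA.
        within_sla = min(want, max(deserved - held, 0))
        # If idle slots cannot cover the SLA-backed part, reclaim lent slots.
        shortfall = within_sla - self.idle_slots()
        if shortfall > 0:
            self._reclaim(shortfall, exclude=consumer)
        # Anything beyond the deserved level is granted only from idle slots.
        grant = min(want, self.idle_slots())
        self.allocated[consumer] = held + grant
        return grant

    def _reclaim(self, needed: int, exclude: str) -> None:
        """Take back slots that other consumers hold above their deserved level."""
        for other, held in list(self.allocated.items()):
            if other == exclude or needed <= 0:
                continue
            borrowed = max(held - self.plan.deserved[other], 0)
            taken = min(borrowed, needed)
            self.allocated[other] = held - taken
            needed -= taken


# Usage: two consumers share a 10-slot grid with deserved levels of 6 and 4.
manager = ResourceManager(ResourcePlan(total_slots=10, deserved={"A": 6, "B": 4}))
manager.request("A", 9)  # A receives 9 slots: 6 deserved plus 3 idle
manager.request("B", 4)  # B receives its 4 deserved slots; 3 are reclaimed from A
```

In this sketch reclaiming is immediate and never cuts into another consumer's deserved level. A real workload manager would typically need to let running tasks finish or be preempted before a slot changes hands, which the sketch does not model.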
An area of concern in maintaining high efficiency levels in a shared computing grid is the time needed to start or initialize a service instance, especially for cases in which the service instance has to load a large software library or a large amount of data at initialization. If such service instances are started and terminated frequently, performance will suffer for the application calling the service instances.
When resource consumption of a service operating on a specific resource, referred to as a “slot”, is very large, there is a risk of overloading the slot. It is preferable to avoid situations in which multiple services with large memory consumption occupy a slot at the same time, in order to prevent the performance issues that result when the operating system resorts to swapping to free memory.