In grid computing scheduling software, different resource allocation policies may be employed. Two common resource allocation policies are often referred to as a “stacked” approach and a “balanced” approach.
When implementing a stacked resource allocation policy, for each allocation, servers are selected in the order in which they are listed in a resource group during initialization. CPU slots are allocated from one server until all the CPU slots on that server are used. The next server listed in the resource group is then selected, and CPU slots are allocated from it in the same manner until it too is fully used.
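The stacked policy can be illustrated with a minimal sketch. The function name `allocate_stacked` and the representation of a resource group as a list of `(server_name, free_slots)` pairs are assumptions made for illustration only; they are not taken from any particular grid scheduler.

```python
def allocate_stacked(servers, requested):
    """Hypothetical sketch of a stacked allocation.

    servers: list of (name, free_slots) pairs, in the order the servers
             are listed in the resource group.
    requested: number of CPU slots to allocate.
    Returns a dict mapping server name -> slots taken from that server.
    """
    allocation = {}
    remaining = requested
    for name, free in servers:  # walk servers in listed order
        if remaining == 0:
            break
        take = min(free, remaining)  # drain this server before moving on
        if take > 0:
            allocation[name] = take
            remaining -= take
    if remaining > 0:
        raise ValueError("not enough free CPU slots in the resource group")
    return allocation


group = [("s1", 4), ("s2", 4), ("s3", 4)]
print(allocate_stacked(group, 6))
```

With three servers of four free slots each, a request for six slots takes all four from `s1` and two from `s2`, and leaves `s3` untouched.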
When implementing a balanced resource allocation policy, for each allocation, servers are selected from a resource group based on the number of free CPU slots available on those servers. Slots are allocated first from the server with the highest number of free CPU slots. When all the CPU slots on that server are allocated, CPU slots are allocated from the server with the next-highest number of free CPU slots.
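A comparable sketch of the balanced policy follows. As before, the name `allocate_balanced` and the `(server_name, free_slots)` representation are illustrative assumptions. Breaking ties by resource-group order is also an assumption, since the description above does not say how ties are resolved.

```python
def allocate_balanced(servers, requested):
    """Hypothetical sketch of a balanced allocation.

    servers: list of (name, free_slots) pairs.
    requested: number of CPU slots to allocate.
    Returns a dict mapping server name -> slots taken from that server.
    """
    # Most free slots first. sorted() is stable, so servers with equal
    # free slots keep their listed order (an assumed tie-break rule).
    by_free = sorted(servers, key=lambda s: s[1], reverse=True)
    allocation = {}
    remaining = requested
    for name, free in by_free:
        if remaining == 0:
            break
        take = min(free, remaining)  # use up this server, then the next
        if take > 0:
            allocation[name] = take
            remaining -= take
    if remaining > 0:
        raise ValueError("not enough free CPU slots in the resource group")
    return allocation


group = [("s1", 2), ("s2", 8), ("s3", 5)]
print(allocate_balanced(group, 10))
```

Here a request for ten slots takes eight from `s2`, the server with the most free slots, and the remaining two from `s3`. A request for eight or fewer would be served by `s2` alone.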
The weak point of the aforementioned allocation policies is that an allocation request may be fulfilled entirely with CPU slots from a single server if the selected server has enough idle slots. In today's computing environments, servers come with many CPUs, and it is not uncommon to see servers configured with up to fifty (50) CPU slots. As such, any application having this many or fewer instances may have all of its instances allocated to CPU slots residing on the same server when one of the foregoing resource allocation policies is employed. If that server goes down, the result may be a total loss of service for the application, as all of its instances will be impacted. This presents a significant operational risk for a grid computing environment.
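The risk can be made concrete with a small, self-contained sketch. The helper names `allocate` and `surviving_instances`, the server names, and the slot counts are hypothetical and chosen only to illustrate the scenario; the tie-break by listed order in the balanced case is likewise an assumption.

```python
def allocate(servers, requested, policy):
    """Hypothetical allocator covering both policies.

    servers: list of (name, free_slots) pairs in resource-group order.
    policy: "stacked" keeps the listed order; "balanced" takes the server
            with the most free slots first (ties keep listed order).
    """
    if policy == "balanced":
        order = sorted(servers, key=lambda s: s[1], reverse=True)
    elif policy == "stacked":
        order = list(servers)
    else:
        raise ValueError(f"unknown policy: {policy}")
    allocation, remaining = {}, requested
    for name, free in order:
        if remaining == 0:
            break
        take = min(free, remaining)
        if take > 0:
            allocation[name] = take
            remaining -= take
    if remaining:
        raise ValueError("not enough free CPU slots")
    return allocation


def surviving_instances(allocation, failed_server):
    """Count instances still running after one server goes down."""
    return sum(n for name, n in allocation.items() if name != failed_server)


# Three idle servers with fifty CPU slots each; the application needs forty.
group = [("server-a", 50), ("server-b", 50), ("server-c", 50)]
for policy in ("stacked", "balanced"):
    placed = allocate(group, 40, policy)
    print(policy, placed, surviving_instances(placed, "server-a"))
```

Under these assumptions, both policies place all forty instances on `server-a`, so a failure of that one server leaves no instance of the application running.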
Accordingly, there is a need for an improved resource allocation policy that is tolerant of individual server failures.