“Cloud computing” services provide shared resources, software, and information to computers and other devices upon request. In cloud computing environments, software can be accessed over the Internet rather than installed locally on in-house computer systems. Cloud computing typically involves over-the-Internet provision of dynamically scalable and often virtualized resources. Technological details can be abstracted from the users, who no longer need expertise in, or control over, the technology infrastructure “in the cloud” that supports them.
Clouds may provide a collection of computing resources that can be utilized to perform batch processing of a large number of work items (sometimes referred to as “jobs” or simply as “items”). Items that need to be processed are typically placed in a queue, where they wait to be processed. The cloud may allocate a certain amount of computing resources for processing items in the queue. The cloud can then process the items in the queue using the allocated computing resources, for example, in a first-in-first-out (FIFO) manner. The number of items in the queue can vary over time depending on the number of new items that are added to the queue, the computational complexity of the items in the queue, the amount of resources allocated for processing items in the queue, and other factors.
The cloud can dynamically allocate computing resources for processing items in a given queue. One way that the cloud dynamically allocates computing resources is in a simple linear fashion with respect to the number of items in the queue. Stated differently, the amount of computing resources allocated for processing items in the queue increases linearly with the number of items in the queue. For example, one formula for determining the amount of computing resources to allocate for processing items in a queue is X/D, up to M, where X is the number of items in the queue, D is the number of items per computing resource, and M is the maximum amount of computing resources that can be allocated. Allocating computing resources based on this formula may lead to poor computing resource utilization. For example, if D and M are set to 500 and 20, respectively, then a queue that has 1,000,000 items would be allocated 20 computing resources (the maximum amount allowed), leaving each computing resource responsible for 50,000 items rather than the intended 500, which may not be sufficient. If D and M are set to 5,000 and 100, respectively, then a queue that has 4,000 items would be allocated a single computing resource, and would thus be processed slowly. D can be lowered to allocate more computing resources. However, if multiple systems use the same formula for allocating computing resources, computing resources may be consumed too quickly unless M is also lowered. If both D and M are lowered, then queues having a relatively small number of items will be processed quickly, but queues having a relatively large number of items will not be allocated sufficient computing resources. As such, a simple linear computing resource allocation scheme such as the one described above does not provide the flexibility to accommodate efficient processing for both large and small numbers of items in a queue.
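The linear formula above (X/D, up to M) can be illustrated with a minimal sketch. The function name `linear_allocation` and its parameter names are hypothetical and chosen only for illustration. The sketch assumes that X/D is rounded up to a whole computing resource, which is consistent with the 4,000-item example receiving a single computing resource, and that an empty queue receives none; the text above does not state either rule explicitly.

```python
def linear_allocation(x: int, d: int, m: int) -> int:
    """Return the amount of computing resources for a queue of x items.

    x: number of items in the queue (X)
    d: number of items per computing resource (D)
    m: maximum amount of computing resources that can be allocated (M)
    """
    if d <= 0 or m < 0:
        raise ValueError("d must be positive and m must be non-negative")
    if x <= 0:
        # Assumption: an empty queue is allocated no computing resources.
        return 0
    # Assumption: round X/D up so a non-empty queue gets at least one resource.
    needed = (x + d - 1) // d
    # Cap the allocation at the maximum M.
    return min(m, needed)


# The two scenarios discussed above.
large_queue = linear_allocation(1_000_000, 500, 20)   # capped at M = 20
small_queue = linear_allocation(4_000, 5_000, 100)    # a single resource
print(large_queue, small_queue)
```

Under these assumptions, the first call is capped at 20 even though 2,000 computing resources would be needed to keep 500 items per resource, and the second call yields one computing resource even though up to 100 could be allocated. This illustrates why a single pair of D and M values may not suit both large and small queues.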