A “Bag of Tasks” (BoT) application is an application that uses a technique common in distributed memory computing systems, in which a quantity of work is divided into separate tasks that are placed in a list or “bag”. Each processor takes one of the tasks from the bag and, when that task is completed, takes another.
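The take-a-task, finish, take-another pattern described above can be illustrated with a minimal sketch. It assumes that threads within a single process stand in for the processors of a distributed memory system, and the names `run_bag_of_tasks` and `work_fn` are hypothetical, introduced only for illustration.

```python
import queue
import threading


def run_bag_of_tasks(tasks, num_workers, work_fn):
    """Apply work_fn to every task, with workers pulling from a shared bag.

    Illustrative assumption: threads stand in for processors; a real
    distributed memory system would hand out tasks over a network.
    """
    tasks = list(tasks)
    bag = queue.Queue()
    for index, task in enumerate(tasks):
        bag.put((index, task))

    results = [None] * len(tasks)

    def worker():
        while True:
            try:
                # Take one task from the bag; stop when the bag is empty.
                index, task = bag.get_nowait()
            except queue.Empty:
                return
            # Each index is written by exactly one worker, so no lock is needed.
            results[index] = work_fn(task)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

For example, `run_bag_of_tasks(range(10), 3, lambda x: x * x)` returns the squares of 0 to 9 in task order, regardless of which worker handled which task. No worker needs to know in advance how many tasks it will process, which is why the pattern tolerates processors of differing speed.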
According to existing techniques, when a user wants to run a BoT application on a cluster (which is usually connected to a Grid), he or she submits a request that specifies the number of cluster nodes required and the duration of the job. Such a request is known as a “rigid request”. All of the requested nodes are then made available simultaneously to the BoT application.
However, the computing resources of such a distributed computing system may be neither homogeneous nor dedicated. The allocation time must be specified in the resource request sent to the cluster resource manager, yet the execution time of an application is difficult to estimate in such a system, which makes “rigid requests” impracticable. Furthermore, a BoT application does not require simultaneous access to all the processors in the cluster, so making it wait until all the requested processors are available at once delays its execution.
The user could in principle address this problem by submitting one request per task, but most cluster administrators limit the number of requests a user can have pending at any time, which limits the value of this approach.
One existing technique that attempts to overcome this limitation involves probing the cluster resource manager about the largest request that can currently be fulfilled, sending that request, and then repeating this strategy until the BoT application finishes. However, this approach is restricted by any policies in place in the cluster that limit individual resource consumption; indeed, such policies are often enforced even if there are idle resources.
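The probe, request and repeat strategy, and the way a per-user policy constrains it, can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyResourceManager`, `largest_request` and `run_by_probing` are hypothetical names rather than the interface of any real cluster resource manager, each granted node is assumed to run exactly one task per request, and requests are assumed to be served one after another.

```python
from dataclasses import dataclass


@dataclass
class ToyResourceManager:
    """Hypothetical stand-in for a cluster resource manager."""
    idle_nodes: int    # nodes currently idle in the cluster
    per_user_cap: int  # policy limit on nodes any one user may hold

    def largest_request(self) -> int:
        # The policy cap applies even when more nodes are idle.
        return min(self.idle_nodes, self.per_user_cap)


def run_by_probing(manager: ToyResourceManager, num_tasks: int) -> list:
    """Return the sizes of the successive requests needed to finish the bag.

    Assumption: each granted node completes one task per request.
    """
    remaining = num_tasks
    request_sizes = []
    while remaining > 0:
        # Probe for the largest request that can be fulfilled right now.
        granted = min(manager.largest_request(), remaining)
        if granted == 0:
            raise RuntimeError("no nodes can currently be granted")
        request_sizes.append(granted)
        remaining -= granted
    return request_sizes
```

With 8 idle nodes and 10 tasks, a cap of 3 nodes per user yields the request sizes `[3, 3, 3, 1]`, whereas a cap of 8 yields `[8, 2]`. In this toy model the cap doubles the number of request rounds even though idle nodes are available throughout.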