In recent years, the computing industry has moved toward implementing computing grids. In a computing grid, a plurality of distributed computing resources, such as processors, memories, and non-volatile storage (e.g., hard drives), are interconnected and shared. These resources may be dynamically provisioned and subdivided by a distributed resource manager (DRM) for purposes of executing jobs. For example, the DRM may assign one processor, one megabyte of memory, and fifty megabytes of disk space for executing a first job, and assign two processors, ten megabytes of memory, and one gigabyte of disk space for executing a second job. Because the DRM can dynamically provision the resources of the grid in almost any desired manner, the grid can be used flexibly to concurrently execute many different jobs.
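The provisioning example above can be illustrated with a minimal sketch. It is not the interface of any particular DRM: the names `Resources` and `ToyDRM`, the pool size of four processors, 64 MB of memory, and 2048 MB of disk, and the convention that one gigabyte equals 1024 megabytes are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """A bundle of grid resources (hypothetical model, not a real DRM type)."""
    processors: int
    memory_mb: int
    disk_mb: int

    def fits_within(self, other: "Resources") -> bool:
        # True if every component of this request is available in `other`.
        return (self.processors <= other.processors
                and self.memory_mb <= other.memory_mb
                and self.disk_mb <= other.disk_mb)

    def minus(self, other: "Resources") -> "Resources":
        return Resources(self.processors - other.processors,
                         self.memory_mb - other.memory_mb,
                         self.disk_mb - other.disk_mb)


class ToyDRM:
    """Toy resource manager that carves per-job allocations out of one pool."""

    def __init__(self, total: Resources) -> None:
        self.free = total
        self.assigned: dict[str, Resources] = {}

    def provision(self, job_id: str, request: Resources) -> bool:
        # Refuse the request if the pool cannot currently satisfy it.
        if not request.fits_within(self.free):
            return False
        self.free = self.free.minus(request)
        self.assigned[job_id] = request
        return True


# Assumed pool size: 4 processors, 64 MB memory, 2048 MB disk.
drm = ToyDRM(Resources(processors=4, memory_mb=64, disk_mb=2048))

# First job: one processor, one megabyte of memory, fifty megabytes of disk.
first_ok = drm.provision("job-1", Resources(1, 1, 50))

# Second job: two processors, ten megabytes of memory, one gigabyte of disk
# (taken here as 1024 MB, which is an assumption).
second_ok = drm.provision("job-2", Resources(2, 10, 1024))
```

Under these assumptions both jobs hold disjoint slices of the same pool at the same time, which is the sense in which the grid executes different jobs concurrently.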
One aspect of a grid is that, at the time a job is submitted, the set of computing resources needed to execute the job usually has to be specified. If it is not specified, the DRM will either reject the job or simply assign some default set of computing resources to it, which may bear no resemblance to the resources actually needed. Typically, the burden of determining what resources are needed to execute the job falls upon the user submitting the job. Unfortunately, the user often does not know what resources will be needed and is thus faced with a dilemma.
The user can take the safe route and grossly overestimate the amount of resources needed. However, this has the potential disadvantage of delaying the execution of the job: the more resources a job requires, the longer the DRM is likely to have to wait for those resources to become available, and thus the longer the job may remain on an execution queue. This approach also reduces the efficiency of the grid. If resources are allocated to a job but are not used, those resources are wasted; they could be better used for other jobs. Alternatively, the user can specify only the resources that the user believes will actually be sufficient to execute the job. Unfortunately, if the user underestimates the resources needed, the job could be killed midstream; that is, the DRM may kill the job if, during execution, it exceeds its allotted resources (e.g., it tries to use more memory than it was assigned). Thus, neither option is particularly attractive to the user. Given this dilemma, an improved solution is needed.
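The two horns of this dilemma can be made concrete with a small sketch. The function `run_job` and the memory figures used with it are hypothetical; the sketch models only the memory limit described above, not queueing delay or any actual DRM behavior.

```python
def run_job(requested_mb: int, peak_usage_mb: int) -> tuple[str, int]:
    """Return (outcome, wasted_mb) for a job under a hard memory limit.

    Hypothetical model: the job is killed as soon as its peak usage
    exceeds what was requested; otherwise the unused part of the
    allocation is counted as wasted.
    """
    if peak_usage_mb > requested_mb:
        return ("killed", 0)
    return ("completed", requested_mb - peak_usage_mb)


# Assume the job's true peak memory use is 10 MB, unknown to the user.
overestimate = run_job(requested_mb=1000, peak_usage_mb=10)
underestimate = run_job(requested_mb=8, peak_usage_mb=10)
```

With these assumed figures, the overestimate completes but leaves 990 MB idle for the duration of the job, while the underestimate is killed before it finishes.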