In grid-based computer implementations, it is desirable to be able to distribute work among a plurality of interconnected nodes forming a grid computing environment. Conventional approaches to this problem typically employ a distributed resource manager that allocates work to nodes having available computing resources. However, such conventional approaches are batch-oriented; that is, they rely upon the work being processed by the computers comprising the grid as a background task. In other words, in such conventional approaches, there are typically no users with active sessions waiting for results to be provided within a relatively short period of time.
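The batch-oriented allocation described above can be sketched as a simple greedy placement loop. The sketch below is illustrative only; the node model, names, and allocation policy are assumptions and are not drawn from any particular resource manager:

```python
from dataclasses import dataclass

@dataclass
class Node:
    """A compute node in the grid (hypothetical model)."""
    name: str
    capacity: int      # total work units the node can run
    assigned: int = 0  # work units currently assigned

    @property
    def available(self) -> int:
        return self.capacity - self.assigned

def allocate(nodes, tasks):
    """Greedy batch allocation: place each task on the node with the
    most spare capacity; queue it if no node can accept it.
    In the batch model, queued work simply waits for resources to
    free up, since no interactive user session is blocked on it."""
    queued = []
    for cost in tasks:
        best = max(nodes, key=lambda n: n.available)
        if best.available >= cost:
            best.assigned += cost
        else:
            queued.append(cost)
    return queued
```

Note that the manager here considers only the static capacity of each node and the incoming work; nothing in the loop reacts to how well already-placed work is performing.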
Conventional approaches typically involve the use of a statically provisioned computing grid. Accordingly, the distributed resource manager may be apprised only of the workload and capabilities of the computers in the grid. Since computing grids are conventionally not used to support scalable distributed persistent applications (SDPAs), i.e., programs without a definite termination point, there is no mechanism to determine, from the observed performance of a persistent application, whether additional resources need to be dedicated to that application. The result of this deficiency is that when a persistent application, such as a web server, is met with a surge in demand, such as that experienced by many news sites during the 9/11 attacks, such systems are not capable of adjusting to handle the increased load. In one possible approach, a larger amount of resources could be statically allocated to the application in order to provide a large safety factor. However, the excess resources would typically be idle most of the time, leading to waste and inefficiency.
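The inefficiency of static over-provisioning can be illustrated with a back-of-the-envelope sketch. All figures here are hypothetical, chosen only to show the shape of the problem:

```python
import math

def nodes_needed(load, per_node_capacity):
    """Demand-driven sizing: provision just enough nodes
    to serve the currently observed load."""
    return max(1, math.ceil(load / per_node_capacity))

# Static provisioning must be sized for the worst-case surge
# (e.g., 1000 requests/s against nodes handling 50 requests/s each).
static_nodes = nodes_needed(1000, 50)   # sized for the peak

# Under typical load, far fewer nodes are actually required.
typical_nodes = nodes_needed(100, 50)

# The remainder of the statically allocated grid sits idle.
idle_fraction = 1 - typical_nodes / static_nodes
```

With these illustrative numbers, a grid sized for the surge leaves roughly ninety percent of its nodes idle under typical load, which is precisely the waste and inefficiency noted above.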