In a traditional system, batch job requests can be executed immediately or scheduled to execute at a later time. These jobs typically share common resources such as hardware, software, and data. System overload, resource contention, and/or deadlock situations are major causes of batch job failure.
One method used to prevent batch jobs from failing is to impose a system limit (e.g., a workers' queue limit) on the number of batch jobs handled by the system. However, the system limit does not take into account that system overload, resource contention, and/or deadlock situations can occur before the system limit is reached. When any of these situations occurs, the system cannot prevent users from submitting new jobs. As a result, users who are unaware of the current state of the system continue to submit new jobs, which makes the state of the system worse. Thereafter, an administrator has to take appropriate action to restore the system.
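By way of illustration only, the shortcoming of a count-based limit can be sketched as follows. The class `CountLimitedQueue`, its `overloaded` flag, and the job names are hypothetical and are assumed solely for this example; they do not describe any particular traditional system.

```python
from dataclasses import dataclass, field


@dataclass
class CountLimitedQueue:
    """Hypothetical job queue that admits jobs purely by count."""

    limit: int                                  # fixed system limit on queued jobs
    jobs: list = field(default_factory=list)    # jobs accepted so far
    overloaded: bool = False                    # system state; never consulted by submit()

    def submit(self, job: str) -> bool:
        # The only admission check is the job count. Overload, resource
        # contention, and deadlock are not taken into account.
        if len(self.jobs) >= self.limit:
            return False
        self.jobs.append(job)
        return True


queue = CountLimitedQueue(limit=3)
queue.submit("report-a")

# The system becomes overloaded while still below the limit.
queue.overloaded = True

# The new job is still accepted, because the limit has not been reached.
accepted = queue.submit("report-b")
```

In this sketch, `accepted` is true even though the system is flagged as overloaded, which corresponds to the situation in which users continue to submit new jobs to a system that is already in difficulty.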
These are the areas that embodiments of the invention are intended to address.