The present invention generally relates to job scheduling. More particularly, the present invention is directed to automatically optimizing execution time of jobs when dispatching them over a network of computers.
Workload schedulers schedule job streams (workflows) on local or remote networks of computers. Job scheduling refers to the scheduling of batch jobs, interactive tasks, or any other job whose execution can be scheduled. In order to reach predefined performance targets (defined, for instance, in a Service Level Agreement with a customer for whom the job execution service is provided), the performance of job execution, that is, the job duration time, must be carefully monitored and optimized. Consequently, there is a need to optimize the use of the computers in the network when dispatching jobs for execution.
The usual way to optimize scheduling of jobs consists of preparing a scheduling plan according to the resources to be used by each job and the constraints in terms of execution software environment and machine resources. The association between workflows and computers is usually static. It may be necessary, however, to have a dynamic capability of adapting the load over a set of computers, for instance by re-adapting the load to the available computers at execution time. Simple dispatching of jobs on distributed computers can be done according to a policy, such as sending the job to the computer having the lowest instantaneous CPU utilization, or according to a round robin algorithm in order to ensure that the workload is spread equitably across the computers. However, these algorithms do not take into account the full instantaneous state of each computer. U.S. Pat. No. 7,302,450 discloses an efficient way to select a prioritized set of jobs among jobs planned for execution on a computer in order to optimize the use of computing resources on this computer. The algorithm used to select the jobs to be executed in priority is based on the knowledge of the list of jobs to be scheduled, the job resource consumption statistics, and predefined threshold conditions which depend on the computer resource capacities. This solution provides a sophisticated algorithm for selection of the computers that are to execute the jobs, but it requires that a large amount of information be collected about the jobs and requires knowledge of each computer's physical computing resource capacity.
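By way of illustration only, the two simple dispatching policies mentioned above (lowest instantaneous CPU utilization and round robin) can be sketched as follows. This is a minimal sketch and not the method of any cited reference; the function names, the computer identifiers, and the representation of CPU utilization as a fraction between 0 and 1 are all assumptions made for the example.

```python
from itertools import cycle
from typing import Dict, Iterable, List


def pick_lowest_cpu(cpu_utilization: Dict[str, float]) -> str:
    """Return the computer whose instantaneous CPU utilization is lowest.

    `cpu_utilization` is a hypothetical snapshot mapping a computer name
    to its current CPU utilization (assumed here to be a 0..1 fraction).
    """
    if not cpu_utilization:
        raise ValueError("no computers available")
    return min(cpu_utilization, key=cpu_utilization.get)


def dispatch_round_robin(jobs: Iterable[str], computers: List[str]) -> Dict[str, str]:
    """Assign each job to the next computer in a fixed rotation.

    The rotation ignores the current state of every computer, which is
    the limitation of this policy noted in the text.
    """
    if not computers:
        raise ValueError("no computers available")
    rotation = cycle(computers)
    return {job: next(rotation) for job in jobs}


if __name__ == "__main__":
    # Hypothetical snapshot of three computers.
    snapshot = {"host_a": 0.70, "host_b": 0.20, "host_c": 0.50}
    print(pick_lowest_cpu(snapshot))
    print(dispatch_round_robin(["job1", "job2", "job3"], ["host_a", "host_b"]))
```

Both policies rely on at most a single metric per computer, which keeps them inexpensive but leaves them unaware of other aspects of a computer's instantaneous state, such as memory or input/output load.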
There is accordingly a need to dispatch jobs for execution on a network of computers so as to minimize job execution time on those computers, while avoiding the collection and processing of excessive information from the computers, which would itself waste computing resources.