Computing systems vary in size, from systems that fit on a desktop and have a few microprocessors to large computing systems that fill large rooms and have specialized electrical and cooling requirements. Such large systems often have as many as 20,000 computing nodes, with each computing node having 40 or more central processing units (CPUs) cooperatively executing computing jobs. Computing nodes are computing entities with various resources that may be used to execute computing jobs.
As computing systems continue to grow in size and complexity, scheduling computing jobs within those computing systems becomes increasingly important for using the available resources in the system efficiently.
Traditionally, scheduling within a computing system includes examining computing jobs in the order the jobs arrive to determine whether the proper resources are available to execute a specific computing job immediately. If sufficient resources are not available to execute the computing job, the computing job is placed in a waiting queue until the required, previously unavailable resources become available. Once the required resources become available, the computing job is removed from the waiting queue and executed on a computing node with the required resources.
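The traditional approach described above can be illustrated with a minimal sketch. It is not taken from any particular scheduler: the names `Node`, `Job`, and `FifoScheduler` are hypothetical, CPUs stand in as the only resource, and the sketch assumes that when resources are released, waiting jobs are re-examined in arrival order and any job that now fits is started.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical computing node; CPUs are the only modeled resource."""
    name: str
    free_cpus: int


@dataclass
class Job:
    """Hypothetical computing job that needs `cpus` CPUs on a single node."""
    name: str
    cpus: int


class FifoScheduler:
    """Illustrative arrival-order scheduler with a waiting queue."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.waiting = deque()   # jobs that could not start, in arrival order
        self.running = {}        # job name -> (job, node it runs on)

    def _find_node(self, job):
        # Return the first node with enough free CPUs, or None.
        for node in self.nodes:
            if node.free_cpus >= job.cpus:
                return node
        return None

    def _start(self, job, node):
        node.free_cpus -= job.cpus
        self.running[job.name] = (job, node)

    def submit(self, job):
        """Start the job immediately if possible; otherwise queue it."""
        node = self._find_node(job)
        if node is None:
            self.waiting.append(job)
            return False
        self._start(job, node)
        return True

    def finish(self, job_name):
        """Release a finished job's CPUs, then re-examine the waiting queue."""
        job, node = self.running.pop(job_name)
        node.free_cpus += job.cpus
        still_waiting = deque()
        while self.waiting:
            candidate = self.waiting.popleft()
            target = self._find_node(candidate)
            if target is None:
                still_waiting.append(candidate)  # keep original order
            else:
                self._start(candidate, target)
        self.waiting = still_waiting


# Usage: one 4-CPU node; job "b" must wait until job "a" finishes.
scheduler = FifoScheduler([Node("n0", 4)])
scheduler.submit(Job("a", 3))   # starts at once
scheduler.submit(Job("b", 2))   # only 1 CPU free, so it is queued
scheduler.finish("a")           # frees 3 CPUs; "b" leaves the queue and starts
```

In this sketch a job that cannot start holds no resources while it waits, so a job that arrives later and needs fewer resources may start ahead of it. Whether that is allowed is a policy choice that the description above leaves open.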