A computer cluster, referred to as a cluster for short, is a type of computer system that completes computing jobs by means of multiple collaborative computers (also known as computing resources, including software and/or hardware resources) connected together. These computing resources, located in the same management domain, have a unified management policy and provide services to users as a whole. A single computer in a cluster system is usually called a node or a computing node.
A cluster system has many advantages. For example, when working in a load-balancing manner, the cluster system can achieve higher efficiency by having multiple computers perform the same work. The cluster system may also work in a master/slave server manner: once the master server fails, a slave server can provide services to users in place of the master server, thereby exhibiting high fault tolerance.
Since a cluster system comprises a great many computing resources, corresponding computing resources should be allocated to each computing job to be executed. This process is referred to as job scheduling in a cluster environment. Job scheduling is essentially a process of mapping jobs to corresponding resources for execution, based on the characteristics of the jobs and resources and according to scheduling policies.
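The job-to-resource mapping described above can be sketched minimally as follows. This is an illustrative assumption, not any particular scheduler's implementation; the names `Job`, `Node`, `required_cpus`, and `free_cpus` are invented for the example, and the policy shown (first node with enough free CPUs) is deliberately naive.

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    required_cpus: int

@dataclass
class Node:
    name: str
    free_cpus: int

def schedule(jobs, nodes):
    """Map each job to the first node with enough free CPUs.

    Returns a dict from job name to node name; jobs that cannot be
    placed are simply left out of the mapping.
    """
    placement = {}
    for job in jobs:
        for node in nodes:
            if node.free_cpus >= job.required_cpus:
                placement[job.name] = node.name
                node.free_cpus -= job.required_cpus  # reserve the CPUs
                break
    return placement
```

A real scheduler would rank candidate nodes according to its policy instead of taking the first fit, but the overall shape (jobs in, placement out, resources decremented) is the same.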
In brief, computing jobs request resources and utilize them. Traditional computing jobs are mainly related to high-performance computing applications such as weather forecasting, landform analysis, and similar large-scale scientific problems. As computer techniques pervade various fields, computation is increasingly demanded in new fields such as online gaming, banking, EDA design, and so on.
The computing jobs in such new fields are dramatically different from those in traditional academic institutions. For example, traditional computing jobs mainly comprise a small number of jobs that are time-consuming and computing-intensive. Such jobs are essentially of the same type (either serial or parallel). The computing jobs in the new fields are relatively complicated and have different requirements. They often comprise a large number of lightweight jobs whose types vary all the time; for example, the types can cover serial/parallel, real-time/non-real-time, and so on.
Furthermore, the rapid development of computer techniques makes computing resource environments more complicated and the scale of clusters larger. As an example, there may be millions of computing nodes in a cluster. The development of semiconductor techniques reduces the cost of hardware, which leads to a huge number of computing resources; for example, the numbers of CPUs, memories, IO devices, and the like have increased significantly. Moreover, the topology structures of computing resources have developed from the original flat structure into complicated structures with layers, dimensions, etc. Additionally, the types of computer hardware and software increase steadily. In terms of hardware, servers include, for example, the IBM x series, p series, BlueGene, etc. As to software, operating systems include, for instance, AIX, Linux, Windows, etc.
FIG. 1 shows a typical scheduling approach employed in a single-cluster scheduling system. As shown in FIG. 1, a job 101 in a job queue requests resources from a scheduler. A workload manager 102 in the scheduler walks through computing nodes 103 in the computing resources to select a node set to be allocated to run the job. Then, the workload manager 102 arranges for the job 101 to run in the selected node set.
It can be seen from the above scheduling approach that, when a job requests resources, it is necessary to traverse many nodes before identifying the ones to allocate the job to. This may be effective for a cluster system with a few computing nodes. However, the scheduling efficiency is reduced markedly in a cluster scheduling system with a large number of computing nodes. Further, for some real-time jobs (such as online gaming, banking, and the like), it is unacceptable to acquire the optimal solution with such a long time delay.
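To illustrate why traversal-based selection scales poorly, the sketch below contrasts a linear walk over all nodes with an indexed alternative. Both functions and the heap-based variant are assumptions added for illustration, not anything described in the text: walking every candidate costs O(N) per job, while keeping nodes in a max-heap keyed on free CPUs makes each selection O(log N).

```python
import heapq

def select_by_traversal(free_cpus, required):
    """O(N): walk all nodes; return index of the first node that fits."""
    for i, free in enumerate(free_cpus):
        if free >= required:
            return i
    return None

def select_by_heap(heap, required):
    """O(log N): heap holds (-free_cpus, node_id) pairs, so the top
    entry is always the node with the most free CPUs."""
    if heap and -heap[0][0] >= required:
        neg_free, node_id = heapq.heappop(heap)
        # Put the node back with its reduced free-CPU count.
        heapq.heappush(heap, (neg_free + required, node_id))
        return node_id
    return None
```

With millions of nodes, the difference between O(N) and O(log N) per job is exactly the latency gap that the real-time workloads mentioned above cannot tolerate.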
Additionally, as mentioned above, new types of computer hardware and software are continually developed. With current techniques, it is possible to produce large-scale computing nodes, such as BlueGene or large Symmetric Multiprocessing (SMP) computing nodes with 64 or even 128 CPUs, as well as low-cost computing nodes such as blade servers.
Different types of computing nodes have different typical applications and scheduling policies. As an example, a single large-scale computing node (such as a Power server with 64 CPUs) has powerful computing, storage, and fault-tolerance capabilities. Moreover, it has plenty of additional features such as Simultaneous Multithreading (SMT), affinity, and the like. Therefore, in theory, large-scale computing nodes can be shared by multiple medium- and small-scale computing applications (for example, circuit simulations, banking transaction processing, and the like). Furthermore, such nodes are generally expensive, and their scheduling policies, such as the Backfill scheduling policy, particularly emphasize refinement in order to increase resource utilization.
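The Backfill policy named above can be sketched as follows. This is a deliberately simplified single step of backfilling, an assumption for illustration only: a real backfill scheduler also computes a start-time reservation for the blocked head job so that backfilled jobs cannot delay it, which is omitted here. The core idea shown is that when the head of the FCFS queue does not fit in the currently free CPUs, a later, smaller job may jump ahead and use the otherwise idle capacity.

```python
def backfill_step(queue, free_cpus):
    """Return the next (name, required_cpus) job to start, or None.

    queue is a list of (name, required_cpus) tuples in FCFS order;
    the chosen job is removed from the queue.
    """
    if not queue:
        return None
    if queue[0][1] <= free_cpus:
        return queue.pop(0)          # head job fits: plain FCFS
    for i, (name, req) in enumerate(queue[1:], start=1):
        if req <= free_cpus:
            return queue.pop(i)      # backfill a smaller later job
    return None
```

This refinement keeps expensive large-scale nodes busy instead of leaving CPUs idle while a big job waits at the head of the queue, which is why the text associates it with utilization-focused scheduling.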
A single low-cost computing node has relatively limited computing, storage, fault-tolerance, and other capabilities, as well as few additional features. However, there are a great many such computing nodes due to their low cost, and their scheduling policies, such as round robin, tend to be unrefined.
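The round-robin policy mentioned above is trivially simple, which is exactly the point for large pools of interchangeable low-cost nodes. A minimal sketch (illustrative only) hands out nodes in rotation without inspecting load at all:

```python
from itertools import cycle

def round_robin(nodes):
    """Yield nodes in rotation; each scheduled job takes the next one."""
    return cycle(nodes)
```

Contrast this with the Backfill policy above: round robin does no per-node bookkeeping, so its per-job cost is constant, at the price of ignoring actual resource availability.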
In actual operation, different applications have different resource requirements, and even the same application may have different requirements in different execution phases. For example, a complete weather forecast application generally comprises pre-processing of raw data, computation, and some post-processing of data. In the pre-processing phase, the requirement for resources usually concentrates on IO performance. In the computing phase, the requirement for resources focuses on computing performance.
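The phase-dependent requirements described for the weather forecast example could be expressed as per-phase resource profiles. The phase names follow the text; the profile structure and the `preferred_node_type` helper are hypothetical, invented for illustration:

```python
# Dominant resource demand per phase, per the weather forecast example.
WEATHER_FORECAST_PHASES = {
    "pre_processing":  {"dominant_resource": "io"},
    "computing":       {"dominant_resource": "cpu"},
    "post_processing": {"dominant_resource": "io"},
}

def preferred_node_type(phase):
    """Pick a node type matching the phase's dominant resource demand."""
    profile = WEATHER_FORECAST_PHASES[phase]
    if profile["dominant_resource"] == "io":
        return "io_optimized"
    return "compute_optimized"
```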
In view of the above factors, there may be different types of computing nodes in the same cluster. For instance, many actual clusters are composed of multiple types of computing nodes, including nodes that excel in computing performance or in IO performance, as well as nodes with balanced performance.