1. Field of the Invention
The present invention generally relates to resource allocation. More specifically, the present invention relates to dynamic resource allocation on computer systems that make use of multi-core processing units. The present invention further relates to networks of computers with a plurality of computational nodes, which may further implement multi-core processing units.
2. Description of the Related Art
FIG. 1 is a block diagram of a multi-core based computer system 100 as might be found in the prior art. The computer system 100 of FIG. 1 includes a collection of logical and physical resources, a number of which may be shared between work load units along various possible dimensions, such as shareable in time or shareable through partitioning. Embodiments of the present invention may be implemented in the general context of a computer system 100 like that illustrated in FIG. 1.
In the system 100 of FIG. 1, four central processing units (CPUs) 6 are illustrated, each of which may be capable of independently processing a job. For any given CPU 6 to be able to process a job, that CPU 6 must have available some or all of the computer memory 9, level 2 cache 8, multiplexed bus bandwidth 10, and possibly the exclusive use of some or all of the input/output (I/O) channels 11. Access to these system resources can be shared in both quantity and time.
For example, the CPUs 6, the level 2 cache 8, multiplexed bus 10, and the I/O channels 11, and the memory 9 can be considered shareable resources of the computer system 100. The level 1 cache 7 resources are implicitly tied to the CPUs 6 in system 100 in that they are shareable with the jobs, but not independently of the CPUs 6.
The system 100 of FIG. 1 may be presented with a stream of tasks or jobs, each of which requires a spectrum of shareable resources to be available for dispatch and execution. The computer system 100 will, ultimately, be faced with the problem of scheduling the execution of the individual jobs in some manner that will cause all of them to be executed by the system 100. The scheduling problem can be resolved along many dimensions in that most job schedulers are designed to achieve some measurable goal, such as maximizing the throughput of jobs, minimizing the response time of jobs, or satisfying some constraint in the processing of the jobs, such as respecting deadlines for job completion.
Before any job can be dispatched to the computational resource, however, the job scheduler must be capable of assigning to the job all—and not less than all—of the shareable computer resources that are needed to run the particular job. For example, if a job needs a specific amount of main memory to run, it may not be dispatched until the job scheduler can be certain that the specific amount of memory is available for use by the job. An identical constraint exists with respect to execution of all other pending jobs.
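The all-or-nothing dispatch constraint described above can be sketched as follows. This is a minimal illustration, not a description of any particular scheduler; the `Job` class and resource names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job carrying its full shareable-resource demand."""
    name: str
    memory_mb: int
    io_channels: int


def can_dispatch(job, free_memory_mb, free_io_channels):
    """A job may be dispatched only if ALL of its required resources
    are simultaneously available; a partial grant is never sufficient."""
    return (job.memory_mb <= free_memory_mb
            and job.io_channels <= free_io_channels)


job = Job("example", memory_mb=512, io_channels=1)
print(can_dispatch(job, free_memory_mb=1024, free_io_channels=2))  # True
print(can_dispatch(job, free_memory_mb=256, free_io_channels=2))   # False
```

If any single resource in the job's demand vector is unavailable, the job remains queued even though every other resource could be granted.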
A simple albeit limited strategy for the scheduling of jobs on the computational resource is to limit the number of jobs running on the system at any one time to exactly one job. Presuming that it is feasible to run all of the jobs in the request queue, each of those jobs will then run in some sequence determined by the job scheduler, one at a time.
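The one-job-at-a-time strategy amounts to strictly sequential execution, as in the following sketch (the `execute` callback stands in for actual job processing and is an assumption of the illustration):

```python
def run_one_at_a_time(jobs, execute):
    """Limit concurrency to exactly one job: each job runs to
    completion before the next one is dispatched."""
    completed = []
    for job in jobs:  # sequence chosen by the job scheduler
        execute(job)  # blocks until the job finishes
        completed.append(job)
    return completed


order = run_one_at_a_time(["job_a", "job_b", "job_c"],
                          execute=lambda job: None)
print(order)  # ['job_a', 'job_b', 'job_c']
```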
This strategy is inefficient with respect to the use of available computer resources in that a given job will, generally, be unable to consume 100 percent of the computer system's resources over the elapsed time needed to complete processing. Available resources are thus left idle or, at best, underutilized. For example, data transfer between computer memory and relatively slower speed devices means that the processing unit of the computer will spend time waiting for the completion of slow speed events during processing of the job. This 'idling' means that other pending jobs could be exploiting idle system resources through overlapping processing whereby the resources of the computer system can be shared between jobs.
Sharing resources may take any number of forms such as a scheme based on swapping of resources, partitioning, or some combination of both. For example, many computer systems implement a swapping mechanism for the sharing of memory whereby the memory resident components of a job are copied to a disk storage unit and replaced by the disk based image of the memory resident components of another task. In those environments where the processing unit is a single-core device or a monolithic processing element, time on the computing resource may be shared among a number of jobs using swapping, whereby the core's time is partitioned and then allocated according to some heuristic policy.
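The memory-swapping mechanism described above can be sketched as a simple exchange between a resident set and a disk store. The dictionaries and job names here are hypothetical placeholders for the resident and disk-based images:

```python
def swap(memory, disk, outgoing, incoming):
    """Copy the memory-resident image of `outgoing` to the disk store,
    then load the disk-based image of `incoming` into the freed memory."""
    disk[outgoing] = memory.pop(outgoing)    # save resident components to disk
    memory[incoming] = disk.pop(incoming)    # restore the other job's image
    return memory, disk


memory = {"job_a": b"resident image of A"}
disk = {"job_b": b"disk image of B"}
memory, disk = swap(memory, disk, outgoing="job_a", incoming="job_b")
print(sorted(memory))  # ['job_b']
print(sorted(disk))    # ['job_a']
```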
In those instances where a computer system is presented with a job stream that does not represent an over-consumption of resources, job throughput may steadily improve until all of the shareable resources are fully consumed. A consequence of over-consumption, however, is that the response time or service level of the system will degrade once the mean arrival rate of jobs at the input queue exceeds the mean service rate of the jobs. The effect on job processing is that the amount of time a particular work unit spends waiting for access to a resource increases with the number of jobs in the system.
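The degradation described above follows the standard behavior of a single-server queue. As one hedged illustration, the textbook M/M/1 model gives the mean time a job spends in the system as 1/(mu - lambda), which grows without bound as the arrival rate lambda approaches the service rate mu:

```python
def mm1_mean_time_in_system(arrival_rate, service_rate):
    """Mean time in system for an M/M/1 queue: 1 / (mu - lambda).
    As arrivals approach the service rate, waiting time diverges;
    beyond it, the queue is unstable and grows without bound."""
    if arrival_rate >= service_rate:
        return float("inf")  # over-consumption: no steady state exists
    return 1.0 / (service_rate - arrival_rate)


print(mm1_mean_time_in_system(5.0, 10.0))   # 0.2
print(mm1_mean_time_in_system(9.0, 10.0))   # 1.0
print(mm1_mean_time_in_system(12.0, 10.0))  # inf
```

Doubling the load from 5 to 9 jobs per unit time against the same service rate multiplies the mean time in system fivefold, illustrating why waiting time increases sharply with the number of jobs in the system.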
There is a need in the art for the automatic and systematic scheduling of jobs in a computer system to optimize job throughput while simultaneously minimizing the amount of time a job waits for access to a shareable resource in the system.