Field of the Disclosure
This disclosure relates generally to parallel computing, and more particularly to systems and methods for dynamic co-scheduling of hardware contexts for parallel runtime systems on high-utilization shared machines.
Description of the Related Art
Traditionally, parallelism has been exploited in high performance computing (HPC) and multi-threaded servers in which jobs are often run on dedicated machines, or on fixed sets of cores (or hardware contexts) in a shared machine. Traditional HPC jobs have long, stable CPU-bound phases with fixed resource requirements. Traditional servers exploit the ability to process independent requests in parallel. There is often little parallelism within each request. This style of synchronization lets traditional servers run well on current operating systems.
In contrast, many emerging parallel workloads exhibit CPU demands that vary over time. For example, in graph analytic jobs, the degree of parallelism can both vary over time and depend on the structure of the input graph. Other examples include cases in which parallelism is used to accelerate parts of an interactive application (occurring in bursts in response to user input). Current operating systems and runtime systems do not perform well for these types of workloads (e.g., those with variable CPU demands and frequent synchronization between parallel threads). Typical solutions attempt to avoid interference between jobs either by over provisioning machines, or by manually pinning different jobs to different cores/contexts.
Software is increasingly written to run on multi-processor machines (e.g., those with multiple single-core processors and/or those with one or more multi-core processors). In order to make good use of the underlying hardware, customers want to run multiple workloads on the same machine at the same time, rather than dedicating a single machine to a respective single workload.