As semiconductor technology continues to inch closer to practical limitations in terms of increases in clock speed, architects are increasingly focusing on parallelism in processor architectures to obtain performance improvements. At the integrated circuit device, or chip, level, multiple processing cores are often disposed on the same chip, functioning in much the same manner as separate processor chips, or to some extent, as completely separate computers. In addition, even within cores, parallelism is employed through the use of multiple execution units that are specialized to handle certain types of operations. Pipelining is also employed in many instances so that certain operations that may take multiple clock cycles to perform are broken up into stages, enabling other operations to be started prior to completion of earlier operations. Multithreading is also employed to enable multiple instruction streams to be processed in parallel, enabling more overall work to be performed in any given clock cycle.
The net result of applying the aforementioned techniques is an ability to provide a multithreaded processing environment with a pool of hardware threads distributed among one or more processing cores in one or more processor chips and in one or more computers, and capable of processing a plurality of instruction streams in parallel. It is expected that as technology advances, processor architectures will be able to support hundreds or thousands of hardware threads, and when multiple processors are combined into high performance computing systems such as supercomputers and massively parallel computers, a potential exists to support millions of hardware threads. With such multithreaded processing environments, workload distribution across the large number of hardware threads is an increasingly important factor in realizing the efficiencies of parallel processing.
Therefore, a significant need exists in the art for a manner of efficiently distributing workloads in a multithreaded environment to maximize performance in that environment.