The invention is generally related to data processing, and in particular to dispatching and issuing instructions in computer processors.
As semiconductor technology continues to inch closer to practical limitations in terms of increases in clock speed, architects are increasingly focusing on parallelism in processor architectures to obtain performance improvements. At the chip level, multiple processor cores are often disposed on the same chip, functioning in much the same manner as separate processor chips or, to some extent, as completely separate computers. In addition, even within cores, parallelism is employed through the use of multiple execution units that are specialized to handle certain types of operations. Pipelining is also employed in many instances so that certain operations that may take multiple clock cycles to perform are broken up into stages, enabling other operations to be started prior to completion of earlier operations. Multithreading is also employed to enable multiple instruction streams to be processed in parallel, enabling more overall work to be performed in any given clock cycle.
In some existing designs, specific resources and pipelines are allocated for execution of different instruction streams, and multiple pipelines allow program execution to continue even when a given pipeline is busy. However, resources may still be tied up for pipelines that are busy, and when all of the pipelines assigned to an instruction stream are busy, the instruction stream may become stalled, reducing the potential throughput of the processor core. Some existing designs are also limited in terms of the different types of instructions that may be supported, such that they generally support only a single instruction width, such as 32 bits, 64 bits, or 128 bits.