The invention relates generally to computer architectures and more specifically to the management of pipelining in a superscalar, superpipelined processor.
Superscalar processors allow the execution of multiple instructions simultaneously. Historically, computer software has generally been written as a sequence of instructions, with each instruction to be executed before the one that succeeds it. However, if a processor executes the instructions serially, one instruction at a time, the performance of the processor is limited. Thus, superscalar processors provide performance improvements by executing several instructions at once.
A technique known as pipelining is used in superscalar processors to increase performance. Pipelining provides an "assembly line" approach to executing instructions. The execution of an instruction is divided into several steps. A superscalar processor is provided with a number of stages. Each stage performs a step in the execution of the instructions. Thus, while one step in the execution of one instruction is being performed by one stage of the processor, another step in the execution of another instruction may be performed by another stage of the processor. Since the execution of several instructions can be staggered across several stages, it is possible to begin a new instruction every clock cycle, even if the instructions require several clock cycles to be completed.
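By way of illustration only, the staggering described above may be sketched as follows. The sketch assumes an idealized single pipe in which every stage takes exactly one clock cycle and no instruction ever waits; the function names and the cycle numbering (starting at 1) are hypothetical and are not taken from any particular processor.

```python
def pipeline_schedule(num_instructions, num_stages):
    """Map each instruction index to the clock cycle in which it occupies each stage.

    Assumption: an ideal pipe, one cycle per stage, a new instruction begun every cycle.
    """
    return {
        i: [i + s + 1 for s in range(num_stages)]
        for i in range(num_instructions)
    }


def total_cycles_pipelined(num_instructions, num_stages):
    """Cycle in which the last instruction leaves the last stage."""
    schedule = pipeline_schedule(num_instructions, num_stages)
    return max((cycles[-1] for cycles in schedule.values()), default=0)


def total_cycles_serial(num_instructions, num_stages):
    """Cycles needed if each instruction finishes all stages before the next begins."""
    return num_instructions * num_stages


# Four instructions through a five-stage pipe.
print(pipeline_schedule(4, 5)[1])    # [2, 3, 4, 5, 6]
print(total_cycles_pipelined(4, 5))  # 8
print(total_cycles_serial(4, 5))     # 20
```

Under these assumptions, each instruction still requires five cycles to complete, but a new instruction is begun every cycle, so four instructions finish in 8 cycles rather than 20.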
However, it is often necessary to know the result of one instruction before executing the instruction that succeeds it. If a pipelined superscalar processor attempts to execute an instruction for which antecedent instructions have not yet been fully executed, the pipeline may be forced to stop and wait until all antecedent conditions for the execution of the instruction have been met.
Superpipelining refers to pipelining using pipes with more than five stages. Superpipelining extends the benefits of pipelining, but increases the potential for delays caused by dependencies between instructions. Thus, a pipe may be forced to stop and wait several clock cycles in order to satisfy a dependency based on an instruction being processed in another pipe.
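The effect of a dependency, and of pipe depth, on such delays may be illustrated with a simplified model. The sketch below assumes a single in-order pipe, one cycle per stage, and no forwarding, so that a result becomes available only in the cycle after its producing instruction leaves the last stage. The instruction encoding, register names, and function names are hypothetical and chosen only for this example.

```python
def issue_cycles(instructions, num_stages):
    """Return the cycle in which each instruction enters the first stage.

    instructions: list of (dest_register, [source_registers]) in program order.
    Assumptions: in-order issue, one cycle per stage, no result forwarding.
    """
    ready_at = {}   # register -> first cycle in which its value is available
    starts = []
    next_free = 1   # earliest cycle in which the first stage is free
    for dest, sources in instructions:
        # Wait for the first stage to be free and for every source to be ready.
        start = max([next_free] + [ready_at.get(reg, 1) for reg in sources])
        starts.append(start)
        ready_at[dest] = start + num_stages  # available after the last stage
        next_free = start + 1
    return starts


def stall_cycles(starts):
    """Count the cycles in which no instruction could be issued."""
    return starts[-1] - len(starts) if starts else 0


# The second instruction reads r1, which the first instruction produces.
program = [("r1", []), ("r2", ["r1"]), ("r3", [])]

print(issue_cycles(program, 5))                 # [1, 6, 7]
print(stall_cycles(issue_cycles(program, 5)))   # 4
print(issue_cycles(program, 10))                # [1, 11, 12]
print(stall_cycles(issue_cycles(program, 10)))  # 9
```

In this model the same single dependency costs 4 idle cycles in a five-stage pipe and 9 idle cycles in a ten-stage pipe, which is consistent with the observation that deeper pipes increase the potential delay caused by dependencies.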
While a pipeline structure may be optimized for certain conditions, it is extremely difficult to optimize performance for all possible sequences of instructions. Thus, a technique is needed that improves pipeline performance beyond the level that can be achieved by changes to the pipeline structure.