Technical Field
The present disclosure relates generally to information processing systems and, more specifically, to processors that utilize predication.
Background Art
In modern processor designs, one method of increasing performance is executing multiple instructions per clock cycle. The performance of such processors is dependent on the amount of instruction level parallelism (ILP) exposed by the compiler and exploited by the microarchitecture. One approach for increasing the amount of ILP that is available for compile-time instruction scheduling is predication. Predication is also useful for decreasing the number of branch mispredictions.
Predication is a method of converting control flow dependencies to data dependencies. A predicated execution model is an architectural model where an instruction is guarded by a Boolean operand whose value determines if the instruction is executed or nullified. For example, the Explicitly Parallel Instruction Computing (“EPIC”) architecture utilized by Itanium® and Itanium® 2 microprocessors features a set of 64 predicate registers to support conditional execution of instructions by providing the Boolean predication operand.
To exploit ILP, a compiler can take full advantage of predication by applying a technique referred to as if-conversion, which converts control flow dependencies into data flow dependencies. With if-conversion, the compiler can collapse multiple control flow paths and schedule them based only on data dependencies.
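As an illustration only, the effect of if-conversion can be sketched in Python. The helper `guarded` and the functions `abs_diff_branchy` and `abs_diff_predicated` are hypothetical names introduced for this sketch; they do not model any particular instruction set. The sketch assumes a compare operation that produces two complementary predicates, and models a nullified instruction as one that leaves its destination unchanged.

```python
def guarded(pred, op, old):
    """Model one predicated instruction (hypothetical helper).

    When pred is true the operation executes and its result is written;
    when pred is false the instruction is nullified and the old value stays.
    """
    return op() if pred else old


def abs_diff_branchy(a, b):
    # Original form: two control flow paths selected by a branch.
    if a > b:
        r = a - b
    else:
        r = b - a
    return r


def abs_diff_predicated(a, b):
    # If-converted form: a single straight-line path with no branch.
    # The compare defines complementary predicates p1 and p2.
    p1 = a > b
    p2 = not p1
    r = 0
    # Both operations are present in the instruction stream; each depends
    # only on its guarding predicate (a data dependence), not on a branch.
    r = guarded(p1, lambda: a - b, r)
    r = guarded(p2, lambda: b - a, r)
    return r
```

In the predicated form the two guarded operations have no control dependence on each other, so a scheduler is free to order them based on data dependencies alone.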
The Itanium® and Itanium® 2 microprocessors, for instance, support predication and issue instructions in program order. However, predication may also provide performance benefits for processors that allow instructions to issue out of order. An out-of-order execution model is, in general, more complex than a static execution model. Static execution executes code in the order scheduled statically by the compiler, while out-of-order execution permits the processor to dynamically adjust instruction scheduling to the run-time behavior of the program.
Because of this ability to adapt to the run-time environment, dynamic out-of-order execution has been employed in many processor designs. For processors that allow instructions to issue out of order, register renaming is used to increase the number of instructions that a superscalar processor can issue in parallel. Renaming each independent definition of an architectural register to a different physical register in a physical register file improves parallelism by preventing dependence-induced delays. Renaming removes WAR (write-after-read) and WAW (write-after-write) dependencies and allows multiple independent instructions that write to the same architectural register to issue concurrently.
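The renaming idea described above can be sketched, under simplifying assumptions, as a small Python model. The function `rename`, the instruction encoding `(dest, sources)`, and the register names are hypothetical and chosen only for illustration. The sketch assumes an unbounded supply of physical registers and omits freeing of registers, predication, and any real hardware structure.

```python
from itertools import count


def rename(instrs):
    """Rename architectural registers to physical registers (illustrative).

    instrs is a list of (dest, sources) tuples in program order.
    Every write to an architectural register gets a fresh physical register,
    so later writers never overwrite a value an earlier reader still needs.
    """
    fresh = count()   # assumption: unlimited physical registers
    mapping = {}      # architectural register -> current physical register
    renamed = []
    for dest, srcs in instrs:
        phys_srcs = []
        for s in srcs:
            # A register read before any write gets an initial mapping.
            if s not in mapping:
                mapping[s] = f"p{next(fresh)}"
            phys_srcs.append(mapping[s])
        # Each new definition is given its own physical register.
        mapping[dest] = f"p{next(fresh)}"
        renamed.append((mapping[dest], tuple(phys_srcs)))
    return renamed


program = [
    ("r1", ("r2", "r3")),  # writes r1
    ("r4", ("r1",)),       # reads r1: true (RAW) dependence on the first write
    ("r1", ("r5", "r6")),  # writes r1 again: WAW with the first, WAR with the second
]
renamed = rename(program)
```

After renaming, the two writes to `r1` target different physical registers, so the third instruction no longer has to wait for the first two, while the true dependence of the second instruction on the first is preserved.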
Efficiency of the renaming mechanism in an out-of-order processor may drive processor performance. That is, the renaming mechanism and its associated physical register file may represent critical resources for an out-of-order processor architecture. Implementation of predication on such processors poses interesting issues.