Superscalar microprocessors include a plurality of execution units that execute the microinstruction set of the microprocessor, which allows them to execute multiple instructions per clock cycle in parallel. The execution units load and store microinstruction operands, calculate addresses, perform logical and arithmetic operations, and resolve branch instructions, for example. A key to realizing the potential performance gain is to keep the execution units supplied with instructions to execute; otherwise, superscalar performance is no better than scalar performance, yet it incurs a much greater hardware cost. The greater the number and variety of execution units, the farther into the program instruction stream the processor must be able to look to find an instruction for each execution unit to execute each clock cycle. This is commonly referred to as the lookahead capability of the processor.
In a superscalar microprocessor with out-of-order execution, instructions may execute out of order, but they must retire in program order. Such microprocessors therefore require a buffer that holds microinstructions after execution so that they can be retired in program order. In some microprocessors, the buffer is called a reorder buffer, or ROB. The ROB has a fixed number of entries, and provides temporary storage for microinstructions and status information associated with each microinstruction. Retiring a microinstruction that is in the ROB includes storing the result of the microinstruction to architectural registers of the microprocessor and freeing (i.e., invalidating) the ROB entry occupied by the microinstruction so that a new microinstruction may be allocated an entry in the ROB.
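The retirement behavior described above can be illustrated with a small software model. This is only a sketch under simplifying assumptions, not a description of any particular hardware: the names `RobEntry`, `ReorderBuffer`, `allocate`, `complete`, and `retire` are hypothetical, each microinstruction is assumed to write a single architectural register, and exceptions and branch mispredictions are ignored.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class RobEntry:
    """Hypothetical ROB entry: destination register plus status information."""
    dest: str                      # architectural register the result targets
    result: Optional[int] = None   # temporary storage for the result
    done: bool = False             # set when execution completes


class ReorderBuffer:
    """Hypothetical fixed-size ROB that retires strictly in program order."""

    def __init__(self, size: int):
        self.size = size
        self.entries = deque()     # oldest entry at the left
        self.arch_regs = {}        # architectural register state

    def allocate(self, dest: str) -> Optional[RobEntry]:
        """Allocate an entry in program order; return None if the ROB is full."""
        if len(self.entries) == self.size:
            return None
        entry = RobEntry(dest)
        self.entries.append(entry)
        return entry

    def complete(self, entry: RobEntry, result: int) -> None:
        """Record a result; may be called in any order (out-of-order execution)."""
        entry.result = result
        entry.done = True

    def retire(self) -> int:
        """Retire completed entries from the oldest end only; return the count."""
        retired = 0
        while self.entries and self.entries[0].done:
            entry = self.entries.popleft()              # free the ROB entry
            self.arch_regs[entry.dest] = entry.result   # update architectural state
            retired += 1
        return retired


rob = ReorderBuffer(size=4)
a = rob.allocate("r1")       # older instruction
b = rob.allocate("r2")       # younger instruction
rob.complete(b, 20)          # the younger instruction finishes first
first = rob.retire()         # nothing retires: the oldest entry is not done
rob.complete(a, 10)
second = rob.retire()        # both retire, oldest first
```

In this sketch the younger microinstruction completes first but cannot update architectural state until the older one has completed, which mirrors the in-order retirement requirement.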
The size, i.e., number of entries, of the ROB limits the lookahead capability of the processor. In particular, the size of the ROB limits the number of instructions that can be ready to be issued for execution, since an instruction must have a ROB entry allocated to it before it can be ready to issue. When all entries of the ROB are allocated, no new instruction can be allocated an entry until the oldest instruction retires, i.e., updates architectural state with its result, so that its ROB entry can be freed for re-allocation. One approach to increasing the lookahead capability of a microprocessor is to increase the number of entries in the ROB. However, each ROB entry takes a relatively large amount of space and power in the microprocessor to store its information, e.g., the instruction itself, temporary space for storing its result, and other information about the instruction. Consequently, enlarging the ROB is a relatively costly way to increase the lookahead capability of a microprocessor.
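The effect of ROB size on lookahead can be sketched with a toy cycle-count model. The function `cycles_to_retire` and its parameters are hypothetical and the model is deliberately simplified: it assumes independent instructions, an unlimited number of execution units, execution that begins on allocation, and up to `width` allocations per cycle.

```python
from collections import deque


def cycles_to_retire(latencies, rob_size, width=2):
    """Return the number of cycles needed to retire every instruction in order."""
    rob = deque()        # finish cycle of each in-flight instruction, oldest first
    next_instr = 0       # index of the next instruction to allocate
    retired = 0
    cycle = 0
    while retired < len(latencies):
        cycle += 1
        # Retire from the oldest end only, and only completed instructions.
        while rob and rob[0] <= cycle:
            rob.popleft()
            retired += 1
        # Allocate new entries while the ROB has room (at most `width` per cycle).
        allocated = 0
        while (next_instr < len(latencies)
               and len(rob) < rob_size
               and allocated < width):
            rob.append(cycle + latencies[next_instr])
            next_instr += 1
            allocated += 1
    return cycle


# One long-latency instruction followed by seven single-cycle instructions.
program = [10] + [1] * 7
small = cycles_to_retire(program, rob_size=2)   # 14 cycles in this model
large = cycles_to_retire(program, rob_size=8)   # 11 cycles in this model
```

With the two-entry ROB, the long-latency instruction at the head blocks allocation, so the younger instructions cannot begin until it retires. With the eight-entry ROB, they execute in its shadow and all retire together once it completes.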
Therefore, what is needed is a way to use the ROB in as efficient a manner as possible to improve performance through good execution unit utilization, while keeping the size of the ROB as small as possible.