1. Technical Field
The present invention relates in general to data processing and, in particular, to the execution of load instructions by a processor. Still more particularly, the present invention relates to a processor that buffers load data for out-of-order load instructions in order to reduce the performance penalty associated with data hazards.
2. Description of the Related Art
A typical superscalar processor can comprise, for example, an instruction cache for storing instructions, one or more execution units for executing sequential instructions, a branch unit for executing branch instructions, instruction sequencing logic for routing instructions to the various execution units, and registers for storing operands and result data. In order to leverage the parallel execution capabilities of these multiple execution units, some superscalar processors support out-of-order execution, that is, the execution of instructions in a different order than the programmed sequence.
When executing instructions out-of-order, it is essential for correctness that the processor produce the same execution results that would have been produced had the instructions been executed in the programmed sequence. For example, given the following sequence of instructions:
    LOAD1
    ADD
    STORE
    . . .
    LOAD2
where LOAD1 and LOAD2 target the same address and LOAD1 precedes LOAD2 in program order, LOAD2 cannot be permitted to receive older data than LOAD1. However, if LOAD2 is executed prior to (i.e., out-of-order with respect to) LOAD1, LOAD2 may receive older data than LOAD1 if the intervening STORE is targeted at the same address or if another processor within the same computer system stores to the same address. A scenario in which an out-of-order executed load instruction receives incorrect data is defined herein to be a data hazard.
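By way of illustration only, the following minimal sketch models the case in which another processor stores to the shared address between the two loads. The `run` helper, the event tuples, the address `0x100`, and the values "old" and "new" are assumptions made for the example and are not part of any described processor.

```python
def run(schedule, memory):
    """Apply load/store events in execution order; return the value each load saw."""
    mem = dict(memory)          # copy so each run starts from the same state
    loaded = {}
    for op, name, addr, value in schedule:
        if op == "load":
            loaded[name] = mem[addr]
        elif op == "store":
            mem[addr] = value
    return loaded

ADDR = 0x100                    # hypothetical shared target address
initial = {ADDR: "old"}

load1 = ("load", "LOAD1", ADDR, None)
load2 = ("load", "LOAD2", ADDR, None)
remote_store = ("store", "REMOTE_STORE", ADDR, "new")  # store by another processor

# Loads executed in program order: LOAD2 sees data at least as new as LOAD1.
in_order = run([load1, remote_store, load2], initial)

# LOAD2 executed ahead of LOAD1: LOAD2 sees older data than LOAD1 (a data hazard).
out_of_order = run([load2, remote_store, load1], initial)

print(in_order)      # {'LOAD1': 'old', 'LOAD2': 'new'}
print(out_of_order)  # {'LOAD2': 'old', 'LOAD1': 'new'}
```

In the second schedule, LOAD1 observes the newer value while the later-in-program-order LOAD2 retains the older one, which is the ordering that must not be permitted.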
Superscalar processors that support out-of-order execution of load instructions typically detect and correct for data hazards by implementing a load queue that stores the target address of each load instruction that was executed out-of-order. Following execution of the out-of-order load instruction, the addresses of exclusive transactions (e.g., read-with-intent-to-modify or kill) driven on the computer system interconnect by other processors, as well as the addresses of store instructions that precede the load instruction and are initiated by the processor itself, are snooped against the entries within the load queue. If a snooped exclusive transaction or a local store operation hits within the load queue, the matching entry is marked, for example, by setting a flag.
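The marking behavior of such a load queue can be sketched as follows. This is a simplified software model under stated assumptions, not a description of any particular hardware: the class and method names, the `seq` program-order tag, and the use of exact address equality (rather than, say, a cache-line comparison) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LoadQueueEntry:
    address: int          # target address of the out-of-order load
    seq: int              # hypothetical program-order tag of that load
    marked: bool = False  # flag set when a snooped address hits this entry


class LoadQueue:
    def __init__(self):
        self.entries = []

    def record_out_of_order_load(self, address, seq):
        """Allocate an entry for a load that executed out-of-order."""
        self.entries.append(LoadQueueEntry(address, seq))

    def snoop(self, address):
        """Compare a snooped address (exclusive transaction or local store)
        against every entry; mark each hit. Returns True if any entry hit."""
        hit = False
        for entry in self.entries:
            if entry.address == address:
                entry.marked = True
                hit = True
        return hit


queue = LoadQueue()
queue.record_out_of_order_load(address=0x100, seq=4)
queue.record_out_of_order_load(address=0x200, seq=6)

hit = queue.snoop(0x100)    # e.g. a kill transaction from another processor
miss = queue.snoop(0x300)   # address not held by any entry
```

After the two snoops, only the entry for address `0x100` carries the flag; the entry for `0x200` is unaffected.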
Thereafter, when the processor executes a load instruction, the processor determines whether or not the load instruction precedes the out-of-order load instruction in program order and whether or not the subsequently executed load instruction targets an address specified in a marked entry in the load queue. If so, a data hazard is detected, and the processor flushes and re-executes at least both load instructions, and possibly all instructions in flight following the first of the two load instructions in program order. Flushing and re-executing instructions in this manner to remedy data hazards results in a significant performance penalty, particularly for processors having wide instruction execution windows.
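The detection check and the broader of the two flush options described above might be modeled as in the sketch below. The `Entry` type, the integer `seq` tags standing in for program order, and the `detect_hazard` and `instructions_to_flush` helpers are assumptions introduced for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    address: int          # target address of an out-of-order load
    seq: int              # hypothetical program-order tag (smaller = older)
    marked: bool = False  # set earlier by a snoop hit


def detect_hazard(entries, load_address, load_seq):
    """Return the marked entries that conflict with the load now executing:
    same target address, and the executing load is older in program order."""
    return [
        e for e in entries
        if e.marked and e.address == load_address and load_seq < e.seq
    ]


def instructions_to_flush(in_flight, first_load_seq):
    """Broad remedy: flush the older load and everything in flight after it."""
    return [seq for seq in in_flight if seq >= first_load_seq]


entries = [Entry(0x100, seq=4, marked=True), Entry(0x200, seq=6)]

# An older load (seq 1) to 0x100 executes after the younger load (seq 4).
conflicts = detect_hazard(entries, load_address=0x100, load_seq=1)

# Same address, but the executing load is younger than the entry: no hazard.
no_conflict = detect_hazard(entries, load_address=0x100, load_seq=5)

# Unmarked entry: no hazard even though the address matches.
unmarked = detect_hazard(entries, load_address=0x200, load_seq=1)

flushed = instructions_to_flush(in_flight=[1, 2, 3, 4, 5], first_load_seq=1)
```

With a five-instruction window, the single conflict causes every in-flight instruction from the older load onward to be flushed, which suggests why the penalty grows with the width of the execution window.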