1. Technical Field
The present invention relates in general to data processing systems and in particular to processors. Still more particularly, the present invention relates to an improved method and system for tracking instruction dependency in an out-of-order processor.
2. Description of the Related Art
Early processors executed only one instruction at a time and executed instructions in an order determined by the compiled machine-language program running on the processor. Such processors are known as “sequential” processors. Various techniques, such as pipelining, superscaling, and speculative instruction execution, are utilized to improve the performance of sequential processors. Pipelining breaks the execution of instructions into multiple stages, in which each stage corresponds to a particular execution step. Pipelined designs enable new instructions to begin executing before previous instructions are finished, thereby increasing the rate at which instructions can be executed.
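The throughput benefit of pipelining can be illustrated with simple cycle arithmetic. The following is a minimal sketch, assuming an idealized pipeline in which every stage takes exactly one clock cycle and no stalls occur; the function names are hypothetical and chosen only for illustration.

```python
def sequential_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed when each instruction completes every stage
    before the next instruction is allowed to begin."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed in an ideal, stall-free pipeline (an assumption):
    the first instruction takes num_stages cycles to fill the pipeline,
    after which one instruction completes on every cycle."""
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)
```

Under these assumptions, 100 instructions on a five-stage design take 500 cycles without pipelining and 104 cycles with it. Real pipelines fall short of this ideal because of the stalls discussed below.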
“Superscalar” processors include two or more instruction execution pipelines and can process instructions in parallel in order to execute multiple instructions per processor clock cycle. Parallel processing requires that instructions be dispatched for execution at a sufficient rate. However, the execution rate of processors has typically outpaced the ability of memory devices and data buses to supply instructions and data to the processors. Therefore, conventional processors utilize one or more levels of on-chip cache memory to increase memory access rates.
Superscalar processors can execute instructions simultaneously only when no data dependencies exist between the instructions undergoing execution in the pipelines. Thus, an instruction that depends on one or more preceding instructions to load required data into working operand registers cannot execute until all of the required data have been retrieved from cache or main memory. Furthermore, execution units cannot predict how long it may take to load data into the working operand registers. Older processors handled this uncertainty by delaying execution until the required data are fetched (i.e., by “stalling” the execution pipeline). Stalling instruction execution until data dependencies are resolved is inconsistent with high-speed processing requirements.
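The data dependency and stalling behavior described above can be sketched as follows. This is an illustrative model only: the `Instruction` record, the register names, and the fixed producer latency are assumptions, since in practice the load latency is not known in advance.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Instruction:
    """Hypothetical instruction record: one destination, several sources."""
    name: str
    dest: Optional[str]
    sources: Tuple[str, ...]


def depends_on(consumer: Instruction, producer: Instruction) -> bool:
    """True when the consumer reads the register the producer writes."""
    return producer.dest is not None and producer.dest in consumer.sources


def earliest_issue_cycle(consumer: Instruction, producer: Instruction,
                         producer_issue_cycle: int,
                         producer_latency: int) -> int:
    """Earliest cycle at which a stalling (non-speculative) design can
    issue the consumer. A dependent consumer waits for the producer's
    result; an independent one may issue alongside the producer."""
    if depends_on(consumer, producer):
        return producer_issue_cycle + producer_latency
    return producer_issue_cycle
```

For example, an add that reads `r1` cannot issue until a preceding load into `r1` has delivered its data, whereas an unrelated subtract may issue in the same cycle as the load.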
Consequently, conventional processors utilize speculative instruction execution to address pipeline stalls by enabling a second instruction that is data dependent on a first instruction to enter an execution pipeline before the first instruction has passed completely through the execution pipeline. Thus, in speculative execution processors, the data dependent second instruction, which is often referred to as a consumer instruction and which depends on the first (or producer) instruction, begins execution speculatively in order to avoid a pipeline stall.
In order to maintain correctness, processors flush incorrectly executed speculative instructions and their results. Conventional processors detect and correct such misspeculation by tracking instruction dependencies using large physical register mappers. The register mappers enable dependency chains to be established based on physical register identifiers. However, these register mappers are complex and typically rely on content-addressable memory (CAM) functions to concurrently evaluate large numbers of physical register identifiers. To enable recovery from flushes, processors also save a previous physical register state for each instruction group and for each physical register type, thus requiring a large amount of memory area and slowing the process of register state recovery.
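The cost of this conventional approach can be seen in a simplified software model. The sketch below is not taken from any particular processor; the `RegisterMapper` class, its method names, and the use of increasing integer group identifiers are all assumptions made for illustration. It shows a full copy of the mapping state being saved per instruction group, and a CAM-style search that compares one physical register identifier against every in-flight source operand.

```python
class RegisterMapper:
    """Hypothetical model of a conventional physical register mapper."""

    def __init__(self, num_logical: int, num_physical: int):
        # Assumption: logical register i initially maps to physical register i.
        self.mapping = list(range(num_logical))
        self.free = list(range(num_logical, num_physical))
        self.checkpoints = {}

    def checkpoint(self, group_id: int) -> None:
        """Save a full copy of the mapper state for one instruction group."""
        self.checkpoints[group_id] = (list(self.mapping), list(self.free))

    def rename_dest(self, logical: int) -> int:
        """Allocate a free physical register for a destination operand."""
        phys = self.free.pop(0)
        self.mapping[logical] = phys
        return phys

    def lookup(self, logical: int) -> int:
        """Return the physical register currently mapped to a logical one."""
        return self.mapping[logical]

    def consumers_of(self, phys: int, in_flight) -> list:
        """CAM-style match: compare phys against every in-flight source tag.
        Hardware performs these comparisons concurrently; here it is a loop."""
        return [name for name, sources in in_flight if phys in sources]

    def flush(self, group_id: int) -> None:
        """Restore the state saved for group_id and drop younger checkpoints."""
        self.mapping, self.free = self.checkpoints[group_id]
        self.checkpoints = {g: s for g, s in self.checkpoints.items()
                            if g < group_id}
```

Even in this small model, each checkpoint duplicates the entire mapping, and each dependency query touches every in-flight operand, which reflects the memory area and complexity concerns noted above.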