1. Technical Field
The present invention relates in general to data processing systems and in particular to microprocessors. Still more particularly, the present invention relates to an improved method and system for dependency tracking and flush recovery for an out-of-order microprocessor.
2. Description of the Related Art
Early microprocessors executed only one instruction at a time and executed instructions in an order determined by the compiled machine-language program running on the microprocessor. Such microprocessors are known as “sequential” microprocessors. Various techniques, such as pipelining, superscalar execution, and speculative instruction execution, are utilized to improve the performance of sequential microprocessors. Pipelining breaks the execution of instructions into multiple stages, in which each stage corresponds to a particular execution step. Pipelined designs enable new instructions to begin executing before previous instructions are finished, thereby increasing the rate at which instructions can be executed.
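By way of illustration only, the throughput benefit of pipelining can be sketched with a simple cycle count. The sketch below assumes an idealized pipeline in which every stage takes exactly one clock cycle and no stalls occur; the function names are hypothetical and are not drawn from any particular microprocessor.

```python
def sequential_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed when each instruction must finish all stages
    before the next instruction may begin (no overlap)."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed in an idealized pipeline (assumption: one cycle
    per stage, no stalls). The first instruction takes num_stages
    cycles; each later instruction completes one cycle after the
    instruction ahead of it."""
    if num_instructions == 0:
        return 0
    return num_stages + num_instructions - 1


# Example: 10 instructions on a hypothetical 5-stage pipeline.
unpipelined = sequential_cycles(10, 5)
pipelined = pipelined_cycles(10, 5)
```

Under these assumptions the pipelined design approaches one completed instruction per cycle as the number of instructions grows, which is the rate increase described above.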
“Superscalar” microprocessors typically include two or more instruction execution pipelines and can process instructions in parallel, thereby executing multiple instructions per microprocessor clock cycle. Parallel processing requires that instructions be dispatched for execution at a sufficient rate. However, the execution rate of microprocessors has typically outpaced the ability of memory devices and data busses to supply instructions to the microprocessors. Therefore, conventional microprocessors utilize one or more levels of on-chip cache memory to increase memory access rates.
Cache memory includes one or more levels of dedicated high-speed memory for storing recently accessed instructions and data. Cache memory technology is based on the premise that microprocessors frequently re-execute the same instructions and/or execute different instructions using recently accessed data. When data is read from main memory, the cache memory saves a copy of the data and an index corresponding to the location in main memory. The cache system monitors subsequent requests for data to see if any requested information is already stored in the cache. If the cache system finds that requested data is stored in the cache, often referred to as a cache “hit”, the data is delivered immediately to the microprocessor from the cache. If requested data is not currently stored in the cache, often referred to as a cache “miss”, the requested data is fetched directly from main memory and saved in the cache for future use.
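The hit and miss behavior described above can be sketched, purely for illustration, as a small software model. The class name, the use of a dictionary keyed by main-memory address, and the absence of any capacity limit or eviction policy are all simplifying assumptions; an actual hardware cache is organized quite differently.

```python
class SimpleCache:
    """Toy model of a cache in front of main memory.

    Assumptions (not from the text above): unbounded capacity,
    no eviction, and one data value per address.
    """

    def __init__(self, main_memory: dict):
        self.main_memory = main_memory  # address -> data
        self.lines = {}                 # cached copies, indexed by address
        self.hits = 0
        self.misses = 0

    def read(self, address: int):
        if address in self.lines:
            # Cache hit: deliver the saved copy immediately.
            self.hits += 1
            return self.lines[address]
        # Cache miss: fetch from main memory and save for future use.
        self.misses += 1
        data = self.main_memory[address]
        self.lines[address] = data
        return data


# Example: the first read of an address misses, the second hits.
cache = SimpleCache({0x10: 7, 0x20: 9})
first = cache.read(0x10)
second = cache.read(0x10)
```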
Superscalar microprocessors can process instructions simultaneously only when no data dependencies exist between the instructions in each of the pipelines. An instruction that depends on one or more preceding instructions to load required data into working operand registers cannot execute until all of the required data have been retrieved from cache or main memory. Furthermore, execution units cannot predict how long it may take to load data into the working operand registers. Older microprocessors handled this uncertainty by delaying execution until the required data was fetched (i.e., by “stalling” the execution pipeline). This stalling was inconsistent with high-speed processing requirements.
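As a hedged illustration of the data dependency constraint described above, the following sketch checks whether one instruction reads a register that an earlier instruction writes. The `Instr` record and both function names are hypothetical, and the sketch models only this read-after-write case; other hazards that a real microprocessor must consider are omitted.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction record: one destination register
    and zero or more source registers."""
    name: str
    dest: str
    sources: Tuple[str, ...]


def depends_on(later: Instr, earlier: Instr) -> bool:
    """True if `later` reads the register that `earlier` writes."""
    return earlier.dest in later.sources


def can_issue_in_parallel(first: Instr, second: Instr) -> bool:
    """Assumption: only read-after-write dependencies are checked."""
    return not depends_on(second, first)


# Example: the add needs r1, which the load has not yet produced,
# so the two instructions cannot execute simultaneously.
load = Instr("load", dest="r1", sources=("r0",))
add = Instr("add", dest="r2", sources=("r1", "r3"))
mul = Instr("mul", dest="r4", sources=("r5", "r6"))
```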
Conventional microprocessors utilize speculative instruction execution to address pipeline stalls by enabling a second instruction that is data dependent on a first instruction to enter an execution pipeline before the first instruction has passed completely through the execution pipeline. Thus, in speculative execution microprocessors, the data dependent second instruction, which is often referred to as a consumer instruction, depends on the first instruction, which is referred to as a producer instruction.
In microprocessors that utilize speculative instruction execution, there is a delay between the decision to issue an instruction and the actual execution of the instruction. Thus, in the case of load instructions, there may be a significant delay between the issue of a load instruction and the corresponding data fetch from cache memory. A consumer load instruction, dependent on a delayed producer instruction, may be issued before confirmation by the cache system that the required load data is available in the cache. When the required data is not found in the cache, dependent consumer load instructions can execute and access incorrect data.
In order to maintain correctness, microprocessors flush incorrectly executed speculative instructions and their results. Conventional microprocessors detect and correct such misspeculation by tracking instruction dependencies using large physical register mappers. The register mappers enable dependency chains to be established based on physical register names. However, these register mappers are complex and typically rely on content-addressable memory (CAM) functions to concurrently evaluate large numbers of physical registers. To enable recovery from cache flushes, microprocessors also save a previous physical register state for each instruction group and for each physical register type, thus requiring a large amount of memory area and slowing the process of register state recovery. Consequently, an improved method for dependency tracking and flush recovery for an out-of-order microprocessor is needed.
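For illustration only, the notion of a dependency chain established from physical register names can be sketched as follows. This is a simplified software model, not the register mapper hardware described above and not the improved method: the `RenamedInstr` record and the `dependents_to_flush` function are hypothetical, and the sketch assumes that each physical register is written by at most one instruction in the window.

```python
from typing import List, NamedTuple, Tuple


class RenamedInstr(NamedTuple):
    """Hypothetical instruction after register renaming."""
    name: str
    dest: str                 # physical register written
    sources: Tuple[str, ...]  # physical registers read


def dependents_to_flush(program: List[RenamedInstr],
                        producer_index: int) -> List[str]:
    """Return, in program order, the names of instructions that
    directly or transitively consume the result of the producer
    at `producer_index`.

    Assumption: each physical register has a single writer, so a
    tainted register name is never overwritten with good data.
    """
    tainted = {program[producer_index].dest}
    flushed = []
    for instr in program[producer_index + 1:]:
        if any(src in tainted for src in instr.sources):
            # This consumer used bad data; its own result is bad too.
            tainted.add(instr.dest)
            flushed.append(instr.name)
    return flushed


# Example: the load's data turns out to be unavailable (cache miss).
program = [
    RenamedInstr("load", "p1", ("p0",)),
    RenamedInstr("add", "p2", ("p1", "p3")),
    RenamedInstr("mul", "p4", ("p5", "p6")),
    RenamedInstr("sub", "p7", ("p2", "p4")),
]
to_flush = dependents_to_flush(program, 0)
```

In this example the add consumes the load result directly and the sub consumes it through the add, while the mul is independent and need not be flushed.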