1. Field of the Invention
This invention relates generally to pipelined processors, and more particularly, to a replay mechanism for a processor pipeline.
2. Description of the Related Art
Computers and many other types of machines are engineered around a "processor" that executes programmed instructions stored in the machine's memory. One may categorize computers and processors by the complexity of their instruction sets, such as reduced instruction set computers ("RISC") and complex instruction set computers ("CISC"). An architecture is a categorization defining the interface between the processor's hardware and the processor's instruction set.
A first aspect of a processor's architecture is whether it executes instructions sequentially or out of order. Historically, processors executed instructions one at a time, in the same sequential order in which the code for the instructions was presented to the processor. This architecture is the "sequential programming model." An out of order architecture executes instructions in an order different from the order in which the code is presented to the processor, i.e., non-sequentially.
The sequential nature of software code creates "data dependencies" and "control dependencies." A data dependency occurs when a later instruction manipulates an operand x, and the data at x is a result from an earlier instruction. The later instruction has a data dependency on the operand of the earlier instruction. A control dependency occurs when an instruction can generate two alternative branches of instructions only one of which will be executed. Typically, the branch choice depends on a condition. The various architectures respect these data and control dependencies.
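By way of illustration only, a data dependency of the kind described above may be sketched as follows. The three-field instruction format and the function name `has_data_dependency` are assumptions made for this example and do not describe any particular processor.

```python
# Hypothetical instruction format: (destination, source1, source2).
def has_data_dependency(earlier, later):
    """Return True if `later` reads the operand that `earlier` writes."""
    dest, _, _ = earlier
    _, src1, src2 = later
    return dest in (src1, src2)

i1 = ("x", "a", "b")  # x = a op b
i2 = ("y", "x", "c")  # y = x op c, reads the result written by i1
i3 = ("z", "a", "c")  # z = a op c, does not read x

print(has_data_dependency(i1, i2))  # i2 depends on i1
print(has_data_dependency(i1, i3))  # i3 is independent of i1
```

In this sketch the second instruction cannot produce a correct result until the first has produced x, whereas the third may proceed without regard to the first.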
A second aspect of a processor's architecture is whether instruction processing is "pipelined." In pipelined processing, the processor fetches instructions from memory and feeds them into one end of the pipeline. The pipeline has several "stages," each stage performing some function necessary or desirable to process the instruction before passing the instruction to the next stage. For instance, one stage might fetch an instruction, the next stage might decode the instruction, and the next stage might execute the decoded instruction. Each stage typically moves the instruction closer to completion.
A pipeline may offer an advantage in that one part of the pipeline is working on a first instruction while a second part of the pipeline is working on a second instruction. Thus, more than one instruction can be processed at a time, potentially increasing the effective rate at which instructions are processed.
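The overlap among stages may be sketched, purely as an illustration, with a simplified model. The model assumes three stages, one clock cycle per stage, and no stalls; the function name `pipeline_schedule` is hypothetical.

```python
STAGES = ["fetch", "decode", "execute"]  # assumed three-stage pipeline

def pipeline_schedule(num_instructions, stages=STAGES):
    """Return one dict per clock cycle mapping stage name -> instruction index."""
    total_cycles = len(stages) + num_instructions - 1
    schedule = []
    for cycle in range(total_cycles):
        occupancy = {}
        for position, name in enumerate(stages):
            instr = cycle - position  # instruction currently in this stage
            if 0 <= instr < num_instructions:
                occupancy[name] = instr
        schedule.append(occupancy)
    return schedule

for cycle, occupancy in enumerate(pipeline_schedule(4)):
    print(cycle, occupancy)
```

Under these assumptions, four instructions complete in 3 + 4 - 1 = 6 cycles, compared with 4 x 3 = 12 cycles if each instruction had to finish all three stages before the next one began.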
Some pipelines process instructions "speculatively." Speculative execution means that instructions are fetched and executed before resolving pertinent control and/or data dependencies. Speculative execution predicts how data and/or control dependencies will be resolved, executes instructions based on the predictions, and then verifies that the predictions were correct before retiring the instruction and results therefrom.
The verification step can be a challenge to pipeline design. At the end of the pipeline, the results from executed instructions are temporarily stored in a register until all data and control dependencies have actually been resolved. The pipeline then checks whether any mispredictions or other problems occurred, both of which are generally referred to as exceptions. In the absence of execution problems, the executed instructions are "retired" and results are stored to architectural registers, an operation referred to as "commitment to an architectural state." If execution problems occur, the processor performs a correction routine.
Execution problems are problems that can result in:
(1) executing an instruction that should not have been executed;
(2) not executing an instruction that should have been executed; or
(3) executing an instruction with incorrect data.
To process the instruction stream correctly, the effects of execution problems on subsequent execution of instructions must also be corrected.
Many prior art pipelined processors "stall" the pipeline upon detecting an exception. In stallable instruction pipelines, a number of latches or registers govern progress through the stages of the pipeline. A pipeline controller generates a signal to enable or disable the latches or registers. During a stall, the latches or registers are disabled so that the instructions are not transferred to the next stage. After the exception that caused the stall and its effects are repaired, the pipeline controller re-enables the latches or registers, and transfers between pipeline stages resume.
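The effect of the enable signal on the latches may be sketched as follows. This is a simplified software model offered only for illustration; the list representation of the latches and the function name `step` are assumptions of the example.

```python
def step(latches, incoming, enable):
    """Model one clock edge.

    `latches` holds one instruction per stage, first stage at index 0.
    Returns (new_latches, instruction_leaving_last_stage_or_None).
    """
    if not enable:
        # Stall: latches are disabled, so every stage holds its instruction.
        return list(latches), None
    # Normal operation: each instruction moves to the next stage.
    return [incoming] + latches[:-1], latches[-1]

latches = ["i2", "i1", "i0"]
held, out = step(latches, "i3", enable=False)   # stalled cycle
moved, done = step(latches, "i3", enable=True)  # normal cycle
print(held, out)
print(moved, done)
```

While the enable signal is deasserted, no instruction advances and none leaves the last stage; once it is reasserted, transfers resume from the state that was held.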
To operate a stallable pipeline, the pipeline controller needs to receive status signals from the stages of the pipeline, determine whether to stall from the received signals, and then broadcast a signal to stall or proceed. Since each of these steps takes time, implementing the ability to stall may limit the operating frequency of the pipeline.
Some processor pipelines "replay" in addition to stalling. Replay is the re-execution of instructions upon detecting an exception. If an exception is detected, speculative results are ignored, e.g., the architectural state is not updated and instructions are not retired. The processor corrects the problem and re-executes the instructions.
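Replay may likewise be sketched in simplified form. The example below assumes, hypothetically, that the only exception is a load that misses in a cache; the names `run_with_replay`, `cache`, and `memory` are illustrative and are not drawn from any particular processor.

```python
def run_with_replay(program, cache, memory):
    """Execute (dest, addr) loads, replaying any load that raises an exception.

    Returns (architectural_state, number_of_replays).
    """
    arch_state = {}
    replays = 0
    pc = 0
    while pc < len(program):
        dest, addr = program[pc]
        if addr in cache:
            # No exception: retire and commit to the architectural state.
            arch_state[dest] = cache[addr]
            pc += 1
        else:
            # Exception: nothing is committed and the instruction is not retired.
            cache[addr] = memory[addr]  # correct the problem
            replays += 1                # pc is unchanged, so the load re-executes
    return arch_state, replays

memory = {0x10: 7, 0x20: 9}
cache = {0x10: 7}
program = [("r1", 0x10), ("r2", 0x20)]
print(run_with_replay(program, cache, memory))
```

In this sketch the second load is executed twice: the first attempt is discarded without updating the architectural state, and the second attempt retires after the problem has been corrected.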
One processor employing replay is the Alpha 21164 microprocessor, commercially available from Digital Equipment Corporation. The Alpha 21164 stalls only the first three stages of the pipeline. If a problem occurs after the third stage, the Alpha 21164 replays the entire pipeline after repairing the problem. The Alpha 21164 therefore combines expensive stalling with the complex decision-making circuitry necessary to determine when to replay. The Alpha 21164 replays the entire pipeline even though the problem may be localized. Replaying the entire pipeline may be inefficient if there are several parallel execution units, e.g., in a superscalar processor, and the problem was localized to one of the parallel execution units.
The demand for faster processors continually outstrips present technology. The demand pressures all aspects of processor architecture to become faster in the sense of higher instruction throughput. Current techniques for handling exceptions in pipelined processing can substantially reduce instruction throughput.
The present invention is directed to overcoming, or at least reducing the effects of, one or more of the problems set forth above.