1. Field of the Invention
The present invention relates to techniques for improving the performance of computer systems. More specifically, the present invention relates to a method and apparatus for performing register file checkpointing in an efficient manner to support speculative execution within a computer system.
2. Related Art
Speculative execution techniques (i.e., mechanisms that allow a microprocessor to execute instructions before it is known whether they should be executed and/or what their operands are) are becoming an increasingly important means of improving microprocessor performance. To ensure that speculative execution works correctly, current microprocessors do not allow speculative instructions to update the architectural register state until the speculation is confirmed to be correct. The register state produced by the speculative instructions is instead buffered in temporary registers. However, this solution is infeasible if the speculation (e.g., value prediction for a load that must access main memory because it missed the processor's caches) takes a long time to resolve. This is because a large number of instructions (on the order of hundreds) may be speculatively executed before the speculation is resolved. Consequently, the amount of register state that must be buffered is prohibitive.
In such situations, a more feasible solution is to save (i.e., checkpoint) the architectural register state of the processor at the point of speculation, and to then restore this state if the speculation turns out to be incorrect. In the interval between the speculation and its resolution, the speculative instructions are free to update the architectural register state. However, checkpointing the architectural register state is a challenging technical problem for processors with large numbers of architectural registers (such as processors that implement register windows). This is because checkpointing and restoring a large architectural register state can take a long time and a large amount of space.
The common technique for checkpointing is to copy the contents of all the architectural registers to memory or to some temporary storage area at the point of speculation, and to then copy all of this state back into the architectural registers when the speculation is determined to be incorrect. However, this technique is inefficient if the architectural register state is large (e.g., in a processor that implements register windows) because the copy will either take many clock cycles to accomplish or require a high-bandwidth, expensive copy mechanism.
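The copy-all technique described above can be illustrated with a minimal software sketch. This is a toy model only, not a description of any particular hardware: the class name RegisterFile, its methods, and the register count are all hypothetical names chosen for illustration. The point it shows is that both the checkpoint and the restore touch every register, so their cost grows with the size of the architectural register state.

```python
class RegisterFile:
    """Toy architectural register file with copy-all checkpointing (hypothetical model)."""

    def __init__(self, num_regs):
        self.regs = [0] * num_regs
        self._saved = None  # stands in for memory or a temporary storage area

    def checkpoint(self):
        # Copy every architectural register at the point of speculation.
        self._saved = list(self.regs)
        return len(self._saved)  # number of register values copied

    def restore(self):
        # Speculation was incorrect: copy every saved value back.
        if self._saved is None:
            raise RuntimeError("no checkpoint to restore")
        self.regs[:] = self._saved
        self._saved = None

    def commit(self):
        # Speculation was correct: the saved copy is no longer needed.
        self._saved = None


rf = RegisterFile(num_regs=8)
rf.regs[3] = 42
copied = rf.checkpoint()   # copies all 8 registers
rf.regs[3] = 99            # speculative update to architectural state
rf.restore()               # misspeculation: all 8 registers copied back
```

In this sketch a register file with many more entries (as with register windows) would copy proportionally more values on every checkpoint and restore, which is the inefficiency the paragraph above describes.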
This problem is further magnified if multiple checkpoints are required to facilitate nested speculative execution. Note that if the speculation takes a long time to resolve, the program is likely to encounter additional instructions where speculation is required. In order to keep executing the program in this situation, the system must be able to support nested speculation. In nested speculative execution, the system speculatively predicts the results of additional instructions while the speculation on an earlier instruction is still pending. If a particular speculation is incorrect, the architectural state immediately prior to that speculation is restored. Any state maintained for subsequent speculations (such as checkpoints) may be discarded because execution is restarted from the incorrectly predicted instruction.
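The restore-and-discard behavior of nested speculation can likewise be sketched in software. The following is an illustrative model under stated assumptions: the class NestedCheckpoints and its methods speculate and misspeculate are hypothetical, each checkpoint is a full copy of the registers, and confirmation of correct speculations is omitted for brevity. It shows that a failed speculation restores the state saved immediately before it and drops every later checkpoint.

```python
class NestedCheckpoints:
    """Toy model of nested speculation using a stack of full-copy checkpoints (hypothetical)."""

    def __init__(self, num_regs):
        self.regs = [0] * num_regs
        self.stack = []  # oldest checkpoint first

    def speculate(self):
        # Save the state immediately prior to this speculation.
        self.stack.append(list(self.regs))
        return len(self.stack) - 1  # index identifying this speculation

    def misspeculate(self, cp_id):
        # Restore the state saved just before the failed speculation,
        # then discard that checkpoint and all later (nested) ones.
        self.regs[:] = self.stack[cp_id]
        del self.stack[cp_id:]


n = NestedCheckpoints(num_regs=4)
n.regs[0] = 1
outer = n.speculate()      # first speculation, state has regs[0] == 1
n.regs[0] = 2
inner = n.speculate()      # nested speculation while the first is pending
n.regs[0] = 3
n.misspeculate(outer)      # outer speculation fails; inner checkpoint is dropped too
```

Because every nested speculation in this sketch stores another full copy of the registers, the space and copy cost of the copy-all approach is multiplied by the nesting depth.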
Hence, what is needed is a method and an apparatus for checkpointing the registers to support speculative execution without encountering the problems described above.