1. Field of the Invention
This invention relates to digital data processing systems, and particularly to a renaming scheme used for supporting out-of-order speculative execution of instructions.
2. Description of Background
Many factors affect performance and effective utilization of resources in modern processors. These factors include data dependencies between instructions as specified by the semantics of a program, the finite number of architected registers determined by the instruction set architecture (ISA), and the inability to disambiguate all memory references at compile time, each of which limits the amount of instruction-level parallelism (ILP) that is exposed through program binaries.
Several micro-architecture techniques are used in processors to extract more ILP from programs. For example, a small number of architected or logical register names causes output and anti-dependencies between instructions in a program binary. These false dependencies affect processor performance because they force serialized execution of instructions. Register renaming is a technique used by the instruction-scheduling unit of out-of-order superscalar processors to eliminate serialized execution of instructions due to output and anti-dependencies. Register renaming is a process of mapping a relatively small architected or logical register name space to a large physical register name space in order to enable out-of-order execution of multiple instructions in such a manner that they are constrained only by true data dependencies. Moreover, the register renaming logic, with the help of some additional hardware, is also used to enable speculative execution of instructions by maintaining multiple versions of logical registers (in different physical registers) and by providing support to restore the processor state to an appropriate non-speculative state whenever the speculative execution turns out to be wrong.
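The mapping-based renaming described above can be illustrated with a minimal software sketch. It is not the hardware of any particular processor or of the exemplary embodiments; the function name `rename`, the `(dest, src1, src2)` instruction tuples, the register counts, and the free list are all assumptions made for illustration only.

```python
from collections import deque

def rename(instructions, num_logical=4, num_physical=8):
    """Map (dest, src1, src2) logical register numbers to physical ones.

    Hypothetical model: logical register i initially lives in physical
    register i, and the remaining physical registers form a free list.
    """
    mapping = list(range(num_logical))
    free_list = deque(range(num_logical, num_physical))
    renamed = []
    for dest, src1, src2 in instructions:
        # Sources read the current mapping, so true dependencies are kept.
        p1, p2 = mapping[src1], mapping[src2]
        if not free_list:
            raise RuntimeError("no free physical register; dispatch would stall")
        # Every write gets a fresh physical register, which removes
        # output (WAW) and anti (WAR) dependencies on the logical name.
        pd = free_list.popleft()
        mapping[dest] = pd
        renamed.append((pd, p1, p2))
    return renamed

program = [
    (1, 2, 3),  # r1 = r2 op r3
    (2, 0, 0),  # r2 = r0 op r0  (anti-dependence on r2 with the first)
    (1, 2, 3),  # r1 = r2 op r3  (output dependence on r1 with the first)
]
print(rename(program))  # -> [(4, 2, 3), (5, 0, 0), (6, 5, 3)]
```

After renaming, the three destinations are distinct physical registers, so only the true dependency of the third instruction on the second (through physical register 5) remains.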
Several hardware techniques have been proposed and used for implementing register renaming for out-of-order execution of instructions. These register-renaming techniques are broadly classified into two approaches. One approach is based on Tomasulo's algorithm using reservation stations, and the other approach is based on a mapping table. Additional structures such as reorder buffers, history buffers, future files, checkpoint/backup register files or shadow mapping tables may be added to or combined with the basic renaming hardware to save and restore the architected register state for supporting speculative execution and precise interrupts.
In particular, the instructions in a program binary use a small set of logical (or architected) register names specified by the ISA of the processor. The logical registers are mapped to a larger set of physical registers by the register renaming hardware. The physical registers used for storing the logical register values are organized either as a single register file or as two register files, namely an Architected Register File (ARF) and a Rename Register File (RRF).
The register contents are copied from the rename register file to the architected register file (and the rename register is marked free for renaming) whenever all the instructions referring to a particular architected register are completed. Note that, unlike in a compiler, the only way to know that the live range of a register has ended is to observe that the register becomes the target of a younger instruction. However, in our renaming scheme, for the purpose of copying a rename register entry, it is assumed that the live range of an architected register ends whenever none of the in-flight instructions use it as a source register. To identify the set of “active” registers that will not be read by any in-flight instructions, a counter may be associated with each mapper table entry as described in the conference paper entitled “Register Renaming and Dynamic Speculation: an Alternative Approach” by Moudgill et al., published in the Proceedings of the 26th Annual International Symposium on Microarchitecture, 1993. The counter associated with a renamed physical register may be incremented at dispatch for every source register mapped to it and decremented whenever such an instruction completes. This counter-based solution is expensive and therefore undesirable. The exemplary embodiments propose detecting the end of the “live range” among the instructions in flight in order to reclaim a physical register in the RRF used for renaming.
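The counter-based bookkeeping summarized above can be sketched in software as follows. This is only an illustration of the increment-at-dispatch, decrement-at-completion idea as stated in the text, not the design in the cited paper and not the mechanism of the exemplary embodiments; the class name `ReaderCounters` and its methods are hypothetical.

```python
class ReaderCounters:
    """One pending-read counter per physical register (hypothetical model)."""

    def __init__(self, num_physical):
        self.pending_reads = [0] * num_physical

    def dispatch(self, source_regs):
        # Each source operand mapped to a physical register adds one reader.
        for p in source_regs:
            self.pending_reads[p] += 1

    def complete(self, source_regs):
        # A completing instruction releases the readers it registered.
        for p in source_regs:
            self.pending_reads[p] -= 1

    def no_pending_readers(self, p):
        # True when no in-flight instruction still reads register p.
        return self.pending_reads[p] == 0


counters = ReaderCounters(num_physical=8)
counters.dispatch((4, 4))   # instruction A reads physical register 4 twice
counters.dispatch((4, 5))   # instruction B reads physical registers 4 and 5
counters.complete((4, 4))   # A completes; B still needs register 4
print(counters.no_pending_readers(4))  # -> False
counters.complete((4, 5))   # B completes
print(counters.no_pending_readers(4))  # -> True
```

The sketch makes the cost visible: every physical register needs its own counter, and every dispatch and completion must update the counters of all its source operands.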
One common characteristic of all the prior register renaming schemes is that they use an associative search, implemented with either CAM (Content Addressable Memory) structures or an array of parallel comparators, on tables with a large number of entries. The complexity and power consumption of the logic structures used for implementing such associative search functions are among the major inhibitors for implementing high-frequency out-of-order superscalar processors. Therefore, a register-renaming scheme that does not involve any associative search functions is highly desirable.
It has been shown that out-of-order execution helps improve the performance of programs, in particular the SPECint benchmarks. However, having a separate mapper (register renaming unit) for each type of register is expensive, making it unattractive for wide-issue high-frequency superscalar processor designs. The exemplary embodiments use processor architectures with a separate set of architected register files for each type of register and each thread, along with a small set of physical registers that use a shared mapper for renaming. As evident from the above discussion, conventional methods have limitations or inefficiencies, requiring a different approach to efficiently manage the renaming operations in an out-of-order superscalar processor.
Thus, it is well known that register renaming affects the effective utilization of resources in modern processors. Therefore, it is desired to develop an efficient method and apparatus for renaming registers that reduces hardware complexity as well as the number of rename registers by using multiple physical register files and avoiding associative searching.