1. Technical Field
The present application relates generally to an improved data processing system and method. More specifically, the present application is directed to a universal register rename mechanism for instructions with multiple targets in a microprocessor.
2. Description of Related Art
Register renaming is a common technique in microprocessor design used to increase performance by allowing instructions to execute out of order. Programs are composed of instructions that operate on values. The instructions must name these values in order to distinguish them from one another. A typical instruction might say, for example, add X and Y and put the result in Z. In this instruction, X, Y, and Z are the names of storage locations. In order to have a compact instruction encoding, most processor instruction sets have a small set of special locations that can be directly named. In smaller processors, the names of these locations correspond directly to elements of a register file.
Different instructions take different amounts of time. For instance, a processor may be able to execute hundreds of instructions while a single load from main memory is in progress. Shorter instructions executed while the load is outstanding will finish first; therefore, instructions finish out of the original program order. Out-of-order execution has been used in most recent high-performance CPUs to achieve some of their speed gains.
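As a rough illustration of instructions finishing out of program order, the following Python sketch computes completion order from issue time plus latency. The instruction names, issue cycles, and latencies are hypothetical values chosen for illustration, and the model assumes the instructions are independent of one another.

```python
# Each entry is (name, issue_cycle, latency_in_cycles).
# All values are illustrative assumptions, not measurements.
instructions = [
    ("load r1", 0, 200),  # long-latency load from main memory
    ("add r2", 1, 1),
    ("add r3", 2, 1),
    ("mul r4", 3, 4),
]


def completion_order(instrs):
    """Return instruction names sorted by the cycle in which they finish."""
    finished = [(issue + latency, name) for name, issue, latency in instrs]
    return [name for _, name in sorted(finished)]


print(completion_order(instructions))
```

In this model the load is issued first but finishes last, because the three short instructions complete while it is still outstanding.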
Consider this piece of code running on an out-of-order CPU:
1. Load register 1 from memory location 1024
2. Add the number 2 to register 1
3. Store register 1 to memory location 1032
4. Load register 1 from memory location 2048
5. Add the number 4 to register 1
6. Store register 1 to memory location 2056
Instructions 4, 5, and 6 are independent of instructions 1, 2, and 3, but the processor cannot finish 4 until 3 is done. If instruction 4 overwrote register 1 first, instruction 3 would store the wrong value to memory location 1032.
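As a minimal sketch of this restriction, the six instructions can be run through a toy interpreter in Python. The tuple encoding, the `run` helper, and the initial memory contents (10 and 100) are assumptions made for illustration and do not correspond to any real instruction set. Executing instruction 4 ahead of instruction 3 makes the store to location 1032 pick up the value of the second load.

```python
# Each instruction is (operation, register, operand).
# For load/store the operand is a memory address; for add it is a constant.
ORIGINAL = [
    ("load", 1, 1024),
    ("add", 1, 2),
    ("store", 1, 1032),
    ("load", 1, 2048),
    ("add", 1, 4),
    ("store", 1, 2056),
]


def run(program, order, memory):
    """Execute the 1-based instruction numbers in `order`; return final memory."""
    regs = {}
    mem = dict(memory)  # copy so the caller's initial memory is not modified
    for number in order:
        op, reg, operand = program[number - 1]
        if op == "load":
            regs[reg] = mem[operand]
        elif op == "add":
            regs[reg] = regs[reg] + operand
        elif op == "store":
            mem[operand] = regs[reg]
    return mem


initial = {1024: 10, 2048: 100}  # assumed memory contents

in_order = run(ORIGINAL, [1, 2, 3, 4, 5, 6], initial)
hoisted = run(ORIGINAL, [1, 2, 4, 3, 5, 6], initial)  # 4 runs before 3

print(in_order[1032], hoisted[1032])
```

With these assumed values, program order stores 12 to location 1032, while the reordered run stores 100 there, because register 1 was overwritten before instruction 3 read it.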
Register renaming can eliminate this restriction by changing the names of some of the registers:
1. Load register 1 from memory location 1024
2. Add the number 2 to register 1
3. Store register 1 to memory location 1032
4. Load register 2 from memory location 2048
5. Add the number 4 to register 2
6. Store register 2 to memory location 2056
Now instructions 4, 5, and 6 can be executed in parallel with instructions 1, 2, and 3, so that the program can execute faster.
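The renaming shown above can be sketched in Python under simplifying assumptions. The `rename` and `run` helpers, the tuple encoding, and the initial memory contents are hypothetical. This sketch assigns a new register name only when a load overwrites a register, which reproduces the listing above; hardware renamers typically allocate a new physical register for every destination write.

```python
from itertools import count

# Each instruction is (operation, register, operand).
ORIGINAL = [
    ("load", 1, 1024),
    ("add", 1, 2),
    ("store", 1, 1032),
    ("load", 1, 2048),
    ("add", 1, 4),
    ("store", 1, 2056),
]


def rename(program):
    """Give each load a fresh register name; later uses follow the latest name."""
    fresh = count(1)  # assumed unbounded supply of new names: 1, 2, 3, ...
    current = {}      # original name -> name currently holding its value
    renamed = []
    for op, reg, operand in program:
        if op == "load":
            # A load overwrites the register without reading it,
            # so it can safely start using a new name.
            current[reg] = next(fresh)
        renamed.append((op, current[reg], operand))
    return renamed


def run(program, order, memory):
    """Execute the 1-based instruction numbers in `order`; return final memory."""
    regs = {}
    mem = dict(memory)
    for number in order:
        op, reg, operand = program[number - 1]
        if op == "load":
            regs[reg] = mem[operand]
        elif op == "add":
            regs[reg] = regs[reg] + operand
        elif op == "store":
            mem[operand] = regs[reg]
    return mem


initial = {1024: 10, 2048: 100}  # assumed memory contents
renamed = rename(ORIGINAL)

expected = run(ORIGINAL, [1, 2, 3, 4, 5, 6], initial)
interleaved = run(renamed, [1, 4, 2, 5, 3, 6], initial)  # two chains mixed

print(renamed)
print(interleaved == expected)
```

Because the two instruction chains no longer share a register name, interleaving them in this sketch yields the same memory contents as running the original program in order.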
When possible, the compiler performs this renaming. However, the compiler is constrained in many ways, primarily by the finite number of register names in the instruction set. Many high-performance microprocessors therefore rename registers in hardware to expose additional parallelism.
Typically, a different rename structure is required for each destination type in the microprocessor. For example, fixed point instructions that target general purpose registers (GPRs) require a rename structure that is different from the one used for floating point register (FPR) destinations. Together, these rename structures and their associated logic are usually very complicated and costly in terms of power and silicon area.