The present invention relates to an allocation technique for an architectural register in a system having one or more mapping tables that manage relations between architectural registers and physical registers.
A high-performance processor, such as the POWER7 (registered trademark) processor or the z/Architecture (registered trademark) EC12 processor of IBM, or the Sandy Bridge processor of Intel, includes physical registers in addition to the architectural registers visible to programmers and compilers. The number of physical registers (for example, 80 in the z/Architecture (registered trademark) EC12 processor) is larger than the number of architectural registers (for example, 16 in the z/Architecture (registered trademark) EC12 processor). Such a processor increases parallelism among instructions and improves performance by executing operations using the physical registers, which outnumber the architectural registers. Specifically, the processor reserves a physical register on a pipeline at the time of issuing an instruction and assigns the reserved physical register to the architectural register appearing in the destination operand of the instruction. The processor transfers the value of the physical register to the corresponding architectural register at the time of completing the instruction and then frees that physical register. This avoids false dependencies between instructions arising from the reuse of architectural registers in a program and enables the processor to execute instructions out of order.
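The effect of renaming on false dependencies can be illustrated with a minimal sketch. It is not taken from any of the processors named above. The function name `rename_destinations`, the instruction encoding, and the supply of one fresh physical register per instruction (with no freeing) are assumptions made purely for illustration.

```python
def rename_destinations(program):
    """Give every destination operand a fresh physical register.

    `program` is a list of (destination, sources) pairs that name
    architectural registers. Hypothetical encoding for illustration only;
    physical registers are never freed in this simplified model.
    """
    latest = {}    # architectural register -> physical register holding its newest value
    renamed = []
    for index, (dest, sources) in enumerate(program):
        # Sources read the newest in-flight value if one exists,
        # otherwise the architectural register itself.
        new_sources = tuple(latest.get(s, s) for s in sources)
        latest[dest] = f"p{index}"
        renamed.append((latest[dest], new_sources))
    return renamed


program = [
    ("r1", ("r2", "r3")),  # r1 = r2 op r3
    ("r4", ("r1",)),       # r4 = f(r1): true dependency on instruction 0
    ("r1", ("r5", "r6")),  # r1 = r5 op r6: reuses r1, a false dependency
    ("r7", ("r1",)),       # r7 = f(r1): true dependency on instruction 2
]
renamed = rename_destinations(program)
for dest, sources in renamed:
    print(dest, sources)
```

In this sketch the two writes to r1 receive different physical registers, so the third instruction no longer has to wait for the first two, while each reader still refers to the physical register written by its true producer.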
Mapping between architectural registers and physical registers is called register renaming and is performed by a register renaming mapper in the processor using a mapping table. One entry in the mapping table corresponds to one physical register. When every entry in the mapping table is in use, no physical register is available. In that state, the instructions placed in the pipeline cannot continue executing and a pipeline stall occurs, which reduces performance.
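The relation between a full mapping table and a pipeline stall can be modeled with the following toy sketch. The class `MappingTable`, its methods, and the four-entry size are hypothetical and chosen only for illustration. A real register renaming mapper is hardware and tracks considerably more state.

```python
class MappingTable:
    """Toy mapping table with one entry per physical register (illustrative only)."""

    def __init__(self, num_entries):
        # entries[i] holds the architectural register assigned to entry i, or None if free.
        self.entries = [None] * num_entries

    def rename(self, arch_reg):
        """Reserve a free entry for a destination register; None models a stall."""
        for phys, owner in enumerate(self.entries):
            if owner is None:
                self.entries[phys] = arch_reg
                return phys
        return None

    def complete(self, phys):
        """Free the entry when its instruction completes."""
        self.entries[phys] = None


table = MappingTable(num_entries=4)
# Four in-flight instructions that all write architectural register 1.
allocated = [table.rename(1) for _ in range(4)]
stalled = table.rename(2)      # every entry is in use: the pipeline would stall
table.complete(allocated[0])   # the oldest instruction completes
resumed = table.rename(2)      # an entry is available again
```

Under these assumptions the fifth rename request cannot be served until an earlier instruction completes and releases its entry.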
In traditional processor design, an entry in a mapping table can be assigned to any architectural register. However, as processor designs have become more complex in recent years, processors have emerged that restrict the architectural registers to which an entry can be assigned. In the present specification, a group of entries subject to the same assignment rule with respect to architectural registers is referred to as a “physical register management group.” The physical register management group can be regarded as a group that determines how the entries in a mapping table are to be used.
One example of such a processor has two physical register management groups, G0 and G1, in one mapping table. The physical register management group G0 manages the entries in the first half of the mapping table and assigns them to architectural registers whose least significant bit is 0. The physical register management group G1 manages the entries in the latter half of the mapping table and assigns them to architectural registers whose least significant bit is 1. Other designs are also possible, such as a processor that includes a plurality of mapping tables, each having one physical register management group, in which the physical register management groups manage their entries under mutually different assignment rules.
In a processor that has a plurality of physical register management groups across its mapping table or tables as a whole, the pipeline stall problem is more severe. Consider the case described above, in which one mapping table includes the two physical register management groups G0 and G1. If the architectural registers whose least significant bit is 1 are used frequently in a sequence of instructions, the latter half of the mapping table, managed by the physical register management group G1, becomes full. Even though entries in the first half are still vacant, no physical register is available for those architectural registers. As a result, the performance of the processor decreases.
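The G0/G1 case can be made concrete with a small sketch. The class `GroupedMappingTable`, the eight-entry table, and the use of the least significant bit of the architectural register number to select the group are assumptions for illustration. They are not a description of any particular processor.

```python
class GroupedMappingTable:
    """Toy mapping table split into two physical register management groups."""

    def __init__(self, num_entries):
        self.entries = [None] * num_entries   # None marks a free entry
        half = num_entries // 2
        # G0 owns the first half of the table, G1 the latter half.
        self.groups = {0: range(0, half), 1: range(half, num_entries)}

    def rename(self, arch_reg):
        """Reserve an entry from the group selected by the register's low bit."""
        group = arch_reg & 1   # assumed rule: bit 0 of the register number
        for phys in self.groups[group]:
            if self.entries[phys] is None:
                self.entries[phys] = arch_reg
                return phys
        return None            # the group is full: the pipeline would stall

    def free_count(self, group):
        """Number of vacant entries in the given group."""
        return sum(self.entries[p] is None for p in self.groups[group])


grouped = GroupedMappingTable(num_entries=8)
# A sequence of instructions that writes only odd-numbered registers fills G1.
odd_writes = [grouped.rename(r) for r in (1, 3, 1, 3)]
stalled = grouped.rename(5)    # G1 has no entry left
print(stalled, grouped.free_count(0))
```

In this model the request for register 5 fails although all four entries of G0 remain vacant, which is the situation described above.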
The following documents were found in a prior art search for the present invention.
Japanese Unexamined Patent Application No. 2011-181114 discloses a technique that, for a plurality of virtual registers appearing in a program part, assigns the same actual register to the same virtual register and different actual registers to mutually different virtual registers, and that assigns a register different from the actual register assigned to a variable whose live range extends across the program part in a source program.
Japanese Unexamined Patent Application No. 5-158707 discloses a technique in which, when an actual register in object code is allocated to a virtual register in intermediate code for each execution unit, a utilization that expresses the usage efficiency of the actual registers in numerical form is calculated for each execution unit, and the number of actual registers targeted for allocation is set in accordance with the utilization.
Japanese Unexamined Patent Application No. 5-20089 discloses a technique that sets up an actual register table indicating the usage conditions of the actual registers to be used in assembly processing and a virtual register table indicating the usage conditions of virtual registers with respect to the actual registers. When a register is specified in an assembler instruction, a processing device searches the virtual register table, checks the usage conditions of the actual registers on the basis of the information described in that table, assigns the actual registers to be used, and performs processing such as saving or restoring a register value already set in an actual register.
Japanese Unexamined Patent Application No. 2011-18120 discloses a technique relating to an information processing device implementing a register renaming scheme for managing a plurality of physical registers coordinated with a plurality of logical registers in conjunction with a renaming table. In the technique, a dedicated instruction is incorporated into an instruction set so that a physical register coordinated with a logical register designated by the dedicated instruction is released to be free and an optimization is performed to change the number of software available registers within the plurality of logical registers and the number of renaming registers within the plurality of physical registers in conformity with the software executing the instruction set.
The Japanese Unexamined Patent Applications discussed above disclose techniques relating to the allocation of physical registers. However, none of the techniques described in these documents deals with a processor having a plurality of physical register management groups across its mapping table or tables as a whole, and none of them can mitigate the decrease in performance caused by pipeline stalls that occur during execution when the physical registers become unavailable. “Coloring Heuristics for Register Allocation” by Briggs et al., 1989, is background art that discloses a method of determining live ranges of a plurality of registers.