1. Field of the Invention
The present invention relates to a technique for changing the order in which instructions are executed in an information processing apparatus that has an architecture with register windows and that employs the out-of-order instruction execution method. The order of instruction execution can be changed irrespective of register window switching.
2. Description of the Related Art
Some previous information processing apparatuses that employ a reduced instruction set architecture (RISC: Reduced Instruction Set Computer) have more than one register set (hereinafter called “register window”). This makes it unnecessary to save registers to, and restore them from, a memory stack when a subroutine is called or returns.
Register windows are linked in a ring-like form and are managed using register window numbers (hereinafter called “window numbers”). For example, eight register windows with window numbers 0 through 7 are provided, and they are used in the order of 0, 1, 2, . . . , 7. The number of the window in use is held in a register called the Current Window Pointer (hereinafter called “CWP”).
FIG. 5 is a diagram showing a construction of a ring-like register file A. In this register file A, register windows W0 through W7 are linked in a ring-like form, with adjacent windows overlapping each other. Each of the register windows W0 through W7 is formed by, for example, 32 registers of 64 bits each. Of the 32 registers, eight registers (Global registers; not illustrated) are common to all the register windows W0 through W7. The remaining 24 registers are divided into three areas (“ins”, “locals”, and “outs” in the drawing), each including eight registers.
That is, taking the register window W0 as an example, the eight registers W0ins at the left end overlap with the registers W7outs of the previous register window W7 and are shared with them (that is, the registers W0ins also function as the registers W7outs). The eight middle registers W0locals do not overlap with any other registers and hold data of their own. The eight registers W0outs at the right end are shared with the registers W1ins of the subsequent register window W1. The same applies to the register windows W1 through W7: the eight ins registers of each window are shared with the eight outs registers of the previous window; the eight locals registers do not overlap with any other registers and hold data of their own; and the eight outs registers are shared with the eight ins registers of the subsequent window.
A register file such as the register file A of FIG. 5 is called an overlapping register window.
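By way of illustration only, the sharing of the ins and outs areas between adjacent windows can be sketched in Python as a mapping from a window number, an area, and a register number to an index in a flat register file. The function name `physical_index` and the layout (globals first, then the locals areas, then the shared in/out areas) are assumptions made for this sketch; only the sharing pattern itself is taken from the description above.

```python
NUM_WINDOWS = 8      # windows W0..W7, as in the FIG. 5 example
REGS_PER_AREA = 8    # eight registers per area (globals, ins, locals, outs)

def physical_index(window, area, reg, n=NUM_WINDOWS):
    """Map (window, area, register 0..7) to an index in a flat register file.

    Hypothetical layout: globals, then n locals areas, then n shared
    in/out areas. Only the sharing pattern comes from the description.
    """
    if not 0 <= window < n:
        raise ValueError("window out of range")
    if not 0 <= reg < REGS_PER_AREA:
        raise ValueError("reg must be in 0..7")
    if area == "globals":
        return reg                           # common to all windows
    if area == "locals":
        return REGS_PER_AREA * (1 + window) + reg
    if area == "ins":
        bank = window
    elif area == "outs":
        bank = (window + 1) % n              # outs of Wk are the ins of Wk+1
    else:
        raise ValueError(f"unknown area: {area}")
    return REGS_PER_AREA * (1 + n + bank) + reg

# W0ins and W7outs name the same registers; so do W0outs and W1ins.
print(physical_index(0, "ins", 3) == physical_index(7, "outs", 3))
print(physical_index(0, "outs", 0) == physical_index(1, "ins", 0))
```

In this sketch the wrap-around from W7 to W0 falls out of the modulo operation, which mirrors the ring-like linkage of the windows.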
In this register file A, the register window switching instructions (hereinafter also simply called “switching instructions”), which switch the current register window among W0 through W7 as specified by the CWP, include a SAVE instruction, which increments the CWP, and a RESTORE instruction (RETURN instruction), which decrements the CWP. Hereinafter, the register window switching instructions are referred to as the “SAVE instruction” and the “RESTORE instruction”. Note that the CWP in FIG. 5 specifies the register window W0.
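As a minimal sketch of the CWP update, assuming that the ring-like linkage corresponds to arithmetic modulo the number of windows, the effect of the two switching instructions on the CWP may be written in Python as follows. The function names `save` and `restore` are illustrative only, and overflow or underflow handling of the windows is outside the scope of the sketch.

```python
NUM_WINDOWS = 8  # windows W0..W7, as in the FIG. 5 example

def save(cwp, n=NUM_WINDOWS):
    """SAVE: increment the CWP, wrapping from window n-1 back to window 0."""
    return (cwp + 1) % n

def restore(cwp, n=NUM_WINDOWS):
    """RESTORE: decrement the CWP, wrapping from window 0 back to window n-1."""
    return (cwp - 1) % n

cwp = 0              # FIG. 5 shows the CWP specifying W0
cwp = save(cwp)      # subroutine call: the current window becomes W1
cwp = restore(cwp)   # subroutine return: the current window is W0 again
print(cwp)
```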
The register file A is an example in which the number of windows n=8 (register windows W0 through W7). The register file A has a total of 136 registers: 8 (registers)×8 (windows)=64 (registers) for the local area; 8 (registers)×8 (windows)=64 (registers) for the in/out overlap area; and 8 registers for global registers (not illustrated). All 136 registers must be readable and writable.
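The count above can be checked with a few lines of Python. Extending the formula to a window count other than n=8 is an assumption of this sketch, and the function name `register_file_size` is illustrative.

```python
def register_file_size(n_windows, regs_per_area=8):
    """Total registers in an overlapping register window file."""
    local_regs = regs_per_area * n_windows     # one locals area per window
    overlap_regs = regs_per_area * n_windows   # one shared in/out area per window
    global_regs = regs_per_area                # common to all windows
    return local_regs + overlap_regs + global_regs

print(register_file_size(8))  # 136
```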
The speed and the size of the circuitry which reads data from such a large register file A have been issues to be solved, and thus, there has been developed an information processing apparatus 100 of FIG. 6. The information processing apparatus 100 includes: a Master Register File (MRF) 101; a Working Register File (WRF) 102; and an arithmetic operation unit 103, which has an execution unit (designated as “Execution unit” in the drawing) and a memory unit (designated as “Memory unit” in the drawing).
Generally speaking, an architecture with a large number of register windows needs a considerably large register file (eight windows need 136 registers), and thus it becomes difficult to supply operands to an arithmetic operation unit at high speed.
Therefore, as shown in FIG. 6, in addition to the MRF 101 which holds all the windows, a WRF 102, as a subset for holding a copy of data of the current register window specified by the CWP in the MRF 101, is provided. This WRF 102 supplies operands to the arithmetic operation unit 103. Since the WRF 102 only holds the window which is specified by the CWP, its capacity is 32 entries, which is smaller than that of the MRF 101. As a result, it is possible to enhance the speed of reading by the arithmetic operation unit 103.
However, in a construction such as that of the information processing apparatus 100, the WRF 102 holds the registers of only a single register window, namely the current register window specified by the CWP. Consequently, when a SAVE instruction or a RESTORE instruction is executed, a new problem arises: the operands needed by the subsequent instructions cannot be supplied from the WRF 102.
As a result, when a SAVE instruction or a RESTORE instruction is executed, the contents of the WRF 102 must be replaced with the values of the new window. Window (data) transfer processing from the MRF 101 to the WRF 102 therefore occurs, and execution of the subsequent instructions is stalled.
Further, in an information processing apparatus that changes the order of instruction execution under out-of-order instruction execution, instructions are executed as they become executable, irrespective of their order in the program. However, instructions that follow a SAVE instruction or a RESTORE instruction cannot be executed until the window selected by the SAVE/RESTORE operation has been transferred to the WRF 102, even if such instructions have become executable.
Such a limitation considerably degrades the performance of information processing apparatuses that employ the out-of-order instruction execution method and that issue a large number of instructions concurrently. In such apparatuses, a great number of instructions are fetched and accumulated in a buffer. Executable instructions are then executed from the buffer irrespective of their order in the program, whereby the throughput of instruction execution is improved.
Accordingly, the above-mentioned limitation that the order of instruction execution cannot be changed across SAVE instructions and RESTORE instructions means that the out-of-order processing mechanism does not function whenever a SAVE instruction or a RESTORE instruction appears, thereby causing significant deterioration in performance.
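The effect of this limitation can be illustrated with a deliberately simplified Python sketch of instruction selection from the buffer. The `Instr` record, its `ready` flag, and the `window_transfer_pending` parameter are hypothetical names introduced for illustration; the sketch ignores data dependencies and does not describe the circuitry of any actual apparatus.

```python
from dataclasses import dataclass

@dataclass
class Instr:
    name: str
    ready: bool                    # operands available (hypothetical flag)
    switches_window: bool = False  # True for a SAVE or RESTORE instruction

def issuable(buffer, window_transfer_pending):
    """Names of the instructions that may execute, given a buffer in program order."""
    picked = []
    for ins in buffer:
        if ins.ready:
            picked.append(ins.name)   # chosen irrespective of program order
        if ins.switches_window and window_transfer_pending:
            break                     # later instructions wait for the new window
    return picked

buffer = [
    Instr("add", ready=False),
    Instr("save", ready=True, switches_window=True),
    Instr("sub", ready=True),
    Instr("ld", ready=True),
]
print(issuable(buffer, window_transfer_pending=True))   # ['save']
print(issuable(buffer, window_transfer_pending=False))  # ['save', 'sub', 'ld']
```

With the transfer pending, the instructions behind the SAVE instruction stay in the buffer even though they are marked ready, which is the loss of out-of-order execution described above.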
In view of this disadvantage, as shown in FIG. 7, there has been developed an information processing apparatus 110 (for example, see the following patent document 1) in which the WRF 112 stores, in addition to data (G, L1, Io1, and Io2) of the current register window which is specified by the CWP, data (L2, Io3, L3, and Io4) of the register windows preceding and following the current register window (that is, registers of the register windows indicated by CWP+1 and CWP−1 are transferred beforehand). In this information processing apparatus, out-of-order execution is available with respect to instructions preceding and following SAVE instructions and RESTORE instructions.
In the above information processing apparatus 110, registers 113 (here, 8 registers×8 bytes) for latching data are interposed between the MRF 111 and the WRF 112 and are used when data is transferred from the MRF 111 to the WRF 112.
In the following description, the register window currently specified by the CWP is described as “CWP”, the register window following it (the register window after incrementing the current CWP) is described as “CWP+1”, and the register window preceding it (the register window after decrementing the current CWP) is described as “CWP−1”; the register windows W0 through W7 are sometimes described in these terms based on their relationship with the CWP. As shown in FIG. 8, the WRF 112 stores the registers of the current register window specified by the CWP: Ins of CWP (the same as Outs of CWP−1), Locals of CWP, and Outs of CWP (the same as Ins of CWP+1). It also stores Ins of CWP−1 and Locals of CWP−1, which are registers of the register window needed by instructions after execution of a RESTORE instruction, and Outs of CWP+1 and Locals of CWP+1, which are registers of the register window needed by instructions after execution of a SAVE instruction.
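The contents listed above can be enumerated in a short Python sketch. The function name `wrf112_areas` and the string labels are illustrative assumptions; the inclusion of the global registers follows the data “G” shown for the WRF 112 in FIG. 7.

```python
NUM_WINDOWS = 8
REGS_PER_AREA = 8

def wrf112_areas(cwp, n=NUM_WINDOWS):
    """Register areas held in the WRF 112 when the current window is `cwp`."""
    prev_w = (cwp - 1) % n   # "CWP-1", needed after a RESTORE instruction
    next_w = (cwp + 1) % n   # "CWP+1", needed after a SAVE instruction
    return [
        "globals",
        f"W{prev_w}.ins",
        f"W{prev_w}.locals",
        f"W{cwp}.ins",       # the same registers as the outs of CWP-1
        f"W{cwp}.locals",
        f"W{cwp}.outs",      # the same registers as the ins of CWP+1
        f"W{next_w}.locals",
        f"W{next_w}.outs",
    ]

areas = wrf112_areas(3)
total_registers = len(areas) * REGS_PER_AREA
print(areas)
print(total_registers)  # 64
```

Because each shared in/out area is listed only once, eight areas of eight registers suffice to cover three consecutive windows.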
In consequence, as shown in FIG. 9, according to the information processing apparatus 110, when the CWP indicates the register window W3, the WRF 112 holds data of the register windows W2 through W4 until the SAVE instruction is executed (see the double-headed arrow line “E” in the drawing). Thus, the arithmetic operation unit 103 is capable of executing the instructions of the register windows W2 through W4. Here, in FIG. 9, the double-headed arrow line “D” designates instruction decoding (Fetch, Issue cycle); the double-headed arrow line “E” designates instruction execution (Dispatch, Execute, Update Buffer cycle); and the double-headed arrow line “W” designates completion of execution of an instruction (Write back; Commit cycle).
After completion of the SAVE instruction, the CWP specifies the register window W4, and the information processing apparatus 110 transfers data of the register window W5 from the MRF 111 to the WRF 112 via the registers 113. As a result, the WRF 112 stores data of the register windows W3 through W5, and the arithmetic operation unit 103 can execute instructions of the register windows W3 through W5.
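The change in the contents of the WRF 112 caused by the SAVE instruction can be sketched in Python as a set difference. This is a simplified model under stated assumptions: the helper names are hypothetical, the global registers are omitted because they do not change, and each shared in/out area is keyed by the window whose ins it forms. The description above states only that data of the register window W5 is transferred; that the Ins of W5 are already present as the Outs of W4 is an inference from the overlap, not a statement of the source.

```python
NUM_WINDOWS = 8

def held_areas(cwp, n=NUM_WINDOWS):
    """Windowed areas resident in the working file for current window `cwp`.

    ("ins", k) stands for the area shared by the ins of Wk and the outs of Wk-1.
    """
    locals_held = {("locals", (cwp + d) % n) for d in (-1, 0, 1)}
    overlap_held = {("ins", (cwp + d) % n) for d in (-1, 0, 1, 2)}
    return locals_held | overlap_held

def transfer_on_save(cwp, n=NUM_WINDOWS):
    """Areas to fetch, and areas no longer held, after a SAVE from window `cwp`."""
    before = held_areas(cwp, n)
    after = held_areas((cwp + 1) % n, n)
    return sorted(after - before), sorted(before - after)

fetch, drop = transfer_on_save(3)
print(fetch)  # [('ins', 6), ('locals', 5)]: the Outs and Locals of W5
print(drop)   # [('ins', 2), ('locals', 2)]: the Ins and Locals of W2
```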
In this instance, in the information processing apparatus 110, the arithmetic operation unit 103 is capable of executing an instruction of the register window W4 prior to the SAVE instruction, and is also capable of executing an instruction of the register window W2 prior to the RESTORE instruction.
However, in the previous information processing apparatus 110, since the WRF 112 holds data of three register windows, a total of 72 registers are necessary: eight registers 113 for latching and 64 registers for the WRF 112. Thus, in comparison with the WRF 102 in the information processing apparatus 100 of FIG. 6, 40 registers must be added, which increases hardware resources.
Accordingly, the area (circuit area) of the selection circuits for reading out data to the WRF 112 and to the arithmetic operation unit 103 becomes large, and the reading of data from the WRF 112 by the arithmetic operation unit 103 becomes slower.
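The register counts quoted above can be reproduced with the following Python arithmetic. The variable names are illustrative; the figures themselves are those given in the description.

```python
REGS_PER_AREA = 8

wrf_102 = 4 * REGS_PER_AREA    # globals, ins, locals, outs of one window
wrf_112 = 8 * REGS_PER_AREA    # G, L1, Io1, Io2, L2, Io3, L3, Io4
latch_113 = 8                  # registers 113 for latching

total_110 = wrf_112 + latch_113
added = total_110 - wrf_102
print(wrf_102, total_110, added)  # 32 72 40
```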
[Patent Document 1] Japanese Patent Application Laid-open No. HEI 2003-196086