Processors of RISC (Reduced Instruction Set Computer) architecture (hereinafter referred to as “RISC processors”) center on register-register operations and attempt to accelerate processing by reducing memory accesses. This is referred to as a load-store architecture. To make the above-mentioned register-register operations more effective, RISC processors are provided with a large-capacity register file. One known form of such a register file is a register file with register windows, which is configured to reduce the overhead of passing arguments (saving and restoring arguments) when a subroutine is called.
FIG. 1 is a diagram illustrating the configuration of a register file with register windows.
The register file 1000 illustrated in FIG. 1 has one register window “W global” and eight register windows “W0-W7”, and the register windows W0-W7 are logically linked in a ring shape. Each register window Wk (k=0-7) is provided with three types of segments, “Wk out”, “Wk in”, and “Wk local”, each of which consists of eight registers. Wk local is provided with eight local registers unique to the respective register window. Wk in is provided with eight in-registers, and Wk out is provided with eight out-registers. Further, W global is provided with eight global registers that are commonly used by all the subroutines.
Wk out is used by a routine for passing arguments to a subroutine that it calls, and Wk in is used for receiving arguments from the parent routine that called it. The register file 1000 is configured such that Wk in overlaps Wk+1 out and Wk out overlaps Wk−1 in; thus, when a subroutine is called, the process of passing arguments and securing the registers used therefor can be accelerated. Wk local is used as a working register set by each subroutine, i.e., by a child routine called by the parent routine.
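The overlap described above can be sketched as an address mapping. The following is only an illustrative model, not the circuit of FIG. 1: the function name `physical_index` and the flat layout (globals first, then the in and local segments of each window) are assumptions introduced here. It follows the statement that Wk out shares storage with Wk−1 in, with window numbers taken modulo 8 because the windows form a ring.

```python
NUM_WINDOWS = 8   # W0-W7
SEG = 8           # registers per segment

def physical_index(k, segment, i):
    """Map (window k, segment, register i) to a flat storage index.

    Hypothetical layout: indices 0-7 hold W global, followed by
    (in, local) pairs for W0..W7. Wk out owns no storage of its own;
    it aliases the in segment of the neighbouring window W(k-1).
    """
    if not 0 <= i < SEG:
        raise ValueError("register index out of range")
    if segment == "global":
        return i
    k %= NUM_WINDOWS
    if segment == "out":
        # Wk out overlaps W(k-1) in (ring, so W0 out overlaps W7 in)
        k = (k - 1) % NUM_WINDOWS
        segment = "in"
    base = SEG + k * 2 * SEG
    if segment == "in":
        return base + i
    if segment == "local":
        return base + SEG + i
    raise ValueError("unknown segment: " + segment)
```

With this mapping, an argument written to `physical_index(1, "out", 3)` is read back at `physical_index(0, "in", 3)` without any copy, which is the saving the overlap is intended to provide.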
Each subroutine uses one of the eight register windows W0-W7 at runtime. The register window Wk used by the currently running subroutine (referred to as the “current window”) is configured to rotate by two segments in the clockwise direction (the direction indicated by the broken-line arrow labeled “SAVE”) every time a subroutine is called, and to rotate by two segments in the counter-clockwise direction (the direction indicated by the broken-line arrow labeled “RESTORE”) every time a return from a subroutine is made.
In the register file 1000, each register window Wk is managed by an assigned register window number (hereinafter referred to as a “window number”). The window number “k” is assigned to the register window Wk. The window number k of the register window Wk being used by the currently running subroutine is held by a CWP (Current Window Pointer). The value of the CWP is incremented by execution of a SAVE instruction or by occurrence of a trap, and is decremented by execution of a RESTORE instruction or by a return from a trap by a RETT instruction. In FIG. 1, the value of the CWP is “0”, and the CWP points at the register window W0. Instructions that switch the current window by incrementing or decrementing the value of the CWP, as described above, are referred to as “window switching instructions” in the present specification.
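The CWP behaviour described above can be modelled as a counter that wraps around the ring of eight windows. This is a minimal sketch under assumptions: the class name `WindowPointer` and its methods are hypothetical, traps and RETT are treated the same as SAVE and RESTORE respectively, and window overflow or underflow handling is not modelled.

```python
NUM_WINDOWS = 8  # W0-W7, linked in a ring

class WindowPointer:
    """Hypothetical model of the CWP (Current Window Pointer)."""

    def __init__(self, cwp=0):
        self.cwp = cwp % NUM_WINDOWS

    def save(self):
        # SAVE (or a trap) increments the CWP; the ring wraps W7 -> W0
        self.cwp = (self.cwp + 1) % NUM_WINDOWS
        return self.cwp

    def restore(self):
        # RESTORE (or RETT) decrements the CWP; the ring wraps W0 -> W7
        self.cwp = (self.cwp - 1) % NUM_WINDOWS
        return self.cwp
```

Starting from the state of FIG. 1 (CWP = 0), one `save()` selects W1 and a following `restore()` returns to W0; a further `restore()` wraps around to W7.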
The register file 1000 illustrated in FIG. 1 has one window W global. W global is a register set that stores the data shared by all the routines.
Each register window Wk is provided with 24(=8*3) registers, and the register window W global is provided with eight registers. Among those registers, 64(=8*8) registers of Wk in and Wk out overlap, and thus the total number of registers provided for the register file 1000 is 136(=24*8+8−64). In order for the functional unit of the processor to run a subroutine, it is necessary for the functional unit to be capable of reading and writing the data from/to all the registers of the register file 1000.
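The register count above can be checked in two ways: by the stated formula, and by enumerating distinct storage locations when each Wk out is treated as an alias of Wk−1 in. The function names below are hypothetical and the snippet is only a sanity check of the arithmetic.

```python
def count_registers(num_windows=8, segment_size=8):
    """Total registers by the formula 24*8 + 8 - 64."""
    per_window = 3 * segment_size          # in, local, out
    global_regs = segment_size             # W global
    overlapped = num_windows * segment_size  # in/out pairs counted twice
    return per_window * num_windows + global_regs - overlapped

def count_by_enumeration(num_windows=8, segment_size=8):
    """Count distinct storage cells, with Wk out aliasing W(k-1) in."""
    storage = set()
    for i in range(segment_size):
        storage.add(("global", 0, i))
    for k in range(num_windows):
        for i in range(segment_size):
            storage.add(("in", k, i))
            storage.add(("local", k, i))
            # Wk out occupies the same cell as W(k-1) in
            storage.add(("in", (k - 1) % num_windows, i))
    return len(storage)

print(count_registers())        # 136
print(count_by_enumeration())   # 136
```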
In such cases, the scale and speed of the circuit that reads data from such a large register file 1000 become a problem. In order to solve this problem, the arithmetic processing unit illustrated in FIG. 2 has been designed.
The arithmetic processing unit 2000 illustrated in FIG. 2 is comprised of a master register file 2001 (hereinafter, referred to as “MRF”), a working register file 2002 (hereinafter, referred to as “WRF”), an arithmetic section 2003, and a control section 2004.
Generally, a register file with register windows involves a larger number of registers as the number of register windows increases, and it accordingly becomes difficult to provide operands to the functional unit at high speed. For this reason, in addition to the MRF 2001, which is a register file holding all the windows, the WRF 2002 is provided as a subset of the MRF 2001, and data is read from this WRF 2002. The WRF 2002 holds only a copy of the window of the MRF 2001 indicated by the pointer CWP. As the WRF 2002 is small compared with the MRF 2001, data reading can be accelerated when the WRF 2002 reads the data according to READ_ADDRESS, i.e., the readout address provided from the control section 2004.
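The relationship between the MRF and the WRF can be sketched as follows. This is a simplified model under stated assumptions, not the unit of FIG. 2: the class name `WindowedRegisters` and its methods are hypothetical, each window is stored as an independent list of 24 registers (the overlap between adjacent windows and the global registers are omitted), and the MRF-to-WRF transfer is modelled as a plain copy at window switch.

```python
NUM_WINDOWS = 8
WINDOW_REGS = 24  # in + local + out, 8 registers each

class WindowedRegisters:
    """Hypothetical model: large MRF plus a small WRF for reads."""

    def __init__(self):
        # MRF: holds every window
        self.mrf = [[0] * WINDOW_REGS for _ in range(NUM_WINDOWS)]
        self.cwp = 0
        # WRF: copy of the single window selected by the CWP
        self.wrf = list(self.mrf[self.cwp])

    def switch_window(self, new_cwp):
        # A window switch requires transferring one window MRF -> WRF
        self.cwp = new_cwp % NUM_WINDOWS
        self.wrf = list(self.mrf[self.cwp])

    def read(self, read_address):
        # Operand reads only ever index the 24-entry WRF,
        # never the full MRF
        return self.wrf[read_address]
```

The read path selects among 24 entries rather than all registers of the MRF, which is the source of the speed-up; the copy in `switch_window` corresponds to the data transfer between the two register files.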
The arithmetic section 2003 is provided with a rename register ROB 2005 for the MRF 2001, and computational results are held there by renaming. Further, when an instruction is committed, the computational result is written back from the ROB 2005 to the MRF 2001 and the WRF 2002.
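The write-back at commit can be illustrated with a small model. All names here (`Core`, `execute`, `commit`) are hypothetical, the ROB is modelled as a simple in-order queue of pending results, and window switching is deliberately left out; the sketch only shows that a committed result is written to both the MRF and the WRF so that the two stay consistent.

```python
from collections import deque

NUM_WINDOWS = 8
WINDOW_REGS = 24

class Core:
    """Hypothetical model of result buffering and commit write-back."""

    def __init__(self):
        self.mrf = [[0] * WINDOW_REGS for _ in range(NUM_WINDOWS)]
        self.cwp = 0
        self.wrf = list(self.mrf[self.cwp])
        self.rob = deque()  # pending (register, value) results

    def execute(self, reg, value):
        # The result is held in the ROB; the architectural
        # register files are not modified yet
        self.rob.append((reg, value))

    def commit(self):
        # At commit, the oldest result is written back to both
        # the MRF and the WRF
        reg, value = self.rob.popleft()
        self.mrf[self.cwp][reg] = value
        self.wrf[reg] = value
```

Until `commit()` is called, a result produced by `execute()` is visible in neither register file.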
As described above, an arithmetic processing unit as illustrated in FIG. 2 has been designed; however, such a configuration incurs a hardware cost because a subset of the MRF is provided to hold a copy of one of the windows of the MRF. Furthermore, electric power is consumed by the data transfer between the MRF and the WRF.
When a processor is provided with an out-of-order execution function, the instructions are not necessarily executed in program order, and the instructions that are ready to be processed are executed first. Therefore, it is also desired to achieve a configuration in which the execution order can be swapped across a register window switching instruction.
Patent Document 1
    Japanese Laid-open Patent Publication No. 5-282147 “Register File”