1. Field of the Invention
The present invention relates to an arithmetic and logic processor such as a computer, and more particularly to a reduced instruction set computer.
2. Description of the Background Art
As one architecture of a VLSI (very large scale integrated circuit) processor, there is a scheme called the reduced instruction set computer (hereinafter referred to as "RISC"). The RISC is a computer in which, out of the instruction set of a prior-art computer, only the basic, frequently used instructions are implemented in hardware so as to speed up data processing. Various features of the RISC can be mentioned, and according to D. A. Patterson et al. in the article "A VLSI RISC COMPUTER", IEEE, Computer, Vol. 15, No. 9, Sept. 1982, pp. 8 to 22, the features are as follows:
(1) one-cycle instruction
(2) fixed length instruction format
(3) load/store architecture
(4) wired logic control
(5) overlap register window
(6) optimization of a pipeline by a compiler
The specific architecture of the RISC is described in detail in the above-mentioned reference. Now, the architecture of the RISC will be briefly explained.
In the RISC, in order to realize high speed processing, arithmetic and logic operations are limited to operations between registers, and each operation is basically executed within one cycle. Since only the basic, frequently used instructions are provided, the internal control circuitry is simplified and spare area becomes available on the chip. A register file is installed in this spare area. The general-purpose registers of the register file are used as the registers for operations. Only the load/store instructions access the memory. The instruction formats are shown in FIGS. 1A through 1E.
Referring to FIG. 1A, the instruction format of the standard type will be described. Every instruction of the standard type has a constant bit length (for example, 32 bits), and includes an instruction code (7 bits) designating the operation content, an SCC bit indicating whether a condition code (overflow, carry bit, zero bit, negative bit, extension bit or the like) is set or not, a destination (5 bits) indicating the data transfer destination, a source 1 region (5 bits) designating the register being the data transfer source, an IMM bit indicating whether an immediate (constant or the like) is used, and a source 2 region (13 bits) specifying any of a source register, an immediate and an offset.
When the IMM bit is "0", the instruction format is for an inter-register instruction, as shown in FIG. 1B. In this case, the destination Rd selects one of the registers assigned to the procedure under execution as the transfer destination of the result of an operation executed on the registers specified by the sources Rs and S2. When the IMM bit is "0", the lower 5 bits of the source 2 region, S2, give the information specifying a register.
When the IMM bit is "1", the instruction format is for an immediate instruction, as shown in FIG. 1C. In this case, the source S2 represents a signed constant.
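The field layout described above can be illustrated with a short decoding sketch. The field widths come from the description of FIG. 1A; the bit positions (fields packed from the most significant bit in the order listed) and the names decode and sign_extend are assumptions made only for illustration, not something the reference format is confirmed to use.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ sign_bit) - sign_bit


def decode(word):
    """Split a 32-bit instruction word into the fields of FIG. 1A.

    Assumed layout, most significant bit first:
    opcode(7) SCC(1) Rd(5) Rs(5) IMM(1) S2(13).
    """
    fields = {
        "opcode": (word >> 25) & 0x7F,
        "scc": (word >> 24) & 0x1,
        "rd": (word >> 19) & 0x1F,
        "rs": (word >> 14) & 0x1F,
        "imm": (word >> 13) & 0x1,
    }
    s2 = word & 0x1FFF
    if fields["imm"]:
        # IMM = 1: the 13-bit source 2 region is a signed constant (FIG. 1C).
        fields["constant"] = sign_extend(s2, 13)
    else:
        # IMM = 0: only the lower 5 bits select a register (FIG. 1B).
        fields["rs2"] = s2 & 0x1F
    return fields
```

With this assumed layout, a word whose source 2 region is all ones and whose IMM bit is "1" decodes to the constant -1, while the same region with IMM "0" selects register 31.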
For an instruction requiring a memory access (memory access instruction) as shown in FIG. 1D, the bits S1 of the source 1, i.e., Rx, specify an index register, and the bits S2 of the source 2 specify an offset. In this case, the destination field specifies the destination register Rd in a load operation, or the source register Rm in a store operation. Consequently, the register indirect mode is mainly employed as the addressing mode.
In the case of the relative jump instruction shown in FIG. 1E, the bits of the source 1, the IMM and the source 2 are combined into a field Y, which specifies the branch destination as a relative address of, for example, 19 bits with respect to the program counter.
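As a minimal sketch of the relative jump format, the 19-bit field (5 + 1 + 13 bits) can be extracted and added to the program counter. The text does not state whether the relative address is signed or in what unit it is counted; the sketch assumes a signed two's-complement byte offset held in the low 19 bits of the word, and the name jump_target is hypothetical.

```python
def jump_target(pc, word):
    """Branch destination for the FIG. 1E format.

    Assumptions (not stated in the text): Y occupies the low 19 bits of
    the word, is a signed two's-complement byte offset, and is added to
    a 32-bit program counter.
    """
    y = word & 0x7FFFF   # source 1 (5) + IMM (1) + source 2 (13) = 19 bits
    if y & 0x40000:      # sign bit of the 19-bit field
        y -= 0x80000
    return (pc + y) & 0xFFFFFFFF
```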
In the RISC as described above, only the necessary basic instructions that are frequently used are provided, each instruction word as shown in FIGS. 1A through 1E is unified to a fixed length of, for example, 32 bits, and each instruction is executed in one cycle. Consequently, the constitution of the control system is simplified and the chip area it occupies is significantly decreased.
An advanced control is adopted in the RISC, wherein prefetch of an instruction and execution of an instruction are overlapped in every cycle. In the advanced control of the RISC, the execution cycle of loading of two operands from general-purpose registers, the execution of operation and the store of an operation result in a general-purpose register is realized by a wired logic.
The RISC assumes programming in a high-level language. Since structured programming in a high-level language is widely used, the number of modules constituting a program becomes large. As a result, the procedure call instruction is executed more often. Since complicated instructions are excluded from the RISC, a subroutine is used when the same function as that of such a complicated instruction is required. This method also increases the number of procedure calls.
The procedure call is usually performed by the steps shown in FIG. 2. When some procedure calls a subroutine (SUB), if a parameter PARA must be transferred to the called procedure (SUB), a memory access is necessary. In this case, the storage address of the parameter to be delivered to the called procedure (SUB) is designated by an instruction DCA (PARA), and the parameter PARA is stored in the locations starting at the storage address. A memory access requires a longer time than an access to a register. In order to decrease this overhead, the constitution of the overlap register window is generally employed in the RISC. The overlap register window will now be described.
In the RISC, a number of general-purpose registers are installed, and a prescribed number of registers among the general-purpose registers are assigned to each procedure. A register group assigned to a procedure is called a register window.
FIG. 3A shows a constitution example of a register window which can be accessed by one procedure. The register window includes global registers at logic register numbers R0-R7, low registers at logic register numbers R8-R15, local registers at logic register numbers R16-R23, and high registers at logic register numbers R24-R31. The global registers store global variables used commonly in all procedures. The low registers and the high registers are regions where parameters are transmitted and received between procedures. The local registers are registers which an associated procedure uses dedicatedly. Now, operation will be described in the case that procedure A is executed and calls procedure B, and subsequently the procedure B calls procedure C as shown in FIG. 3B.
FIG. 4A shows a specific constitution of a register file. As shown in FIG. 4A, the register file is constituted by a global register region of physical register numbers 0-7 and local register regions of physical register numbers 8-119. The local register regions of physical register numbers 8-119 are divided into unit regions (windows) each comprising a low register region, a local register region and a high register region, and one unit region is assigned to each procedure. Now assume that procedure A shown in FIG. 3B is under execution and uses the regions of the physical addresses (physical register numbers) 0-7 of the register file as global registers and the physical addresses 8-31 as local registers. When the procedure A calls procedure B, the procedure A stores arguments (parameters p1, . . . p6) in the registers (high a) of the addresses 24-31 and then calls the procedure B. The called procedure B uses the addresses 0-7 of the register file as global registers and the addresses 24-47 as local registers. Subsequently, when the procedure B calls procedure C, the procedure B stores arguments (q1, . . . q6) in the registers of the numbers (addresses) 40-47 and then calls the procedure C. The called procedure C uses the addresses 0-7 of the register file as global registers and the addresses 40-63 as local registers. That is, taking the procedure B as an example, as shown in FIG. 4B, the high registers R24-R31 used by the procedure B are physically the same as the low registers R8-R15 used by the procedure C. Likewise, the low registers R8-R15 used by the procedure B are physically the same as the high registers R24-R31 used by the procedure A. Parameters can be transferred between the procedures through the registers that are physically the same. Consequently, the parameters need not be transferred through the memory, and the procedure can be called at high speed.
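The correspondence between logical and physical register numbers in this example can be sketched as follows. The stride of 16 physical registers per call follows from the addresses given above (windows at 8-31, 24-47 and 40-63); the function name physical_register and the numbering of windows from 0 are assumptions for illustration.

```python
GLOBALS = 8         # logical R0-R7 map to physical 0-7 in every window
WINDOW_STRIDE = 16  # each call advances the window by 16 physical registers


def physical_register(window, logical):
    """Map a logical register number (0-31) to a physical register number.

    `window` is 0 for procedure A, 1 for B and 2 for C in the FIG. 3B example.
    """
    if not 0 <= logical <= 31:
        raise ValueError("logical register number must be 0-31")
    if logical < GLOBALS:
        return logical                        # global registers are shared
    return logical + WINDOW_STRIDE * window  # low, local and high registers
```

For instance, the high register R24 of procedure A (window 0) and the low register R8 of procedure B (window 1) both map to physical register 24, which is the overlap used to pass the arguments.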
In the RISC as described above, when a procedure call occurs, the calling environment is not saved to a stack formed in the memory; the call is processed merely by moving the register window. Moreover, since the register windows partially overlap, the overlapped part is used as registers for the arguments to be delivered between procedures. In the example shown in FIG. 4B, the addresses 24-31 are commonly used by the procedure A and the procedure B, and the calling procedure stores the arguments therein.
Return from a procedure is performed similarly, and the arguments for the return are stored in the overlapped part at the lower side of the register window, and subsequently the register window is moved towards the lower address. That is, for example, in the case of return from the procedure B to the procedure A, the argument for the return is stored in the low registers R8-R15 of the procedure B, and subsequently the register window is moved to regions of the addresses 8-31 of the register file. Thereby the return to the procedure A is performed.
As described above, in the call/return of a procedure, the parameters can be transmitted and received without accessing the memory, so that the call/return of a procedure can be performed at high speed. In the RISC, such call/return processing is supported by hardware, providing higher speed processing.
In the RISC as well, the advanced control, in which instructions are executed in a pipeline, is often used. Since the instruction word length and the instruction cycle are made uniform, the advanced control shown in FIG. 5 is realized using a simple circuit constitution. That is, the prefetch of an instruction, the instruction decode and the instruction execution are performed in an overlapping fashion. In general, a write cycle is performed after an execute cycle, and the operation result is stored in a register. However, the write cycle is omitted in the figure. In the advanced control, an execution cycle comprising the load of two operands from the general-purpose registers, the execution of the operation thereon and the store of the operation result in a general-purpose register is realized by wired logic.
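The overlapping of prefetch, decode and execution can be illustrated with a small timing sketch. It assumes an ideal three-stage pipeline in which one instruction enters per cycle and no stalls occur, and it omits the write cycle as the figure does; the names STAGES, pipeline_schedule and total_cycles are hypothetical.

```python
STAGES = ("fetch", "decode", "execute")


def pipeline_schedule(n_instructions):
    """Return {cycle: [(instruction index, stage), ...]} for an ideal pipeline."""
    schedule = {}
    for i in range(n_instructions):
        for s, stage in enumerate(STAGES):
            # Instruction i enters the pipeline in cycle i and advances
            # by one stage per cycle.
            schedule.setdefault(i + s, []).append((i, stage))
    return schedule


def total_cycles(n_instructions):
    """Cycles needed to finish n instructions when one enters per cycle."""
    if n_instructions == 0:
        return 0
    return n_instructions + len(STAGES) - 1
```

Under these assumptions, in cycle 2 the first instruction is executed while the second is decoded and the third is fetched, and four instructions complete in six cycles rather than twelve.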
In the register file constitution shown in FIG. 4A, 120 general-purpose registers are prepared, and therefore six register windows can be installed. Consequently, the overhead of saving a calling environment to the stack does not arise in the computer before the sixth procedure call. However, if procedure calls are nested six times or more, no empty window remains in the register file, and a save operation becomes necessary. When the procedure calls become deep and go beyond the register file, a trap is usually generated. Data overflowing from the register file is saved to the memory by an OS (operating system). That is, in the RISC, when the register file is filled up and no further register window can be produced (this state is called overflow of the register file), the trap routine for saving the oldest register window to the memory is started.
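The figure of six windows can be checked with a short calculation. The sketch assumes that the windows are laid out linearly without wrapping around the end of the register file, that each window holds 24 registers, and that adjacent windows overlap by 8 registers, as in the example of FIG. 4A; the function name window_count and its parameter names are hypothetical.

```python
def window_count(total_registers=120, global_registers=8,
                 window_size=24, overlap=8):
    """Number of overlapping windows that fit in the local register region."""
    local = total_registers - global_registers  # 112 registers (8-119)
    stride = window_size - overlap              # 16 new registers per call
    if local < window_size:
        return 0
    # The first window needs window_size registers; each further window
    # needs only `stride` additional registers.
    return (local - window_size) // stride + 1
```

With the default values this gives (112 - 24) // 16 + 1 = 6 windows, occupying physical registers 8-111.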
When a procedure return occurs, the corresponding window may likewise not exist in the register file because the register window has been saved to the memory (this state is called underflow of the register file). In this case also, a trap routine is started, which restores the register window from the memory. To start the trap, for example, a dedicated register is provided to monitor the usage condition of the register windows. When the bit in that register is "1", indicating that the register window to be used next is in use, the overflow trap or underflow trap routine is started and the contents of the registers are saved or loaded.
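The overflow and underflow behavior can be sketched with a toy model that tracks only which windows are resident. It assumes six windows, that an overflow trap saves exactly one (the oldest) window, and that an underflow trap restores exactly one; the overlap between windows and the register contents are not modeled, and the class WindowFile and its members are hypothetical names.

```python
class WindowFile:
    """Toy model of a register file holding at most `capacity` windows."""

    def __init__(self, capacity=6):
        self.capacity = capacity
        self.resident = [0]  # call depths whose windows are in the register file
        self.saved = []      # call depths whose windows were saved to memory
        self.depth = 0       # current call depth
        self.traps = []      # log of (trap kind, call depth) events

    def call(self):
        if len(self.resident) == self.capacity:
            # Overflow trap: save the oldest window to memory.
            oldest = self.resident.pop(0)
            self.saved.append(oldest)
            self.traps.append(("overflow", oldest))
        self.depth += 1
        self.resident.append(self.depth)

    def ret(self):
        if self.depth == 0:
            raise RuntimeError("return without a matching call")
        self.resident.pop()
        self.depth -= 1
        if not self.resident:
            # Underflow trap: restore the caller's window from memory.
            restored = self.saved.pop()
            self.resident.append(restored)
            self.traps.append(("underflow", restored))
```

In this model, five nested calls from the initial procedure cause no trap, the sixth call saves the window of the initial procedure, and the return that leads back to the initial procedure restores it. Each trap stands for the slow memory accesses discussed in the text.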
In general, a memory operates slowly in comparison to the CPU (central processing unit) of a computer. Consequently, in the RISC, when the overflow or the underflow of the register file occurs, only a performance comparable to that of a conventional computer can be obtained because of the processing for saving and restoring the variables (arguments). That is, the high speed operation ability that is one feature of the RISC is deteriorated. In general, the RISC is fabricated as a VLSI, and in the case of a VLSI a decrease in the chip size contributes to a reduction in cost. On the other hand, in order to prevent the overflow or the underflow of the register file, as many registers as possible must be installed on the chip. That is, the reduction of the cost and the improvement of the performance are in a trade-off relation.