1. Field of the Invention
The present invention relates to a high performance computer using a reduced number of commands to be executed.
2. Description of the Background Art
A computer realizes various processes by converting an arbitrary procedure programmed in software into a series of commands described in the command language executed by the computer, and executing this series of commands. The MIPS R3000 computer (hereinafter called R3000) developed by MIPS is a known computer. The R3000 is specifically described in "mips RISC ARCHITECTURE" by Gerry Kane.
FIG. 14 is a block diagram showing the structure of a central processing unit (CPU) of the R3000. In the diagram, reference numeral 1 is a register file, 2 is an arithmetic and logic unit (ALU), 3 is an address adder, 4 is a program counter, and 5 is a control circuit (including a command decoder). Shift circuits, multiplying circuits and all others relating to operation are included in the ALU 2.
Reference numerals 6a and 6b are stored data of registers read out from the register file 1, 7a and 7b are immediate values outputted from the control circuit 5, 8 is an operation result of the ALU 2, 9 is output data of the program counter 4, that is, an address value of a command memory, CMD is an output signal of the command memory, that is, a command for operating this computer, and 11 is an operation result of the address adder 3, that is, an address value of a data memory. Reference numeral 18 is address data showing the number of a write register described in a command word, and 16a and 16b are address data showing the numbers of read registers described in each command word.
The control circuit 5 receives a command CMD, and outputs control data 7a and 7b to the ALU 2 and address adder 3 according to the command CMD, and also outputs read register address signals 16a and 16b and a write register address signal 18 to the register file 1.
The register file 1 outputs the register data 6a to the ALU 2 and address adder 3 according to the read register address signal 16a, and outputs the register data 6b to an external memory (not shown) according to the read register address signal 16b. If a write register is designated by the write register address signal 18, the ALU operation result 8 is written into that write register.
The ALU 2 adds the register data 6a and control data 7a, and outputs the result of the addition, that is, the ALU operation result 8, to the register file 1.
The address adder 3 adds the register data 6a and control data 7b, and outputs the result of the addition, that is, the address addition result 11, to the external memory. This address addition result 11 is the address to be accessed in the external memory.
The program counter 4 sequentially increases and outputs the program count value 9 when the control signal 23 is "L".
The R3000 reads the command CMD by using the program count value 9 as an address value, interprets the command CMD in the control circuit 5, generates the necessary signals, and gives the signals to the corresponding blocks to execute the processing.
FIG. 15 is an explanatory diagram showing an internal structure of the register file 1. As shown in the diagram, memory cells MC are formed in a matrix, and the memory cells MC of each row form registers R0 to R31, corresponding to $0 to $31. Access to these memory cells MC is effected under control of decoding circuits 600 to 602.
The decoding circuit 600 receives a write register address signal 18, and selectively asserts plural write register selection lines 19 according to the write register address signal 18. The decoding circuit 601 receives the read register address signal 16a, and selectively asserts plural read register selection lines 20a according to the read register address signal 16a. The decoding circuit 602 receives a read register address signal 16b, and selectively asserts a read register selection line 20b according to the read register address signal 16b.
FIG. 16 is a circuit diagram showing an internal structure of a memory cell MC. As shown in the diagram, a memory unit 21 includes a loop connection of inverters G1 and G2, and an NMOS transistor Q1 is inserted between the input unit of the inverter G1 and a write signal line L8 in which one-bit information of ALU operation result 8 is obtained. The output of the inverter G1 is connected to the input of an inverter G3, and an NMOS transistor Q2 is inserted between the output unit of the inverter G3 and a register data line L6a in which one-bit information of register data 6a is outputted. The output of the inverter G1 is also connected to the input of an inverter G4, and an NMOS transistor Q3 is inserted between the output unit of the inverter G4 and a register data line L6b in which one-bit information of register data 6b is outputted. To the gate of the transistor Q1, a write selection line L1 is connected, and a read selection line L2a is connected to the gate of the transistor Q2, and a read selection line L2b is connected to the gate of the transistor Q3. To the write selection line L1, the write register selection line 19 is connected, and the read register selection line 20a is connected to the read selection line L2a, and the read register selection line 20b is connected to the read selection line L2b.
Therefore, when the write register selection line 19 becomes H, the transistor Q1 is turned on, and the one-bit information of the ALU operation result 8 obtained from the write signal line L8 is written into the memory unit 21; when the read register selection line 20a becomes H, the transistor Q2 is turned on, and the stored data in the memory unit 21 is outputted as register data 6a through the register data line L6a; and when the read register selection line 20b becomes H, the transistor Q3 is turned on, and the stored data of the memory unit 21 is outputted as register data 6b through the register data line L6b.
In this register file 1, the write register selection line 19 is selectively asserted based on the write register address signal 18, and the ALU operation result 8 is written into the memory cell MC of register Ri (i=0 to 31) connected to the asserted write register selection line 19; the read register selection line 20a is selectively asserted based on the read register address signal 16a, and register data 6a is outputted from the memory cell MC of register Ri connected to the asserted read register selection line 20a; and the read register selection line 20b is selectively asserted based on the read register address signal 16b, and register data 6b is outputted from the memory cell MC of register Ri connected to the asserted read register selection line 20b.
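The selection behavior of the decoding circuits 600 to 602 can be illustrated with a short sketch. It is only a behavioral model: the function names `decode` and `access` are hypothetical, and the assumption that both reads take place before the write within one access is made for illustration, not taken from the figures.

```python
# Behavioral sketch of decoding circuits 600-602 (names are illustrative).
NUM_REGISTERS = 32

def decode(address):
    """Return 32 selection lines with only line `address` asserted ("H")."""
    if not 0 <= address < NUM_REGISTERS:
        raise ValueError("register address must fit in 5 bits")
    return [i == address for i in range(NUM_REGISTERS)]

def access(cells, write_addr, read_addr_a, read_addr_b, alu_result):
    """One register file access.

    Reads register data 6a and 6b, then writes ALU operation result 8.
    `write_addr` may be None when the command writes no register.
    Reading before writing is an assumption of this sketch.
    """
    lines_20a = decode(read_addr_a)      # decoding circuit 601
    lines_20b = decode(read_addr_b)      # decoding circuit 602
    data_6a = next(c for c, sel in zip(cells, lines_20a) if sel)
    data_6b = next(c for c, sel in zip(cells, lines_20b) if sel)
    if write_addr is not None:
        lines_19 = decode(write_addr)    # decoding circuit 600
        for i, selected in enumerate(lines_19):
            if selected:
                cells[i] = alu_result
    return data_6a, data_6b
```

Each call to `decode` asserts exactly one of the 32 lines, which corresponds to exactly one register Ri being connected to the data lines at a time.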
In this way, the register file 1 includes a data memory device group in a bit width that can be processed by a processor existing inside the computer. In the case of the register file 1 of the R3000, there are 32 registers of 32-bit width, and they are numbered from $0 to $31 to be distinguished. In the specification, hereinbelow, the x-th register in a group of 32 registers is expressed as $x.
In a recent computer, data can generally be read from and written to these registers freely; in the R3000, however, the register $0 is specified as a zero register, which cannot be written and which always reads out the value 0, as determined in the hardware.
Incidentally, $31 is called a link register, which is designated to store the return destination address used when returning after execution of a branch command. Besides, $29 is designated as the stack pointer: it stores the address on an external memory element where register data is saved in advance, in case the values of registers in the CPU are changed by the commands at the branch destination when a branch command is executed.
However, the manner of use of $31 and $29 is determined only by convention; they can be used for storing other data, and no problem is caused if they are used otherwise. Accordingly, to use $29, it is necessary to specify its number in the command word. As for $31, it is basically necessary to specify its number in the command word, but it can also be used without being specified in the command word.
Unlike the R3000, there are other computers in which the registers can be used for specific commands only. In such computers, the read and write registers are determined by the commands, so the names of the registers are not described in the command word.
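The behavior of the zero register can be sketched as follows. The class name `RegisterFile` and its methods are hypothetical, and modeling a write to $0 as silently discarded is an assumption; the text states only that data cannot be written to $0 and that 0 is always read out.

```python
class RegisterFile:
    """Sketch of 32 registers of 32-bit width in which $0 always reads as 0."""

    def __init__(self):
        self._regs = [0] * 32

    def read(self, number):
        # $0 is fixed to 0 regardless of any earlier write attempt.
        return 0 if number == 0 else self._regs[number]

    def write(self, number, value):
        # Assumption: a write to $0 is simply ignored.
        if number != 0:
            self._regs[number] = value & 0xFFFFFFFF  # keep 32-bit width


rf = RegisterFile()
rf.write(0, 123)     # has no effect
rf.write(30, 1024)
```

After these calls, `rf.read(0)` still returns 0 while `rf.read(30)` returns 1024, which is why adding an immediate value to $0 can be used to load a constant into another register.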
In a recent computer, software called a compiler is essential, and the existing computers of high speed and high performance are realized by the hardware of a computer and the software of a compiler. The compiler is a kind of software which shuffles the commands in an order easy to execute by the computer, or makes efficient use of the operation units of the computer. For the compiler, the efficiency is high when the computer incorporates many registers so that the data can be moved among registers of the computer. Therefore, instead of limiting the registers by commands, it is convenient if various registers can be utilized by various commands. In most recent high performance computers, hence, registers are not fixed by commands.
In a computer having registers fixed by commands, the degree of complexity increases, and only up to eight registers can be installed, and in computers having 16 or more registers (32 in most recent ones), registers are not fixed by commands. Accordingly, the number (or name) of the register to be used is specified in the command word. Designation of register by command is more advantageous for a semiconductor integrated circuit because the register file can be composed of memory elements or the like.
The address adder is an adder for calculating the address of an external memory element when executing a load command (a command of reading data of an external memory element of CPU into a register in the CPU) or store command (a command of writing data of a register in the CPU into an external memory element of CPU). Besides, the ALU, program counter, and control circuit are ordinary circuits necessary in a computer, and are not particularly explained herein.
Described below are examples of execution of various commands by the R3000 representing the conventional computers. FIG. 17 shows command word and machine language of a command for executing a call-return statement (a command of branching into arbitrary series of commands and returning to an original series of commands) by the R3000. A machine language is an expression of a command word in a series of binary numerals so as to be understood by the computer, and in a recent high performance computer, one command has a fixed length of 32 bits.
In FIG. 17, command 1 calls a series of commands A (composed of command 10 to command 16) by using the command "jal". When processing of the series of commands A is over, control returns to command 2, and command 2 is then executed. Such processing is called a call-return process. In command 1, the address value of command 10, or 10 in this case, is set in the program counter. In the machine language of command 1, the first six bits express the branch command "jal", and the remaining 26 bits (the subsequent five bits, the next five bits, and the final 16 bits) express the branch destination address 10. This "10" is set in the program counter 4.
In the "jal" command, it is necessary to return to the next command after completion of the execution of the series of commands A at the branch destination, and the address value of command 2, that is, 2 is set in the register $31 tacitly in the hardware. In the command word or machine language, nothing is required to be specified. In the "jal" command, it is tacitly known that the value adding 1 to the present program count value (1 showing the address of command 1) (that is, 2 showing the address of command 2) is written in $31, so that it is not necessary to describe particularly in machine language. It corresponds to the special case of using the register $31 without specifying it after the command.
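The implicit use of $31 by "jal" can be written as a small function. This is a sketch in the simplified addressing of the text, where addresses count commands and the return address is the present program count value plus 1; the function name `jal` and its return convention are illustrative.

```python
def jal(program_count, target):
    """Return (new program count value, value written implicitly to $31)."""
    link = program_count + 1   # address of the command following the call
    return target, link


new_count, link_value = jal(1, 10)   # command 1 calls the series at address 10
```

Here `new_count` is 10 and `link_value` is 2, matching the description that the address of command 2 is written in $31 without being named in the command word.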
The computer executes command 10 after command 1. Command 10, "addi", is an addition command, and it shows that (-6) is added to the stored data of register $29 and the result is written into register $29. In the machine language of command 10, the first six bits express the command "addi", the subsequent five bits represent register $29 to write in, the next five bits represent register $29 to read out, and the final 16 bits represent the addition data (-6). In the ordinary "addi" command, the read register and write register are different, and two registers must be specified; in this case, too, the register $29 must be specified twice in the machine language.
Command 11 is a store command, which means that the stored data of register $31 is stored at an address of an external memory of the CPU, the address being the sum of the stored data of register $29 and 5. This command is necessary because command 13, to be executed later, is a "jal" command like command 1, which unconditionally writes the return destination address in register $31; the present stored data of register $31 must therefore be saved in the external memory beforehand. In the machine language of command 11, the first six bits express the store command, the next five bits denote the register number $29 necessary for address calculation, the subsequent five bits represent the register number $31 holding the data to be stored outside, and the final 16 bits refer to the data 5 to be added.
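The 6 + 5 + 5 + 16 bit layout described for command 10 can be illustrated by packing the fields into one 32-bit word. The sketch below is not taken from FIG. 17: the function name is hypothetical, the opcode value 001000 is the "addi" opcode of the MIPS I instruction set, and the order of the two register fields is an assumption that does not affect this example because both fields are $29.

```python
def encode_i_type(opcode, reg_field_1, reg_field_2, immediate):
    """Pack a 6-bit opcode, two 5-bit register numbers and a 16-bit immediate."""
    if not 0 <= opcode < 64:
        raise ValueError("opcode must fit in 6 bits")
    if not (0 <= reg_field_1 < 32 and 0 <= reg_field_2 < 32):
        raise ValueError("register numbers must fit in 5 bits")
    if not -(1 << 15) <= immediate < (1 << 15):
        raise ValueError("immediate must fit in 16 signed bits")
    return ((opcode << 26) | (reg_field_1 << 21) | (reg_field_2 << 16)
            | (immediate & 0xFFFF))   # two's complement for negative values


ADDI_OPCODE = 0b001000
word = encode_i_type(ADDI_OPCODE, 29, 29, -6)
print(f"{word:032b}")   # 001000 11101 11101 1111111111111010 without spaces
```

The register number 29 ("11101") appears in both five-bit fields, which is the duplication the text points out.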
In this way, in the computer, in order to save the data of a register to an external memory, a register for specifying the address of a vacant space on the memory to save in is required, and it corresponds to register $29 in the R3000. In this example, only the stored data of register $31 is saved, but the number of registers to be saved is arbitrary depending on the number of registers to be used by the series of commands.
It may be considered that command 12 and command 13 are intrinsic commands of the series of commands A, while command 10 and command 11 are preparations for executing command 12 and command 13. Command 13 is a "jal" command, and after executing the necessary processing at the branch destination, control returns to command 14. Command 14 is a load command: the value obtained by adding 5 to the stored data of register $29 is used as the address for reading data from the external memory, and the data read out is transferred and written into the register $31. That is, the return destination address is set in the register $31 so that the series of commands A may return to the calling command. Command 15 means that 6 is added to the stored data of register $29. This returns register $29 to its stored data before the series of commands was called, that is, returns the vacant space to the initial state.
Command 16 instructs that the address value indicated by register $31 is set to the program counter, and branched. In this example, the stored data value of the register $31 is 2, that is, the address value of command 2, and control branches to command 2. That is, control returns to the next command after command 1 which called the series of commands A.
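The whole call-return sequence can be traced with a small interpreter. This is a sketch under stated assumptions: the command tuples are an illustrative encoding rather than the machine language of FIG. 17, the initial value 100 of $29 is hypothetical, the destination of command 13 (a one-command series at address 20) is invented for the example, and command 15 is taken to add back the 6 subtracted by command 10.

```python
def run(program, start, stop, stack_pointer=100):
    """Execute commands from `start` until the program count equals `stop`."""
    regs = [0] * 32
    regs[29] = stack_pointer        # hypothetical initial stack pointer
    memory = {}
    trace = []
    pc = start
    while pc != stop:
        trace.append(pc)
        op, *args = program[pc]
        next_pc = pc + 1
        if op == "jal":             # call: link in $31, jump to target
            regs[31] = pc + 1
            next_pc = args[0]
        elif op == "addi":          # write = read + immediate
            write, read, imm = args
            regs[write] = regs[read] + imm
        elif op == "sw":            # memory[base + offset] = source register
            source, base, offset = args
            memory[regs[base] + offset] = regs[source]
        elif op == "lw":            # destination register = memory[base + offset]
            dest, base, offset = args
            regs[dest] = memory[regs[base] + offset]
        elif op == "jr":            # jump to the address held in a register
            next_pc = regs[args[0]]
        pc = next_pc
    return trace, regs, memory


program = {
    1: ("jal", 10),
    10: ("addi", 29, 29, -6),
    11: ("sw", 31, 29, 5),
    12: ("nop",),
    13: ("jal", 20),               # hypothetical nested call
    14: ("lw", 31, 29, 5),
    15: ("addi", 29, 29, 6),
    16: ("jr", 31),
    20: ("jr", 31),                # hypothetical called series
}
trace, regs, memory = run(program, start=1, stop=2)
```

In this trace the commands execute in the order 1, 10, 11, 12, 13, 20, 14, 15, 16, after which the program count is 2, $31 again holds 2, and $29 is back at its initial value.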
Below is specifically described the processing method of command 10 and command 11 in FIG. 17 by the conventional computer shown in FIG. 14.
1) When processing the "addi" command of command 10 in FIG. 17
The control circuit 5 decodes the command CMD, sets "11101" as write address signal 18, and "11101" as the read register address signal 16a, and sets the immediate value -6 to be added as control data 7a. The register file 1 outputs the stored data (D29) of register $29 as the register data 6a, on the basis of the read register address signal 16a and write register address signal 18, and specifies the register $29 as write register. Consequently, the ALU 2 outputs the ALU operation result 8 obtained by adding the register data 6a (D29) and control data 7a ("-6") to the register file 1, so that the ALU operation result 8 is written in as the stored data value of register $29 in the register file 1.
2) When processing "sw" command of command 11 in FIG. 17
The control circuit 5 decodes the command CMD, sets "11101" as the read register address signal 16a, and "11111" as the other read register address signal 16b, and sets the immediate value 5 to be added as control data 7b. The register file 1, on the basis of the read register address signals 16a and 16b, outputs the stored data value (D29) of register $29 to the address adder 3 as register data 6a, and outputs the stored data value (D31) of register $31 to the external memory as register data 6b.
The address adder 3 outputs the address addition result 11 obtained by adding the register data 6a (D29) and control data 7b ("5") to the external memory. As a result, the value stored in register $31 is written in the external memory having an address equal to the address addition result 11.
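The two processing steps above can be followed signal by signal in a short sketch. Variable names carry the reference numerals of FIG. 14 for readability; the function names and the initial register contents (100 in $29, 2 in $31) are hypothetical values chosen for illustration.

```python
def execute_addi(regs, signal_18, signal_16a, data_7a):
    """'addi': ALU 2 adds register data 6a and control data 7a."""
    data_6a = regs[signal_16a]        # register data 6a from the read register
    result_8 = data_6a + data_7a      # ALU operation result 8
    regs[signal_18] = result_8        # written into the write register
    return result_8

def execute_sw(regs, memory, signal_16a, signal_16b, data_7b):
    """'sw': address adder 3 forms the address, register data 6b is stored."""
    data_6a = regs[signal_16a]        # base value for the address
    data_6b = regs[signal_16b]        # data sent to the external memory
    address_11 = data_6a + data_7b    # address addition result 11
    memory[address_11] = data_6b
    return address_11


regs = [0] * 32
regs[29], regs[31] = 100, 2           # hypothetical contents D29 and D31
memory = {}
result = execute_addi(regs, 0b11101, 0b11101, -6)           # command 10
address = execute_sw(regs, memory, 0b11101, 0b11111, 5)     # command 11
```

With these assumed values, command 10 leaves 94 in $29 and command 11 stores the content of $31 at address 99.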
FIG. 18 is a diagram showing the commands for executing a "for" statement (a statement that repeatedly executes the same command row a specified number of times) by the R3000, together with the command code and its machine language. FIG. 18 shows an example of a "for" statement that repeats the command given as command 3 1024 times.
First, command 1 sets 1024 as the number of repetitions in the first register. In command 1, "addi" is an addition command, and it shows that 1024 is added to the stored data value of register $0 to be written in $30. In register $0, the value is always 0 as mentioned above. In the machine language of command 1, the first six bits represent the addition command "addi", the next five bits denote register $30, the next five bits denote $0, and the final 16 bits express 1024.
In command 2, similarly, "1" is set as the stored data of register $28, meaning the first repetition. Command 3 is an intrinsic command (processing) to be repeatedly executed by the "for" statement; there is only one command 3 in this example, but about 64,000 (2^16) commands are possible.
In command 4, after executing the necessary processing (command 3 herein), the value 1 is added to the stored data value of register $28, and the result is set as the stored data value of register $28. The read register and write register are the same, and therefore the command may be described as shown in parentheses. In the machine language, however, since it is the same command as the ordinary "addi" command, the first six bits represent the addition command "addi", the next five bits represent the read register $28, the next five bits express the write register $28, and the final 16 bits express the addition data 1. Thus, in the machine language, $28 must be described twice.
In command 5, the stored data value of register $28 and the stored data value of register $30 are compared, and if they do not coincide, control returns to the command two addresses back from the presently running address (that is, the address of command 3). In the machine language of command 5, as in the machine language of command 1, the first six bits express the comparison branch command "bne", the next five bits denote register $28, the next five bits denote register $30, and the final 16 bits express -2.
In command 5, until the stored data value of register $28 becomes 1024, the stored data value of register $28 and the stored data value of register $30 do not coincide. In command 4, the stored data value of register $28 increases by one each time, and therefore the stored data value of register $28 does not reach 1024 unless command 4 is executed 1024 times from command 3. When the stored data value of register $28 becomes 1024, in command 5, the stored data value of register $28 and the stored data value of register $30 coincide, so that the processing is transferred to command 6 without branching to command 3. Hence, the repeated command of the "for" statement is executed.
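The field layout of the "bne" machine language can be checked by unpacking a 32-bit word, including the sign extension that turns the final 16 bits into -2. The sketch below is illustrative: the function name is hypothetical, the bit pattern is built by hand from the fields named in the text rather than copied from FIG. 18, and 000101 is the "bne" opcode of the MIPS I instruction set.

```python
def decode_i_type(word):
    """Split a 32-bit word into opcode, two register fields and immediate."""
    opcode = (word >> 26) & 0x3F
    reg_field_1 = (word >> 21) & 0x1F
    reg_field_2 = (word >> 16) & 0x1F
    immediate = word & 0xFFFF
    if immediate & 0x8000:            # sign-extend the 16-bit immediate
        immediate -= 0x10000
    return opcode, reg_field_1, reg_field_2, immediate


# opcode "bne", register $28, register $30, displacement -2
BNE_WORD = 0b000101_11100_11110_1111111111111110
fields = decode_i_type(BNE_WORD)
```

Here `fields` is `(5, 28, 30, -2)`: the two registers to be compared and the displacement to be added to the program count value.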
FIG. 19 is a block diagram showing a specific structure of the R3000 shown in FIG. 14. As shown in the diagram, the R3000 is provided with a comparator 100 for branch command control. The comparator 100 receives register data 6a and register data 6b, and outputs the comparison result of the register data 6a and register data 6b to the program counter 4 as a comparative result signal 101.
The control circuit 5 usually outputs a control signal 23 of "L", and when the command CMD instructs "bne" command, the control signal 23 of "H" is outputted to the program counter 4.
The program counter 4 receives the control signal 23, control data 7b, and comparative result signal 101, and outputs the value adding the control data 7b to the present program count value 9 as a new program count value 9 when the control signal 23 is "H" and the comparative result signal 101 indicates disagreement, and adds 1 to the program count value 9 and outputs as a new program count value 9 when the control signal is "H" and the comparative result signal 101 indicates agreement. The program counter 4 sequentially increases the program count value 9 and outputs when the control signal 23 is "L".
The processing method of command 4 and command 5 in FIG. 18 by the R3000 shown in FIG. 19 is described below.
1) When processing "addi" command of command 4 in FIG. 18
The control circuit 5 decodes the command CMD, sets "11100" as the write register address signal 18 and "11100" as the read register address signal 16a, and sets the immediate value 1 to be added as the control data 7a. The register file 1 outputs the stored data value (D28) of register $28 as the register data 6a, on the basis of the read register address signal 16a and write register address signal 18, and specifies register $28 as the write register. Consequently, the ALU 2 outputs the ALU operation result 8 obtained by summing up the register data 6a (D28) and control data 7a (1) to the register file 1, and therefore the ALU operation result 8 is written as the stored data value of the register $28 in the register file 1.
2) When processing "bne" command of command 5 in FIG. 18
The control circuit 5 decodes the command CMD, sets "11100" as the read register address signal 16a and "11110" as the other read register address signal 16b, and sets the immediate value -2 to be added to the program count value 9 as control data 7b. The control circuit 5 also outputs the control signal 23 of "H". The register file 1 outputs the stored data value (D28) of register $28 as the register data 6a on the basis of the read register address signals 16a and 16b, and outputs the stored data value (D30) of register $30 as register data 6b. The comparator 100 compares the register data 6a (D28) and register data 6b (D30), and outputs the result of comparison to the program counter 4 as the comparative result signal 101.
As a result, the program counter 4 which receives the control signal 23 of "H" outputs the value ("3") obtained by adding control data 7b ("-2") to the present program count value 9 ("5") as a new program count value 9 when the comparative result signal 101 indicates disagreement, and adds 1 to the present program count value 9 ("5") and outputs the result as a new program count value 9 ("6") when the comparative result signal 101 indicates agreement.
In this way, in a conventional computer such as the R3000, when performing the call-return process as shown in FIG. 17, the called series of commands A was always required to execute command 10 and command 11. Similarly, in the case of loop processing, as shown in FIG. 18, it was required to execute the loop control commands, that is, command 4 and command 5.
That is, every time the call-return process or loop processing is executed, two commands must always be executed regardless of the content of the commands to be executed, which is inefficient.
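The update rule of the program counter 4 can be summarized in one function. This is a behavioral sketch only; the function and parameter names are hypothetical, with the reference numerals kept in the names to show which signal each one stands for.

```python
def next_program_count(count_9, control_23, data_7b, registers_agree):
    """Return the new program count value 9.

    control_23      -- "H" while a "bne" command is executed, otherwise "L"
    data_7b         -- displacement supplied by the control circuit 5
    registers_agree -- comparative result signal 101 (True means agreement)
    """
    if control_23 == "H" and not registers_agree:
        return count_9 + data_7b      # branch is taken
    return count_9 + 1                # sequential execution


taken = next_program_count(5, "H", -2, registers_agree=False)
not_taken = next_program_count(5, "H", -2, registers_agree=True)
```

For command 5 at address 5, `taken` is 3 (back to command 3) and `not_taken` is 6 (on to command 6), as in the description above.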