Conventionally, in a pipelined processor, forwarding (FWD) control is used for improving processing performance. Forwarding control refers to the forwarding of data, from a stage (for example, a memory access (MEM) stage, or a write back (WB) stage) in which an execution result of a preceding instruction is outputted, to a stage in which succeeding instruction data is fetched (EX stage). Since this bypasses the data resulting from an execution of the preceding instruction, it becomes possible to solve or reduce data hazards.
FIG. 1 is a diagram showing a configuration of a conventional pipelined processor having a forwarding control mechanism. In the figure, bold lines mainly represent data, and thin lines mainly represent control signals.
As shown in the figure, the conventional pipelined processor having a forwarding control mechanism includes: an instruction decoding unit 910, an instruction control unit 920, an instruction execution unit 930, and a register file 940. The instruction control unit 920 includes: a FWD control circuit 921, a register file write circuit 922, a pipeline buffer control circuit 923, an operation processing control circuit 924, and a memory access control circuit 925. The instruction execution unit 930 includes: an operation processing execution circuit 931, a memory access execution circuit 932, a FWD selector 933, a MEM selector 934, a pipeline buffer (EX) 935, a pipeline buffer (MEM) 936, and a pipeline buffer (WB) 937. The register file 940 includes a data holding unit 942 including plural registers (Reg#0 to Reg#N) managed by register number (#0 to #N). In addition, the pipeline includes: a decode (DEC) stage, an instruction dispatch and register fetch (ID) stage, an execute (EX) stage, a memory access (MEM) stage, and a write back (WB) stage.
First, the operation of the conventional pipelined processor at the time of execution of an instruction shall be described according to each stage of the pipeline.
In the DEC stage, the instruction decoding unit 910 generates instruction decoding information to be used in the ID stage and subsequent stages, and outputs the instruction decoding information to the instruction control unit 920.
In the ID stage, input data for executing the instruction is generated by reading, in accordance with the instruction decoding information, register data from the register file 940, and the data is outputted to the pipeline buffer (EX) 935.
In the EX stage, in accordance with the instruction decoding information, either the operation processing control circuit 924 or the memory access control circuit 925 generates a control signal for instruction execution input data stored in the pipeline buffer (EX) 935, and causes the operation processing execution circuit 931 or the memory access execution circuit 932 to operate.
In addition, the pipeline buffer control circuit 923 opens the pipeline buffer (MEM) 936, and stores, in the pipeline buffer (MEM) 936, the execution result of an instruction to perform operation processing, which is outputted from the operation processing execution circuit 931.
In the MEM stage, in accordance with the instruction decoding information, the pipeline buffer control circuit 923 generates a selection control signal so that either the value of the pipeline buffer (MEM) 936 or the output of the memory access execution circuit 932 that is the execution result of an instruction to execute memory access is selected, and outputs the selection control signal to the MEM selector 934.
In addition, the pipeline buffer control circuit 923 opens the pipeline buffer (WB) 937, and stores the instruction execution result outputted from the MEM selector 934.
In the WB stage, in accordance with the instruction decoding information, the register file write control circuit 922 generates a write control signal for the register file 940, and writes the instruction execution result outputted from the pipeline buffer (WB) 937 into the register file 940; thereby data in the data holding unit 942 is updated.
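As a rough illustration of the stage-by-stage flow described above, the following Python sketch walks a single instruction through the ID, EX, MEM, and WB stages. It is a simplified software model and not the circuit itself: the `Instr` record, the `run_single` function, the dictionary-based memory, and the two-operation instruction set ("add" and "ld") are all assumptions introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Instr:
    """Hypothetical stand-in for the instruction decoding information."""
    op: str                 # "add" or "ld" (assumed two-operation instruction set)
    dst: int                # register number written in the WB stage
    srcs: Tuple[int, ...]   # register numbers read in the ID stage


def run_single(instr: Instr, regs: List[int], memory: Dict[int, int]) -> Dict[str, object]:
    """Walk one instruction through ID, EX, MEM, and WB; return the buffer contents."""
    # ID stage: read register data and store it in the pipeline buffer (EX).
    buf_ex = [regs[r] for r in instr.srcs]

    # EX stage: either operation processing or memory access is started.
    buf_mem: Optional[int] = None   # pipeline buffer (MEM): operation processing result
    mem_out: Optional[int] = None   # output of the memory access execution
    if instr.op == "add":
        buf_mem = buf_ex[0] + buf_ex[1]
    elif instr.op == "ld":
        mem_out = memory[buf_ex[0]]
    else:
        raise ValueError(f"unsupported operation: {instr.op}")

    # MEM stage: the MEM selector picks the buffer value or the memory access
    # output, and the selected value is stored in the pipeline buffer (WB).
    buf_wb = mem_out if instr.op == "ld" else buf_mem

    # WB stage: the execution result is written into the register file.
    regs[instr.dst] = buf_wb
    return {"EX": buf_ex, "MEM": buf_mem, "WB": buf_wb}


# Usage: a load into Reg#0 followed by an add, executed one at a time.
regs = [0] * 32
regs[31], regs[1] = 100, 5          # assumed initial register values
memory = {100: 7}                   # assumed memory contents
run_single(Instr("ld", 0, (31,)), regs, memory)    # regs[0] is now 7
run_single(Instr("add", 2, (0, 1)), regs, memory)  # regs[2] is now 12
```

Because each instruction here runs to completion before the next one starts, no data hazard can arise in this sketch; it only shows which buffer holds which value at each stage.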
Next, the forwarding control mechanism shall be described.
The FWD control circuit 921 judges whether or not each of the registers that are written in accordance with instructions executed as preceding instructions in the EX stage, MEM stage, and WB stage matches the register that is read according to the ID-stage instruction executed as a succeeding instruction. The operations are divided into the following (1) to (4) according to the result of the judgment.
(1) In the case where, as a result of the judgment, the register that is read according to the ID-stage instruction does not match the register that is written according to each of the EX-stage, MEM-stage, and WB-stage instructions, the FWD control circuit 921 generates a selection control signal so that register data having been read from the register file 940 is selected, and outputs the selection control signal to the FWD selector 933.
(2) In the case where the register that is read according to the ID-stage instruction matches the register that is written according to the EX-stage instruction, the instruction execution result for the register to be read out is not yet properly written; therefore, the FWD control circuit 921 suspends the pipeline.
(3) In the case where the register that is read according to the ID-stage instruction matches the register that is written according to the MEM-stage instruction, there is a path for forwarding the instruction execution result from the MEM stage to the ID stage; therefore, the FWD control circuit 921 generates a selection control signal so that the path for forwarding the instruction execution result from the MEM stage to the ID stage is selected as the source of the input data for executing the instruction, and outputs the selection control signal to the FWD selector 933. With this, forwarding is performed from the MEM stage to the ID stage.
(4) In the case where the register that is read by the ID-stage instruction matches the register that is written according to the WB-stage instruction, there is a path for forwarding the instruction execution result from the WB stage to the ID stage; therefore, the FWD control circuit 921 generates a selection control signal so that the path for forwarding the instruction execution result from the WB stage to the ID stage is selected, and outputs the selection control signal to the FWD selector 933. With this, forwarding is performed from the WB stage to the ID stage.
The pipeline buffer control circuit 923 opens the pipeline buffer (EX) 935 and stores the instruction execution input data outputted by the FWD selector 933.
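The judgment in cases (1) to (4) can be sketched as a small selection function. This is a minimal sketch under stated assumptions, not the actual FWD control circuit 921: the function name `fwd_select`, the string labels it returns, and the use of `None` for a stage whose instruction writes no register are all hypothetical. The description above does not say which match takes priority when more than one stage writes the same register; the sketch assumes the youngest preceding instruction (EX, then MEM, then WB) wins, since it holds the most recent value.

```python
from typing import Optional


def fwd_select(read_reg: int,
               ex_dst: Optional[int],
               mem_dst: Optional[int],
               wb_dst: Optional[int]) -> str:
    """Decide the source of one ID-stage operand.

    read_reg : register number read by the ID-stage instruction
    ex_dst   : register written by the EX-stage instruction, or None
    mem_dst  : register written by the MEM-stage instruction, or None
    wb_dst   : register written by the WB-stage instruction, or None
    """
    if ex_dst is not None and read_reg == ex_dst:
        return "STALL"      # case (2): result not yet available, suspend the pipeline
    if mem_dst is not None and read_reg == mem_dst:
        return "FWD_MEM"    # case (3): forward from the MEM stage to the ID stage
    if wb_dst is not None and read_reg == wb_dst:
        return "FWD_WB"     # case (4): forward from the WB stage to the ID stage
    return "REGFILE"        # case (1): no match, use data read from the register file


# Usage: the add instruction reads Reg#0 while the load that writes Reg#0
# moves through the EX, MEM, and WB stages in successive cycles.
for stage_dsts in [(0, None, None), (None, 0, None), (None, None, 0)]:
    print(fwd_select(0, *stage_dsts))
```

An instruction with two source registers would call the function once per operand and suspend the pipeline if either call returns "STALL".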
Furthermore, an exemplary instruction sequence in which data is forwarded shall be described with reference to FIGS. 2A to 2D.
FIG. 2A shows an example of an instruction sequence having data dependency. In the figure, a preceding load (ld) instruction instructs to read data from memory, using the value of Reg#31 as an address, and to load the read-out data into Reg#0. A succeeding add instruction instructs to add the values of Reg#0 and Reg#1 and to store the addition result in Reg#2.
FIG. 2B is an example showing the timing of forwarding, particularly the forwarding from the MEM stage to the ID stage. The diagram shows the pipeline stages and execution cycles when the instruction sequence in FIG. 2A is executed. However, the stages prior to the DEC stage are omitted. The above-described load instruction is pipeline-processed sequentially in the ID stage, the EX stage, the MEM stage, and the WB stage in 4 cycles from t1 to t4, without generating a hazard. The above-described add instruction is processed through the four stages in 5 cycles from t2 to t6, generating a hazard in cycle t3. This is because in cycle t3 the execution of the preceding load instruction (to read the data to be stored in Reg#0 from memory) is not yet completed, and therefore the input data required for the execution of the succeeding add instruction (data to be stored in Reg#0) cannot yet be read from the register file. However, in cycle t3 (specifically in the latter half thereof), the input data (data to be stored in Reg#0) is forwarded from the MEM stage of the preceding load instruction to the ID stage of the succeeding add instruction, that is, from the MEM selector 934, through the FWD selector 933, to the pipeline buffer (EX) 935. With this, when the cycle proceeds from t3 to t4, the succeeding add instruction can be transferred from the ID stage to the EX stage even though the writing to Reg#0 is not yet completed.
Note that the timing of forwarding described in FIG. 2B is the same in the case where one simple instruction that does not include memory access (such as a NOP instruction, which performs no processing on the registers or the memory) is inserted between the load instruction and the add instruction in FIG. 2A. In this manner, the forwarding operation from the MEM stage to the ID stage is executed both for the instruction sequence shown in FIG. 2A and for the sequence in which one simple instruction is inserted between the load instruction and the add instruction. In the latter case, where the instruction is inserted, no hazard is generated.
FIG. 2C is an example showing the timing of forwarding, particularly the forwarding from the WB stage to the ID stage. The diagram shows the pipeline stages and execution cycles when, in FIG. 2A, two simple instructions that do not include memory access are inserted between the load instruction and the add instruction. In this example, the two inserted instructions are referred to as inst1 and inst2. The above-described add instruction is processed through the four stages in 4 cycles from t4 to t7. In cycle t4 (specifically in the latter half thereof), the input data (data to be stored in Reg#0) is forwarded from the WB stage of the preceding load instruction to the ID stage of the succeeding add instruction, that is, from the MEM selector 934, through the FWD selector 933, to the pipeline buffer (EX) 935. With this, when the cycle proceeds from t4 to t5, the succeeding add instruction can be transferred from the ID stage to the EX stage even though the writing to Reg#0 is not yet completed and Reg#0 cannot yet be read out. Thus, the forwarding operation from the WB stage to the ID stage is executed in the case where two simple instructions are inserted, for the instruction sequence shown in FIG. 2A, between the load instruction and the add instruction, and such an instruction sequence does not generate a hazard.
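The cycle counts in these timing examples can be checked with a small scheduling model. The following Python sketch is an assumption-laden simplification, not a description of the actual control circuits: the function `schedule`, the tuple format of the program, and the readiness rule are hypothetical. The model assumes that instructions are suspended only in the ID stage, that with forwarding an operand can be captured in the ID stage once its producer has reached the MEM stage (two cycles after the producer leaves the ID stage), and that without forwarding the operand can be read only in the cycle after the producer's WB stage (four cycles after the producer leaves the ID stage).

```python
from typing import Dict, List, Optional, Tuple

# (name, destination register or None, source registers); a hypothetical format
Program = List[Tuple[str, Optional[int], Tuple[int, ...]]]


def schedule(program: Program, forwarding: bool = True) -> List[Dict[str, object]]:
    """Return, per instruction, the cycles of ID entry, ID exit, and WB, plus stall count."""
    ready_delay = 2 if forwarding else 4   # assumed readiness rule, see lead-in
    writer_id_exit: Dict[int, int] = {}    # register -> ID-exit cycle of its latest writer
    timeline = []
    prev_id_exit = 0
    for name, dst, srcs in program:
        id_enter = prev_id_exit + 1        # in-order: ID is free once the predecessor left
        id_exit = id_enter
        for reg in srcs:
            if reg in writer_id_exit:
                # Stay in ID until the producer's result can be captured.
                id_exit = max(id_exit, writer_id_exit[reg] + ready_delay)
        timeline.append({
            "name": name,
            "id_enter": id_enter,
            "id_exit": id_exit,
            "wb": id_exit + 3,             # EX, MEM, WB follow without further stalls
            "stalls": id_exit - id_enter,
        })
        if dst is not None:
            writer_id_exit[dst] = id_exit
        prev_id_exit = id_exit
    return timeline


# Usage: the load/add sequence of FIG. 2A with zero, one, or two inserted instructions.
ld = ("ld", 0, (31,))
add = ("add", 2, (0, 1))
for filler in range(3):
    prog = [ld] + [(f"inst{i + 1}", None, ()) for i in range(filler)] + [add]
    print(filler, schedule(prog)[-1])
```

Under these assumptions the add instruction stalls for one cycle with no inserted instruction (ID in t2 and t3, WB in t6) and does not stall with one or two inserted instructions. Calling `schedule([ld, add], forwarding=False)` models a processor without a forwarding mechanism, the case taken up next with FIG. 2D.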
FIG. 2D is a diagram showing pipeline stages and execution cycles in the case where the instruction sequence in FIG. 2A is executed by a processor which does not have a forwarding mechanism. In this case, a hazard is generated for three cycles. This is because the succeeding add instruction reads Reg#0 in the ID stage after completion of the writing to Reg#0 in the WB stage of the preceding load instruction.
Non-Patent Reference 1: John L. Hennessy, David A. Patterson, "Computer Organization and Design: The Hardware/Software Interface 2nd edition (2)," issued by Nikkei Business Publications, Inc., Jun. 2, 2005, pp. 440-452