The present invention relates to a data processing device with a pipeline mechanism, featuring a pipeline bypass circuit for reducing the number of operation cycles.
Pipeline computers that execute several instructions in parallel are already in use. In pipeline processing, the operation process executed by a single instruction is divided into several operation stages that are chronologically independent of each other. Each operation stage of an instruction is executed by a corresponding operation unit. In each instruction cycle, several operation units execute mutually different operation stages in parallel.
FIG. 16(a) illustrates a pipeline computer that executes an instruction by decoding the instruction and reading its operand register in stage D, executing the first operation in stage E1, executing the second operation in stage E2, and writing the operation result into a register in stage W. When a preceding instruction 1 and a succeeding instruction 2 have no mutual dependence, the pipeline computer usually executes them in the sequence illustrated in FIG. 16(a).
When reading data from the source operand register of an instruction, a general pipeline computer must hold read processing until any preceding instruction that uses the same register as its target operand has completed writing into that register.
For example, suppose that this type of computer executes the following two instructions consecutively:
Instruction 1: r1=r2+r3
Instruction 2: r4=r1+r5
Instruction 1 adds the contents of register r3 to register r2 and writes the result into register r1, while instruction 2 adds the contents of register r5 to register r1 and writes the result into register r4.
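The dependence between the two instructions above can be illustrated with a minimal Python sketch. The names Instr and has_raw_dependence are hypothetical and are introduced here for illustration only; they do not correspond to any element of the described device.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical model of a register-to-register instruction."""
    dest: str               # target operand register
    srcs: Tuple[str, ...]   # source operand registers


def has_raw_dependence(preceding: Instr, succeeding: Instr) -> bool:
    """True if the succeeding instruction reads a register that the
    preceding instruction writes (a read-after-write dependence)."""
    return preceding.dest in succeeding.srcs


# Instruction 1: r1=r2+r3, Instruction 2: r4=r1+r5
instr1 = Instr(dest="r1", srcs=("r2", "r3"))
instr2 = Instr(dest="r4", srcs=("r1", "r5"))
```

In this sketch, instruction 2 depends on instruction 1 because r1 is both the target operand of instruction 1 and a source operand of instruction 2; the reverse check finds no dependence.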
A computer with a simple pipeline mechanism executes the above two instructions in the sequence shown in FIG. 16(b). The succeeding instruction 2 can start its first operation at stage D only after the last operation at stage W of the preceding instruction 1 is completed.
If the write data is confirmed in a cycle before write processing is completed, as for the above instruction string in FIG. 16(b), a source operand read path from the pipeline register is often provided so that the operand can be read before register write processing completes, minimizing the number of waiting cycles. This path is called a bypass. Setting a bypass changes the instruction execution sequence as shown in, for example, FIG. 16(c). Compared to the sequence shown in FIG. 16(b), where no bypass is used, processing is faster by one cycle.
FIG. 17 (prior art) illustrates a sample configuration of a conventional pipeline computer using a bypass circuit. Specifically, FIG. 17 (prior art) illustrates a register file (100), pipeline registers (101, 102) for storing source operands, an operation unit (103) for the first operation, a pipeline register (104) for holding output from the operation unit (103), an operation unit (105) for the second operation, a register (106) for holding output from the operation unit (105), a bypass control circuit (107) that holds the contents of several consecutive instructions and determines, under hardware control, whether pipeline bypass control is possible, selectors (108, 109), register read lines (110, 111), a register write line (112), bypass lines (113-115), and select signal lines (116, 117).
The above two instructions are executed as follows:
In this prior art example, the write data is confirmed by the time the output from the operation unit (105) is set in the register (106), that is, before the execution result of instruction 1 is written into the register file (100). Therefore, inputting the contents of the register (106) to the selectors (108, 109) through the bypass line (115) advances the source operand read cycle of instruction 2 by one cycle, as shown in FIG. 16(c), compared to processing without a bypass operation.
For a floating-point operation, the operation unit (103) executes digit alignment between two operands and a post-alignment operation. The operation unit (105) executes post-operation normalization. The final result cannot be obtained until execution at the operation unit (105) is completed. For an integer operation, however, the final result can be obtained from the operation at the operation unit (103) alone. Inputting the output from the pipeline register (104) to the selectors (108, 109) through the bypass line (114) advances the source operand read cycle of instruction 2 by two cycles, as shown in FIG. 16(d), compared to processing without a bypass operation.
If an operation result is confirmed partway through processing at the operation unit (103) and the corresponding signal can be read out of the operation unit (103) onto the line (113), inputting the confirmed data to the selectors (108, 109) through the bypass line (113) allows the source operand read processing for instruction 2 to be executed without delay, as shown in FIG. 16(e).
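The timing relationships described for FIG. 16(b) through FIG. 16(e) can be summarized in a minimal Python sketch. It assumes that instruction 1 occupies stages D, E1, E2, and W in cycles 1 through 4 and that each stage takes one cycle; the cycle numbers for instruction 2 are derived from the text above rather than taken from the figures themselves. The names D_CYCLE_OF_INSTR2, schedule, and cycles_saved are hypothetical.

```python
STAGES = ("D", "E1", "E2", "W")

# Assumed earliest cycle in which instruction 2 can execute stage D,
# keyed by the bypass line used (None means no bypass).
D_CYCLE_OF_INSTR2 = {
    None: 5,  # FIG. 16(b): wait until stage W of instruction 1 completes
    115: 4,   # FIG. 16(c): read from the register (106), one cycle earlier
    114: 3,   # FIG. 16(d): read from the pipeline register (104), two cycles earlier
    113: 2,   # FIG. 16(e): read during the first operation, no delay
}


def schedule(bypass_line=None):
    """Return the stage-to-cycle maps for instruction 1 and instruction 2."""
    instr1 = {stage: 1 + i for i, stage in enumerate(STAGES)}
    start = D_CYCLE_OF_INSTR2[bypass_line]
    instr2 = {stage: start + i for i, stage in enumerate(STAGES)}
    return instr1, instr2


def cycles_saved(bypass_line):
    """Cycles gained for instruction 2 relative to processing without a bypass."""
    return D_CYCLE_OF_INSTR2[None] - D_CYCLE_OF_INSTR2[bypass_line]


if __name__ == "__main__":
    for line in (None, 115, 114, 113):
        _, instr2 = schedule(line)
        print(line, instr2, cycles_saved(line))
```

Under these assumptions, the bypass line (113) case coincides with the schedule of two instructions that have no mutual dependence, in which instruction 2 enters stage D one cycle after instruction 1.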
For correct pipeline control under the conventional hardware control, data dependence between the instructions was analyzed and a pipeline register for storing write data was identified. Because this required complicated hardware control logic and large amounts of hardware, the design and testing of a pipeline computer took a great deal of time.
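The kind of decision such hardware control logic has to make can be sketched in Python as follows. This is a simplified illustration and not the actual logic of the bypass control circuit (107): it considers only the pipeline register (104) and the register (106), omits the bypass line (113), and assumes that an integer result held in the pipeline register (104) is final while a floating-point result there is not. The names InFlight and select_source are hypothetical.

```python
from typing import NamedTuple, Optional


class InFlight(NamedTuple):
    """Hypothetical record of a preceding instruction still in the pipeline."""
    dest: str         # target operand register
    is_integer: bool  # True if the result is final after the first operation


def select_source(src: str,
                  in_104: Optional[InFlight],
                  in_106: Optional[InFlight]) -> str:
    """Decide where a source operand should be read from.

    in_104: instruction whose first-operation output is held in the
            pipeline register (104), or None.
    in_106: instruction whose second-operation output is held in the
            register (106), or None.
    """
    # The most recent writer of the register takes priority.
    if in_104 is not None and in_104.dest == src:
        # Only an integer result is assumed final at this point.
        return "bypass_114" if in_104.is_integer else "stall"
    if in_106 is not None and in_106.dest == src:
        return "bypass_115"
    # No in-flight instruction writes this register.
    return "register_file"
```

Even in this reduced form, the selection depends on the target register, the pipeline position, and the operation type of each in-flight instruction, and it must be evaluated for each of the two selectors (108, 109).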