(1) Field of the Invention
The present invention relates to a pipeline processor system including a data memory device and a pipeline processor.
(2) Description of the Related Art
A pipeline architecture improves the efficiency of a microprocessor by permitting a number of sequential instructions to be in various execution stages simultaneously.
In the pipeline architecture, however, hazards arise from resource conflicts when the processor cannot support memory access instructions in simultaneous overlapped execution, and these hazards reduce performance below the ideal speedup gained by pipelining.
For example, assume that a processor implements a four-stage pipeline for instruction execution. The four stages are: instruction fetch, decode/operand fetch, execute, and write-back. A resource conflict can occur when the processor attempts to execute an instruction in the decode/operand fetch stage while another instruction in the write-back stage is in progress.
Thus, resource conflicts occur when the processor attempts to perform plural memory accesses in a single clock cycle. Specifically, for the above four-stage pipeline processor, hazards can occur in three combinations of instructions in simultaneous overlapped execution: 1) an instruction in the instruction fetch stage and an instruction in the operand fetch stage, 2) an instruction in the instruction fetch stage and an instruction in the write-back stage, and 3) an instruction in the operand fetch stage and an instruction in the write-back stage. These hazards cause stalls in the pipeline processor.
The stalls caused by the overlapped execution of the first and the second combinations of instructions can be prevented by duplication of resources. If the processor employs separate instruction and data memories, an instruction fetch and a data memory access can be performed in the same clock cycle.
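The three combinations, and the effect of separating the instruction and data memories, can be illustrated with a minimal sketch. The stage-to-memory mapping and the function name `conflicting_pairs` are assumptions introduced here for illustration only. The sketch assumes that the instruction fetch stage accesses instruction memory, that the decode/operand fetch and write-back stages access data memory, and that each memory serves one access per clock cycle.

```python
from itertools import combinations

# Assumed mapping: which memory each stage of the four-stage pipeline accesses.
# None means the stage performs no memory access.
STAGE_MEMORY = {
    "instruction fetch": "instruction",
    "decode/operand fetch": "data",
    "execute": None,
    "write-back": "data",
}


def conflicting_pairs(split_memories):
    """Return the stage pairs that contend for a memory in the same clock cycle.

    With a single unified memory, any two memory-accessing stages conflict.
    With separate instruction and data memories, only stages that access
    the same memory conflict.
    """
    pairs = []
    for a, b in combinations(STAGE_MEMORY, 2):
        mem_a, mem_b = STAGE_MEMORY[a], STAGE_MEMORY[b]
        if mem_a is None or mem_b is None:
            continue  # at least one stage makes no memory access
        if split_memories and mem_a != mem_b:
            continue  # the accesses go to different memories
        pairs.append((a, b))
    return pairs
```

Under these assumptions, `conflicting_pairs(split_memories=False)` lists all three combinations, while `conflicting_pairs(split_memories=True)` leaves only the decode/operand fetch and write-back pair, since both stages access the data memory.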
The third combination, however, requires that some instructions be allowed to proceed while others are delayed. Specifically, the earlier instruction in its write-back stage is allowed to proceed, while the instruction in its decode/operand fetch stage is delayed. In this case, the pipeline stalls the instruction in the decode/operand fetch stage until the required unit is available (Computer Architecture: A Quantitative Approach, pp. 257-278, 1998, Morgan Kaufmann Publishers, Inc.).
Thus, the stall caused by the third combination cannot be prevented even by the duplication of resources, and the stall degrades the pipeline performance from the ideal. Specifically, for the above four-stage pipeline processor, the pipeline must stall for one clock cycle whenever the resource conflict occurs. Consequently, instruction execution takes five clock cycles, including the one-clock-cycle pipeline delay, whereas it takes four clock cycles without the resource conflict.
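The four-cycle and five-cycle figures can be reproduced with a simplified timing sketch. The function names `schedule` and `latency` and the per-instruction flags are hypothetical. The sketch assumes one clock cycle per stage, in-order execution, and a data memory that serves one access per clock cycle. It gives priority to the earlier instruction in its write-back stage. It is an illustration of the stall described above, not a model of any particular processor.

```python
STAGES = ("IF", "OF", "EX", "WB")  # fetch, decode/operand fetch, execute, write-back


def schedule(program):
    """Assign a clock cycle to every stage of every instruction.

    program: list of (reads_memory, writes_memory) flags, one per instruction.
      reads_memory  - the operand fetch accesses the data memory
      writes_memory - the write-back accesses the data memory
    Returns one dict per instruction mapping stage name to the 1-based cycle
    in which that stage is performed.
    """
    timeline = []
    write_back_cycles = set()  # cycles in which a write-back uses the data memory
    for reads_memory, writes_memory in program:
        slots = {}
        cycle = 0
        for stage in STAGES:
            cycle += 1  # each stage follows the previous stage by one cycle
            if timeline:
                # In-order pipeline: stay behind the preceding instruction.
                cycle = max(cycle, timeline[-1][stage] + 1)
            if stage == "OF" and reads_memory:
                # The earlier write-back proceeds; the operand fetch is delayed.
                while cycle in write_back_cycles:
                    cycle += 1
            slots[stage] = cycle
        if writes_memory:
            write_back_cycles.add(slots["WB"])
        timeline.append(slots)
    return timeline


def latency(slots):
    """Clock cycles from instruction fetch through write-back, inclusive."""
    return slots["WB"] - slots["IF"] + 1
```

For example, with `program = [(False, True), (False, False), (True, False)]`, the operand fetch of the third instruction would fall in the same cycle as the write-back of the first, so it is delayed by one cycle and `latency(schedule(program)[2])` evaluates to 5. If the first instruction does not write to the data memory, the same call evaluates to 4.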