1. Field of the Invention
The present invention relates to a pipeline control system, particularly to a pipeline control system in a data processing unit of the pipeline control type wherein the processing is carried out at a high speed when a branching instruction is executed.
2. Description of the Related Art
Data processing units of the pipeline control type, which fetch and execute instructions in pipelined stages, face two problems: how to reduce the "waiting" time that arises when needed data or instructions are not in the cache memory, and how to switch processing smoothly when the instruction sequence is altered by the establishment of a branching condition or by an interruption.
The "waiting" problem can be solved to a certain degree by increasing the capacity of the cache memory. However, the problem of altered instruction sequences, namely how to quickly read the instruction at the branching address of a branching instruction, remains even with a larger cache memory.
Meanwhile, the recent trend toward high-density integration of logic circuits must be considered when implementing logical functions, especially as the number of logic elements increases. If the hardware can be arranged to reduce the number of input/output terminals of each logical block, the performance of the system as a whole can be improved.
The prior art is explained first, and its problems are then discussed. FIG. 1 is a schematic diagram of pipeline operation in a data processing unit with pipeline control according to the prior art. In this figure, I1-I3 are pipeline stages for instruction fetch, while P1-P6 are pipeline stages for execution of instructions, namely operand fetch and calculation.
The general operations of a data processing unit with pipeline control are explained with reference to FIG. 1. First, the start address of the microprogram to be executed is loaded into the instruction address register (hereinafter referred to as IAR) 1A from a service processor (not shown) in stage I1 of the pipeline of this data processing unit. In this case, since "0" is loaded into the instruction fetch constant register (hereinafter referred to as IFKR) 1B, the content of IAR 1A is loaded unchanged into the execution address register (hereinafter referred to as EAR) 3 through the adder (A) 2A in stage I2. The cache memory 4 is accessed using the address in EAR 3, and the relevant micro instructions are read out and loaded into the instruction word register (hereinafter referred to as IWR) 5 at the end of stage I3.
After the first instruction is read as explained above, a fixed value of "8" is loaded into IFKR 1B from the instruction fetch control part (IFC) 1 and added to the content of IAR 1A by the adder (A) 2A; the resulting execution address is loaded into the EAR 3. As a result, micro instructions are read into IWR 5 in units of 8 bytes. The IWR 5 is generally composed of a multi-stage shift register, and the boundary addresses of the plural instructions stored in the IWR 5 can be obtained from a pointer register (not shown).
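The fetch address sequence described above can be illustrated with a minimal sketch. It assumes that the sum produced by the adder (A) 2A is written back to IAR 1A after each fetch; the text does not state this explicitly, and the function name `fetch_addresses` is hypothetical.

```python
FETCH_WIDTH = 8  # fixed value loaded into IFKR 1B after the first read


def fetch_addresses(start_address, count):
    """Return the successive EAR values for `count` instruction fetches.

    The first fetch uses IFKR = 0, so EAR equals the start address in IAR.
    Every later fetch uses IFKR = 8.
    """
    iar = start_address
    ears = []
    for i in range(count):
        ifkr = 0 if i == 0 else FETCH_WIDTH
        ear = iar + ifkr  # adder (A) 2A
        ears.append(ear)
        iar = ear  # assumption: IAR follows the most recent fetch address
    return ears


if __name__ == "__main__":
    print([hex(a) for a in fetch_addresses(0x100, 4)])
```

Under these assumptions, each fetch after the first advances the address by exactly 8 bytes, which is why IWR 5 is filled in 8-byte units.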
A selector (SEL) 6 selects the instruction to be executed in the pipeline in accordance with the address indicated by the pointer register.
When the relevant instruction is selected by the selector (SEL) 6 in stage P1 of the pipeline, the operation code area of the instruction is shifted through the operation code registers P2OP-P6OP (8), which correspond to stages P2-P6, and is used when the instruction is executed at the respective stages.
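The movement of an operation code through the operation code registers for stages P2-P6 can be pictured as a five-entry shift register. The following is a minimal sketch only; the list representation and the function name `shift_opcodes` are illustrative assumptions, not part of the described hardware.

```python
NUM_OPCODE_REGISTERS = 5  # one register each for stages P2, P3, P4, P5 and P6


def shift_opcodes(registers, new_opcode):
    """Advance the opcode registers by one cycle.

    `registers[0]` models the stage P2 register and `registers[-1]` the
    stage P6 register. Returns the new register contents and the opcode
    that leaves stage P6 in this cycle.
    """
    leaving = registers[-1]
    return [new_opcode] + registers[:-1], leaving


if __name__ == "__main__":
    regs = [None] * NUM_OPCODE_REGISTERS
    for op in ["A", "B", "C", "D", "E", "F"]:
        regs, leaving = shift_opcodes(regs, op)
        print(regs, "leaving:", leaving)
```

In this model an opcode entered in one cycle reaches the stage P6 register four cycles later and leaves it on the following shift.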
The register designation area of the instruction is decoded, the general purpose register is accessed based on the decoded address and the base address and index value are respectively read into the base register (BR) 9 and index register (XR) 10. The displacement designation area of the instruction is loaded into the displacement register (DR) 11.
At stage P2, the contents of the base register (BR) 9, the index register (XR) 10 and the displacement register (DR) 11 are added in the adder (B) 12. The resulting operand address is stored in the address register (P3TAR) 13 of stage P3, and the operand fetch is carried out by accessing the cache memory 4.
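The operand address calculation at stage P2 is a three-input addition. As a small worked illustration, with hypothetical register values and a hypothetical function name:

```python
def operand_address(base, index, displacement):
    """Model adder (B) 12: sum of the BR 9, XR 10 and DR 11 contents."""
    return base + index + displacement


if __name__ == "__main__":
    # Hypothetical values: base 0x1000, index 0x20, displacement 0x4.
    print(hex(operand_address(0x1000, 0x20, 0x4)))
```

The sketch ignores address width and wrap-around, which the text does not specify.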
The result of the operand fetch is read into the operand word register (OWR) 15 at the end of stage P4 and is processed by the operation circuit 16 in the next stage P5. The result of the operation is loaded into the result register (RR) 17 at the beginning of stage P6 and is stored in the general purpose register 7 in stage P6.
At stage P1 mentioned above, the following operations are carried out in addition to instruction execution address calculation.
Namely, each time the instruction execution address is calculated, the fixed value (for example, "8") loaded into the instruction fetch constant register (IFKR) 1B is accumulated in the counter (CTR) 1C, once for each addition by the adder (A) 2A.
The operation code of the instruction leaving the pipeline is detected by the instruction operation part register (P6OP) in stage P6 and is then sent to the instruction fetch control part (IFC) 1. The length (number of bytes) of the instruction leaving the pipeline can thus be determined, and it is subtracted from the value stored in counter (CTR) 1C by the adder (C) 2B. As a result, the counter (CTR) 1C holds the total length of all instructions being processed in the pipeline, which means that [IAR 1A] - [counter (CTR) 1C] = address of the instruction in stage P6.
The reason why this value is necessary will be explained later. When the value is needed, it is obtained by subtracting the value of counter (CTR) 1C from IAR 1A in the adder (A) 2A.
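The bookkeeping performed by the counter (CTR) 1C can be sketched as follows. This is a simplified model, assuming that IAR 1A advances by the IFKR value on each sequential fetch; the class and method names are hypothetical.

```python
class FetchBookkeeping:
    """Simplified model of IAR 1A and counter (CTR) 1C."""

    def __init__(self, start_address):
        self.iar = start_address  # shared instruction address register
        self.ctr = 0              # bytes fetched but not yet retired

    def fetch_next(self, ifkr=8):
        # Adder (A) 2A adds IFKR to IAR; CTR accumulates the same value.
        self.iar += ifkr
        self.ctr += ifkr

    def retire(self, length):
        # Adder (C) 2B subtracts the length of the instruction leaving P6.
        self.ctr -= length

    def p6_address(self):
        # [IAR 1A] - [CTR 1C] = address of the instruction in stage P6
        return self.iar - self.ctr


if __name__ == "__main__":
    b = FetchBookkeeping(0x100)
    b.fetch_next()
    print(hex(b.p6_address()))
    b.retire(4)  # a 4-byte instruction leaves the pipeline
    print(hex(b.p6_address()))
```

In this model, further fetches leave the computed address unchanged because IAR and CTR grow together, while each retirement moves it forward by the length of the retired instruction.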
The prior art has the feature that the instruction address register for reading the instruction and the instruction address register indicating the address of the instruction being executed in the pipeline are used in common.
The operations for general instructions in the pipeline are explained above. The operations for branching instructions, with which the present invention is concerned, are explained below as performed in the prior art, with reference to the time chart of FIG. 3.
When the instruction (indicated by n) read into IWR 5 is a branching instruction, the operand address calculated by the adder (B) 12 indicates the branching address. This address is shifted through the target address registers P3TAR-P6TAR during the respective stages P3-P6. Simultaneously, the branching address is loaded into the EAR 3 through the OR circuit 2C at stage P2 (corresponding to stage I1 for the branching address instruction). Thereafter, the cache memory 4 is accessed to read the branching address instructions (indicated by m, m+1, . . . ).
At stage P4 of the branching instruction n, the operation of the instruction just preceding it (indicated as n-1) is being carried out (instruction n-1 is at stage P5). If the branching condition is established by the result of that operation, the branching address instruction being accessed is read into the IWR 5 and is executed at the respective stages P1-P6.
In the time chart of FIG. 3, the operations in the pipeline (at stages P1-P6) of the branching address instructions are indicated by m and m+1 (here, the 8 bytes read are assumed to be two 4-byte instructions).
In the prior art system, the branching address instruction m+2 is read as follows. The branching address stored in the target address register P6TAR in stage P6 of the branching instruction n is loaded into the IAR 1A through the OR circuit 19. The value of IFKR 1B ("8") is added in the adder (A) 2A at stage I1 of the branching address instruction, and the execution address 8 bytes after the address of the branching address instruction m is loaded into the EAR 3 at stage I2. The branching address instruction m+2 is then read by addressing the cache memory 4.
In short, in the prior art system, stages I1-I3 for reading the branching address instruction m+2 are executed only after the branching instruction n has been processed at stage P6. Since the m+2 instruction is executed at stages P1-P6 only thereafter, its execution is delayed by 3 cycles relative to the m+1 instruction, as is apparent from the time chart of FIG. 3.
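The 3-cycle delay can be reproduced with a small cycle-count sketch. The cycle numbering below is an assumption reconstructed from the description, not taken from FIG. 3: stage P1 of the branching instruction n is placed at cycle 1, the fetch of m overlaps stages P2-P4 of n, and stage I1 for m+2 begins in the cycle after stage P6 of n. The function name is hypothetical.

```python
def prior_art_branch_timing(n_p1_cycle=1):
    """Return the cycles at which key events occur around a taken branch."""
    n_p6 = n_p1_cycle + 5      # n occupies P1..P6 in consecutive cycles
    m_p1 = n_p1_cycle + 4      # I1-I3 of m overlap P2-P4 of n; m enters P1 next
    m1_p1 = m_p1 + 1           # m+1 comes from the same 8-byte fetch
    m2_i1 = n_p6 + 1           # assumption: I1 of m+2 starts after P6 of n
    m2_p1 = m2_i1 + 3          # I1, I2, I3, then P1
    ideal_m2_p1 = m1_p1 + 1    # P1 cycle of m+2 if there were no gap
    return {
        "n_p6": n_p6,
        "m_p1": m_p1,
        "m1_p1": m1_p1,
        "m2_p1": m2_p1,
        "delay": m2_p1 - ideal_m2_p1,
    }


if __name__ == "__main__":
    print(prior_art_branch_timing())
```

Under these assumptions the gap between m+1 and m+2 equals the three instruction fetch stages I1-I3, which matches the delay stated above.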
The prior art has no problem with the execution timing of the branching address instruction that is read in advance when the branching condition is established by executing the branching instruction. It does, however, have a problem with the timing of reading the branching address instructions in the succeeding 8 bytes from the cache memory (that is, stage I1 of the branching address instruction m+2 starts only after execution of the branching instruction n reaches stage P6).
As explained above, the branching address must be held in the target address registers P3TAR-P6TAR until stage P6 of the branching instruction n. This is because the presence of an interruption during execution of the branching instruction n, as well as the operation result, is checked in stage P6; accordingly, the content of IAR 1A cannot be changed at least until the branching instruction n leaves the pipeline.
For example, if an error is detected in stage P6, execution of the branching instruction n is invalidated and the instruction n must be retried; the retry address must be obtained from IAR 1A.
The disadvantages of the prior art system can be summarized as follows. A time delay arises in reading the instructions 8 bytes after the branching address, for two reasons. First, the instruction address register for reading instructions and the instruction address register indicating the instruction being executed in the pipeline are used in common. Second, the address of the branching instruction must be held by some means until the branching instruction has completed execution, even after the branch is taken, because an interruption or error may occur during pipeline processing.
The route from the result register (RR) 17 to the OR circuit 19 is used, for example, when branching is carried out by a load PSW (program status word) instruction.