1. Field of the Invention
The present invention relates to boosting control technology for use with a pipeline processor apparatus.
2. Description of the Related Art
To allow a computer system to execute instructions at high speed, the processor should be controlled so that its instruction pipeline is idle for as short a time as possible and is not stalled. In particular, a VLIW (Very Long Instruction Word) system and a super scalar processor are provided with hardware that executes a plurality of operations at a time. To use this hardware effectively, the operations to be executed should be supplied to the processor without interruption. To satisfy this requirement, a compiler that generates execution code for the processor analyzes the data dependencies of the software and, according to the analyzed result, changes the sequence of the operations without changing the purpose of the software, so as to optimize the execution code.
To further improve the operating efficiency of the hardware, a technology called boosting has been employed. This technology will be described next.
FIG. 1 is a schematic diagram for explaining a fundamental pipeline process that a pipeline processor executes.
In FIG. 1, instructions A to D are executed at each instruction cycle by an instruction fetch unit, a register read unit, an arithmetic and logic unit (ALU), a memory access unit, and a register write unit (these are not shown).
At the instruction cycle 1, an instruction fetch stage (referred to as the F stage) for the instruction A is executed by the instruction fetch unit.
At the instruction cycle 2, a register read stage (referred to as the R stage) for the instruction A is executed by the register read unit. In addition, the F stage for the instruction B, which follows the instruction A, is executed by the instruction fetch unit.
At the instruction cycle 3, an arithmetic and logic operation stage (referred to as the A stage) for the instruction A is executed by the arithmetic and logic unit. The R stage for the instruction B is executed by the register read unit. In addition, the F stage for the instruction C, which follows the instruction B, is executed by the instruction fetch unit.
At the instruction cycle 4, a memory access stage (referred to as the M stage) for the instruction A is executed by the memory access unit. The A stage for the instruction B is executed by the arithmetic and logic unit. The R stage for the instruction C is executed by the register read unit. In addition, the F stage for the instruction D, which follows the instruction C, is executed by the instruction fetch unit.
At the instruction cycle 5, a register write stage (referred to as the W stage) for the instruction A is executed by the register write unit. The M stage for the instruction B is executed by the memory access unit. The A stage for the instruction C is executed by the arithmetic and logic unit. The R stage for the instruction D is executed by the register read unit. In addition, the F stage for an instruction E (not shown), which follows the instruction D, is executed by the instruction fetch unit.
In the pipeline process composed of five stages, namely the F stage, the R stage, the A stage, the M stage, and the W stage, it is assumed that the operation result for each instruction can be used after the W stage for the instruction is finished. In addition, it is assumed that a branch condition designated by a branch instruction is determined in the R stage.
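The cycle-by-cycle description above can be reproduced with a short sketch. It assumes an ideal pipeline with no stalls, in which each instruction enters the F stage one cycle after its predecessor; the function name `pipeline_schedule` is hypothetical and is introduced only for illustration.

```python
# Stage order of the five-stage pipeline described in the text.
STAGES = ["F", "R", "A", "M", "W"]


def pipeline_schedule(instructions):
    """Return {cycle: {instruction: stage}} for an ideal, stall-free pipeline."""
    schedule = {}
    for index, name in enumerate(instructions):
        for offset, stage in enumerate(STAGES):
            # Instruction cycles are numbered from 1, as in the text.
            cycle = index + offset + 1
            schedule.setdefault(cycle, {})[name] = stage
    return schedule


schedule = pipeline_schedule(["A", "B", "C", "D"])
for cycle in sorted(schedule):
    print(cycle, schedule[cycle])
```

At cycle 4 the sketch places instruction A in the M stage, B in the A stage, C in the R stage, and D in the F stage, which agrees with the description of the instruction cycle 4.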
Next, as example 1, the execution of an instruction sequence shown in FIG. 2 will be described.
In this example, to determine the branch condition of a branch instruction "br r2,0,L1", the operation result of an operation instruction "add r2,r3,r4" that operates on the value of a register r2, is required. Thus, as shown in FIG. 3, the R stage for the branch instruction cannot be executed until the W stage for the operation instruction is finished. As a result, an idle stage, in which no instruction is executed, takes place in the pipeline process. In FIG. 3, when the instruction sequence is executed, a no-operation instruction (referred to as a nop instruction) is inserted in the instruction sequence by an interlock portion of the hardware. Alternatively, when the compiler generates the instruction sequence, it places the nop instruction in the instruction sequence.
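The length of such a stall can be estimated with a minimal sketch under the two assumptions stated earlier: a result is usable only after the W stage of the producing instruction is finished, and the branch condition is determined in the R stage. The function name `stall_cycles` and the choice of fetch cycles are hypothetical; in particular, the sketch assumes that the branch instruction is fetched in the cycle immediately after the operation instruction, which the figures may or may not show.

```python
# Stage order of the five-stage pipeline described in the text.
STAGES = ["F", "R", "A", "M", "W"]


def stall_cycles(producer_fetch_cycle, consumer_fetch_cycle):
    """Idle cycles before the consumer's R stage under the stated assumptions."""
    # Cycle in which the producing instruction executes its W stage.
    producer_w_cycle = producer_fetch_cycle + STAGES.index("W")
    # Without a dependency, the consumer would read registers right after fetch.
    ideal_r_cycle = consumer_fetch_cycle + STAGES.index("R")
    # The result is usable only after W has finished, i.e. from the next cycle.
    earliest_r_cycle = max(ideal_r_cycle, producer_w_cycle + 1)
    return earliest_r_cycle - ideal_r_cycle


# Operation instruction fetched in cycle 1, branch instruction in cycle 2.
print(stall_cycles(1, 2))
```

Under these assumptions the R stage of the branch instruction is delayed from cycle 3 to cycle 6, so the sketch reports three idle cycles.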
In the state shown in FIG. 3, since the nop instruction is inserted, the execution speed of the entire pipeline process decreases. A technique that prevents the execution speed from decreasing is known. In this technique, the compiler moves an instruction to be branched to the position that just precedes a branch instruction, so that the instruction to be branched is executed without waiting for the execution of the branch instruction. When a branch condition determined as the result of the execution of the branch instruction is satisfied (taken), the execution result of the instruction to be branched that has been executed is validated. When the branch condition is not satisfied (not taken), the execution result of the instruction to be branched that has been executed is cancelled. This technique is called boosting. An instruction that is temporarily executed while it is not yet certain that it should be executed, and whose temporary execution is validated or invalidated according to the determined branch condition, is referred to as a boosted instruction. The compiler detects the instruction to be boosted according to the analyzed result of the program and the collected statistical information. The detected result is reflected in the execution code.
FIG. 4 is a list showing an instruction sequence including an instruction boosted from the taken side corresponding to the example 1 of the instruction sequence shown in FIG. 3. When an operation instruction "add r2,r5,r6" is boosted, it is executed as "add.b r2,r5,r6" before a branch instruction "br r2,0,L1". Thus, the number of nop instructions included in the instruction sequence to be executed can be decreased.
When the A stage of the operation instruction "add.b r2,r5,r6" is executed before the branch instruction "br r2,0,L1", the value to be written to the register r2 as the result of the execution is temporarily stored in the arithmetic and logic unit, but is not written to the register r2. This value is written to the register r2 when the W stage of the operation instruction is executed. Thus, when the R stage of the branch instruction "br r2,0,L1" is executed before the W stage of the operation instruction, the content of the register r2 has not yet been rewritten by the operation instruction, which was originally to be executed after the branch instruction.
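The validation and cancellation of a boosted result can be illustrated with a toy model. This is a sketch under assumptions, not the hardware described here: the class `BoostingRegisterFile`, its methods, and the initial register values are all hypothetical, and the holding area simply stands in for the temporary storage in the arithmetic and logic unit.

```python
class BoostingRegisterFile:
    """Toy register file with a holding area for boosted results (hypothetical)."""

    def __init__(self, initial):
        self.registers = dict(initial)
        # Entries are (register, value, expected_taken) awaiting the branch.
        self.pending = []

    def execute_boosted_add(self, dest, src1, src2, expected_taken):
        # Compute the result but hold it instead of writing the destination.
        value = self.registers[src1] + self.registers[src2]
        self.pending.append((dest, value, expected_taken))

    def resolve_branch(self, taken):
        # Validate results boosted from the side actually followed;
        # results boosted from the other side are cancelled (discarded).
        for dest, value, expected_taken in self.pending:
            if expected_taken == taken:
                self.registers[dest] = value
        self.pending.clear()


regs = BoostingRegisterFile({"r2": 0, "r5": 3, "r6": 4})
# Models an instruction boosted from the taken side, like "add.b r2,r5,r6".
regs.execute_boosted_add("r2", "r5", "r6", expected_taken=True)
value_seen_by_branch = regs.registers["r2"]  # the branch still reads the old r2
regs.resolve_branch(taken=True)
print(value_seen_by_branch, regs.registers["r2"])
```

With the assumed initial values, the branch reads 0 from r2 and the boosted result 7 is written only after the branch is resolved as taken. Calling `resolve_branch(taken=False)` instead would leave r2 unchanged. An instruction boosted from the not-taken side is modeled in the same way by passing `expected_taken=False`.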
When the number of stages is large as in a super pipeline system, or when a plurality of operations are executed at a time as in a VLIW system or a super scalar processor, many instructions may be executable before a branch condition is determined and their operation results become valid. Thus, the boosting technique works more effectively.
The compiler detects an instruction to be boosted according to the analyzed result of the program and the collected statistical information. As a result, the boosted instruction may be an instruction that just follows a branch instruction and that is executed when a branch condition of the branch instruction is not satisfied (namely, on the not-taken side) rather than an instruction to be branched that is executed when the branch condition of the branch instruction is satisfied (namely, on the taken side). For example, when the instruction sequence of the example 2 shown in FIG. 5 is executed, since an operation instruction "add r3,r5,r6" on the not-taken side is boosted, it is executed as an operation instruction "add.bn r3,r5,r6" before the branch instruction "br r2,0,L1" as shown in FIG. 6.
In addition, depending on the analyzed result by the compiler, an instruction on the taken side and another instruction on the not-taken side may be boosted at the same time. For example, when an instruction sequence shown in FIG. 7 is executed as example 3, since an operation instruction "add r3,r5,r6" on the not-taken side and an operation instruction "add r4,r7,r8" on the taken side are boosted, they are executed as operation instructions "add.bn r3,r5,r6" and "add.b r4,r7,r8" before the branch instruction "br r2,0,L1" as shown in FIG. 8.
In addition, depending on the analyzed result of the compiler, an instruction may be boosted after a plurality of branch instructions. For example, when an instruction sequence shown in FIG. 9 is executed as example 4, since an operation instruction "add r3,r5,r6" on the taken side corresponding to a first branch instruction "br r2,0,L1" is boosted, the operation instruction "add r3,r5,r6" is executed as an operation instruction "add.b r3,r5,r6" before the first branch instruction as shown in FIG. 10. In addition, when an operation instruction "add r4,r7,r8" on the taken side corresponding to a second branch instruction "br r3,0,L2" is boosted, the operation instruction "add r4,r7,r8" is executed as an operation instruction "add.b2 r4,r7,r8" before the second and first branch instructions.
However, in the conventional boosting method shown in FIGS. 4, 6, 8, and 10, to clearly represent the boosted instructions in the execution code, instruction codes that represent boosted instructions are required in addition to the conventional instruction codes.
For example, in the example shown in FIG. 4, the instruction code "add.b" is an instruction code boosted from the taken side corresponding to the conventional instruction code "add". In the example shown in FIG. 6, an instruction code "add.bn" boosted from the not-taken side is used. In the example shown in FIG. 8, both an instruction code "add.b" boosted from the taken side and an instruction code "add.bn" boosted from the not-taken side are used. In the example shown in FIG. 10, an instruction code "add.b2" boosted after a plurality of branch instructions is also used.
As described above, in the related art, instruction codes that represent boosted instructions are required in addition to the conventional instruction codes. Moreover, such instruction codes are required for each type of conventional instruction code. However, the types of instruction codes that are executable in the processor are limited by the instruction format defined by the processor. Thus, in the above-described conventional boosting method, instruction codes that represent boosted instructions may not be available. In particular, it is even more difficult to obtain instruction codes that represent boosted instructions while maintaining compatibility with conventional instruction codes. Conversely, to use more instruction codes that represent boosted instructions, the number of types of conventional instruction codes must be reduced.
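The pressure on the instruction code space can be illustrated with simple arithmetic. The sketch below assumes a hypothetical fixed-width opcode field of 6 bits and assumes that every conventional instruction code needs the same number of boosted variants (for example "add.b", "add.bn", and "add.b2" for "add"); neither figure is taken from any particular processor.

```python
def max_conventional_opcodes(opcode_bits, boosted_variants_per_opcode):
    """Conventional opcodes that fit when each also needs boosted variants."""
    total_codes = 2 ** opcode_bits
    # Each conventional opcode consumes one code for itself plus one per variant.
    return total_codes // (1 + boosted_variants_per_opcode)


# Hypothetical 6-bit opcode field: 64 codes in total.
for variants in (0, 1, 2, 3):
    print(variants, max_conventional_opcodes(6, variants))
```

Under these assumptions, the 64 available codes support 64 conventional instruction codes without boosting, but only 32, 21, and 16 when one, two, and three boosted variants are required per instruction code.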