1. Field of the Invention
The present invention relates to a processor that executes a plurality of instructions in parallel and to a program conversion apparatus for the same.
2. Description of the Related Art
In recent years, VLIW (Very Long Instruction Word) processors have been developed with the aim of achieving high-speed processing. These processors use long-word instructions composed of a plurality of instructions to execute a number of instructions in parallel.
Japanese Laid-Open Patent No. 5-11979 discloses an example of this kind of technique. FIG. 1 is a block diagram of a processor disclosed in this document.
The processor of FIG. 1 includes a register file 1, an external memory 2, an instruction register 3 having four instruction slots, an input switching circuit 4, a transfer unit 5, an integer calculation unit 6, a transfer unit 7, an integer calculation unit 8, an integer calculation unit 9, a floating-point unit 10, a branch unit 11, an output switching circuit 12 and a register file or external memory 13.
The instruction register 3 stores four instructions, which make up one long-word instruction, in its four internal instruction slots (hereafter referred to as ‘slots’). Here, the instruction in each of the first and second slots is either an integer calculating instruction or a data transfer instruction (also referred to as a load/store instruction). The instruction in the third slot is a floating-point calculating instruction or an integer calculating instruction and that in the fourth slot is a branch instruction. The arrangement of instructions in one long-word instruction is performed in advance by a compiler.
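The slot restrictions just described can be illustrated with a minimal sketch of the check a compiler might apply when arranging a long-word instruction. The class labels ("int", "transfer", "float", "branch") and the function name are hypothetical and are not taken from the cited document.

```python
# Instruction classes permitted in each of the four slots, as described above.
# The labels "int", "transfer", "float" and "branch" are illustrative only.
ALLOWED = {
    1: {"int", "transfer"},
    2: {"int", "transfer"},
    3: {"float", "int"},
    4: {"branch"},
}


def is_valid_long_word(classes):
    """Return True if the given instruction classes fit slots 1 to 4 in order."""
    if len(classes) != 4:
        return False
    return all(cls in ALLOWED[slot] for slot, cls in enumerate(classes, start=1))
```

For example, `is_valid_long_word(["int", "transfer", "float", "branch"])` is accepted under these assumptions, whereas an arrangement that places a floating-point calculating instruction in the first slot is rejected.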
The transfer unit 5 and the integer calculation unit 6 are aligned with the first slot, and execute the data transfer and integer calculating instructions respectively.
The transfer unit 7 and the integer calculation unit 8 are aligned with the second slot, and execute the data transfer and integer calculating instructions respectively.
The integer calculation unit 9 and the floating-point unit 10 are aligned with the third slot, and execute the integer calculation and floating-point instructions respectively.
The branch unit 11 is aligned with the fourth slot and executes branch instructions.
Here, the transfer units 5 and 7, the integer calculation units 6, 8 and 9, the floating-point unit 10 and the branch unit 11 are generally referred to as functional units.
The input switching circuit 4 inputs source data read from the register file 1 or the external memory 2 into the required functional units.
The output switching circuit 12 outputs the results of calculations by the utilized functional units to the register file or external memory 13.
A processor constructed as above decodes and executes instructions stored in the four slots in parallel. Assume, for example, that an ‘add’ instruction for adding register data is stored in the first slot. The processor inputs two pieces of register data from the register file 1 into the integer calculation unit 6 via the input switching circuit 4. The two pieces of register data are then added by the integer calculation unit 6, and the result is stored in the register file 13 via the output switching circuit 12. Instructions in the second, third and fourth slots are also decoded and executed in parallel with this instruction.
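The data path of the ‘add’ example can be sketched as follows. This is only a behavioral illustration: the register names are hypothetical, the register file is modeled as a plain dictionary, and word width, overflow and timing are not modeled.

```python
def execute_add(registers, src_a, src_b, dest):
    """Model an 'add' instruction held in the first slot."""
    # Input switching circuit 4: route two source operands read from the
    # register file to integer calculation unit 6.
    operand_a = registers[src_a]
    operand_b = registers[src_b]

    # Integer calculation unit 6: perform the addition.
    result = operand_a + operand_b

    # Output switching circuit 12: write the result back to the register file.
    updated = dict(registers)
    updated[dest] = result
    return updated
```

Calling `execute_add({"r0": 2, "r1": 3, "r2": 0}, "r0", "r1", "r2")` returns a copy of the register contents in which the hypothetical register `r2` holds the sum of `r0` and `r1`.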
However, in this kind of conventional processor, certain functional units are left idle whenever instructions are executed. When an integer calculating instruction is executed from the third slot, for example, the floating-point unit 10 is left idle.
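The extent of this idling can be made concrete with a small sketch that lists the functional units left unused by a given long-word instruction. The slot-to-unit table follows the description of FIG. 1 above; the class labels and the function name are hypothetical.

```python
# Functional units aligned with each slot, keyed by the instruction class
# each unit executes (reference numerals as in FIG. 1).
SLOT_UNITS = {
    1: {"transfer": "transfer unit 5", "int": "integer calculation unit 6"},
    2: {"transfer": "transfer unit 7", "int": "integer calculation unit 8"},
    3: {"int": "integer calculation unit 9", "float": "floating-point unit 10"},
    4: {"branch": "branch unit 11"},
}


def idle_units(long_word):
    """Return the units not used by the four instruction classes in long_word."""
    idle = []
    for slot, cls in enumerate(long_word, start=1):
        for unit_cls, unit in SLOT_UNITS[slot].items():
            if unit_cls != cls:
                idle.append(unit)
    return idle
```

Because each slot supplies at most one instruction per cycle while seven functional units are provided, at least three units are idle in every cycle under this model, whichever instructions the compiler arranges.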