This application is based on application No. H9-235144 filed in Japan, the content of which is hereby incorporated by reference.
1. Field of the Invention
The present invention relates to a program conversion apparatus that generates executable code for a VLIW processor by translating, linking, and editing a source program written in a high-level language, and to a recording medium. In particular, the invention relates to a technique for dividing instructions including constants in a source program into parts and executing parallel scheduling with the divided instructions.
2. Related Art
VLIW (Very Long Instruction Word) processors include a plurality of operation units which execute a plurality of operations arranged in each VLIW in parallel. VLIWs are generated by program conversion apparatuses, namely compilers, which detect parallelism in source programs at an operation level and perform scheduling of the source programs.
VLIWs are, however, fixed-length instructions and therefore are inefficient as code. That is, in many cases, it is necessary to insert redundant codes, such as no-operation codes (“nop” codes), into VLIWs. VLIW processors avoiding the occurrence of redundant areas in VLIWs are disclosed by Japanese Patent Applications H09-159058 and H9-159059 of the same applicant as this application.
Each of these VLIW processors includes a specialized constant buffer and a function for executing a program in which a constant included in an instruction is extracted, either whole or divided into several groups of digits, and arranged in different VLIWs. In this specification, the term “divided constants” describes these divided parts of a constant or, on occasion, entire constants. Each VLIW processor executes this program by accumulating divided constants in the constant buffer (in a digit direction) to reconstruct the original constant and using the reconstructed original constant as a branch destination or an operand. Note that a VLIW processor having this function is hereinafter referred to as a “constant reconstructing VLIW processor”. A compiler for the constant reconstructing VLIW processor divides long constants in a program into divided constants and fills redundant areas in instructions with the divided constants, thereby improving the code efficiency of the program.
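The accumulation of divided constants in the digit direction can be illustrated with a minimal sketch. The class `ConstantBuffer`, the function `split_constant`, the 32-bit buffer width, and the behavior of clearing the buffer on use are all assumptions made for illustration; they are not taken from the cited applications.

```python
class ConstantBuffer:
    """Toy model of a constant buffer (hypothetical, for illustration only)."""

    def __init__(self, width_bits=32):
        self.width_bits = width_bits
        self.value = 0
        self.empty = True

    def store(self, part, part_bits):
        # Shift the digits already held upward and append the new part below them.
        mask = (1 << self.width_bits) - 1
        self.value = ((self.value << part_bits) | part) & mask
        self.empty = False

    def use(self):
        # Assumption: reading the buffer yields the reconstructed constant
        # and clears the buffer for the next constant.
        value = self.value
        self.value = 0
        self.empty = True
        return value


def split_constant(value, total_bits, part_bits):
    """Split value into divided constants, most significant part first."""
    assert total_bits % part_bits == 0
    count = total_bits // part_bits
    mask = (1 << part_bits) - 1
    return [(value >> (part_bits * (count - 1 - i))) & mask for i in range(count)]


# Usage: a 32-bit constant is split into two 16-bit divided constants,
# which may sit in different VLIWs, and is rebuilt in the buffer.
parts = split_constant(0x12345678, 32, 16)
buffer = ConstantBuffer()
for part in parts:
    buffer.store(part, 16)
reconstructed = buffer.use()
```

In this sketch the parts must be stored in most-significant-first order for the original constant to be reconstructed, which is why the order of the divided constants across VLIWs matters to the compiler.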
However, a compiler has not yet been proposed which is suitable for the constant reconstructing VLIW processor.
Such a compiler needs to divide long constants in a program into divided constants and to arrange the divided constants appropriately in a plurality of VLIWs, thereby generating executable code with fewer redundant areas in instructions. The compiler must also ensure that each original constant is correctly reconstructed from the divided constants arranged in the plurality of VLIWs and is reliably used by the intended instruction.
In view of the stated problems, the object of the present invention is to provide a compiler used for constant reconstructing VLIW processors and to provide executable code suitable for the constant reconstructing VLIW processors.
To achieve the above object, the compiler of the present invention converts an instruction sequence composed of serially arranged instructions into a VLIW sequence for a processor. The compiler includes: a division step for dividing each instruction including a constant in the instruction sequence into a plurality of divided instructions; an analysis step for analyzing dependence relations between each instruction in the instruction sequence including divided instructions generated in the division step according to an execution order of each instruction in the instruction sequence; and a relocation step for relocating instructions in the instruction sequence in compliance with the analyzed dependence relations to generate VLIWs which are each composed of a plurality of instructions that are executable in parallel.
With the stated steps, each instruction including a constant in a source program is divided into at least two shorter instructions and parallel scheduling is performed using the shorter instructions so that a compiler suitable for the constant reconstructing VLIW processor can be realized. That is, the generation of redundant areas in VLIWs is suppressed.
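The division, analysis, and relocation steps can be sketched as follows. This is a simplified model under stated assumptions, not the claimed implementation: the `Instr` record, the mnemonic `setc`, the resource name `cbuf` standing for the constant buffer, the two-slot VLIW, and the greedy placement rule are all hypothetical. Dependent instructions are always placed in a later VLIW than the instructions they depend on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    text: str
    reads: frozenset
    writes: frozenset
    const: object = None  # constant carried by the instruction, if any


def divide(instrs):
    """Division step: split each constant-carrying instruction into set + use."""
    out = []
    for ins in instrs:
        if ins.const is None:
            out.append(ins)
            continue
        cbuf = frozenset({"cbuf"})
        out.append(Instr(f"setc {ins.const:#x}", cbuf, cbuf))
        out.append(Instr(f"{ins.text} (from cbuf)", ins.reads | cbuf, ins.writes | cbuf))
    return out


def analyze(instrs):
    """Analysis step: record, per instruction, the earlier ones it depends on."""
    deps = {j: set() for j in range(len(instrs))}
    for j, later in enumerate(instrs):
        for i in range(j):
            earlier = instrs[i]
            if (earlier.writes & later.reads      # read after write
                    or earlier.reads & later.writes   # write after read
                    or earlier.writes & later.writes):  # write after write
                deps[j].add(i)
    return deps


def relocate(instrs, deps, slots=2):
    """Relocation step: greedy placement into VLIWs with `slots` fields each."""
    vliws, placed = [], {}
    for j, ins in enumerate(instrs):
        k = max((placed[i] + 1 for i in deps[j]), default=0)
        while k < len(vliws) and len(vliws[k]) >= slots:
            k += 1
        while len(vliws) <= k:
            vliws.append([])
        vliws[k].append(ins.text)
        placed[j] = k
    return vliws


program = [
    Instr("add r1, r2", frozenset({"r1", "r2"}), frozenset({"r2"})),
    Instr("mov r3", frozenset(), frozenset({"r3"}), const=0x12345678),
    Instr("sub r2, r3", frozenset({"r2", "r3"}), frozenset({"r3"})),
]
divided = divide(program)
schedule = relocate(divided, analyze(divided))
```

In this example the set instruction has no dependence on `add r1, r2`, so it fills the second field of the first VLIW instead of leaving it empty, which is the effect the stated steps aim at.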
Here, the division step may include: an instruction size judgement substep for performing an instruction size judgement as to whether a size of an instruction including a constant is equal to or smaller than a size of each unit operation field in a VLIW; and a division substep which, when the size of the instruction including the constant is judged to be greater than the size of each unit operation field, divides the instruction including the constant into a plurality of divided instructions whose sizes are each equal to or smaller than the size of each unit operation field.
With the stated steps, only instructions whose sizes are greater than the operation fields of the object VLIWs are divided and are subjected to the parallel scheduling. Therefore, even when a source program includes instructions whose sizes bear no relation to the operation fields of the object VLIWs, the division process is performed only on instructions which should be divided, reducing the compiling time.
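The instruction size judgement and the resulting division can be sketched in terms of sizes alone. The function names and the bit counts used are assumptions for illustration, and the sketch ignores any operation code overhead that each divided instruction would carry in practice.

```python
def needs_division(instr_bits, field_bits):
    """Instruction size judgement: True when the instruction exceeds one field."""
    return instr_bits > field_bits


def divided_sizes(instr_bits, field_bits):
    """Return the sizes of the instructions that result from the division substep."""
    if not needs_division(instr_bits, field_bits):
        return [instr_bits]  # left undivided
    sizes = []
    remaining = instr_bits
    while remaining > 0:
        piece = min(field_bits, remaining)
        sizes.append(piece)
        remaining -= piece
    return sizes


small = divided_sizes(24, 32)   # fits in one field, so it is not divided
large = divided_sizes(72, 32)   # divided into pieces no larger than one field
```

Every size in the returned list is equal to or smaller than the unit operation field, so each divided instruction can occupy one field.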
Here, in the division substep, the instruction including the constant may be divided into one or more instructions for storing the constant into a storage buffer of the processor and an instruction for using the stored constant.
With the stated process, all constants in instructions are stored in a constant buffer. As a result, instructions including constants do not need to include the constants as operands so that a compiler suitable for VLIWs having small operation fields for specifying only operation codes can be realized.
Here, in the division substep, the instruction including the constant may be divided into one or more instructions for respectively storing one or more divided constants into the storage buffer of the processor and an instruction for using the stored divided constants, where the divided constants are obtained by dividing the constant.
With the stated process, only divided constants exceeding the size of constant areas in object VLIWs are pre-stored in the constant buffer and the following instructions use the divided constants in the constant buffer. As a result, a compiler suitable for VLIWs having operation fields for specifying short operands can be realized.
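A minimal sketch of this form of division follows. It assumes that the low-order digits of the constant stay in the short operand of the using instruction and that only the excess upper digits are pre-stored; the mnemonics `setc` and the `.cbuf` suffix, the tuple layout, and the function name are hypothetical.

```python
def divide_constant_instruction(opcode, dest, const, const_bits, operand_bits):
    """Divide 'opcode const, dest' into set instructions plus one use instruction.

    Each returned tuple is (mnemonic, constant part, destination or None).
    """
    mask = (1 << operand_bits) - 1
    if const_bits <= operand_bits:
        # The constant fits the short operand, so no division is needed.
        return [(opcode, const, dest)]
    low = const & mask                 # stays in the using instruction
    upper = const >> operand_bits      # excess digits go to the buffer
    upper_bits = const_bits - operand_bits
    parts = []
    while upper_bits > 0:
        parts.append(upper & mask)
        upper >>= operand_bits
        upper_bits -= operand_bits
    parts.reverse()                    # most significant part is stored first
    result = [("setc", part, None) for part in parts]
    result.append((opcode + ".cbuf", low, dest))
    return result


two_way = divide_constant_instruction("mov", "r3", 0x12345678, 32, 16)
four_way = divide_constant_instruction("mov", "r3", 0x12345678, 32, 8)
```

With a 16-bit operand the instruction becomes one set instruction and one use instruction; with an 8-bit operand it becomes three set instructions and one use instruction.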
Here, the compiler may further include a combination step which, when two or more divided instructions generated from a same instruction including a constant in the division substep are arranged in a same VLIW in the relocation step, combines the two or more divided instructions into one instruction.
With the stated step, the inconvenient situation can be precluded where an instruction which should remain a single instruction (an instruction which should not be divided) is divided into two or more instructions that are arranged in different operation fields of a VLIW and executed, so that the execution speed is reduced. Also, the pairing of divided constant set instructions with inappropriate divided constant use instructions can be prevented.
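The combination step can be sketched as below. The `Piece` record, its `origin` identifier, and the rule that pieces of one instruction appear in a VLIW in most-significant-first order are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Piece:
    origin: int     # id of the instruction this piece was divided from
    op: str         # "setc" for a set instruction, otherwise the using operation
    value: int      # constant part carried by this piece
    bits: int       # width of that constant part
    dest: str = ""  # destination of the using instruction, if any


def combine_in_vliw(vliw):
    """Merge pieces that share an origin and landed in the same VLIW."""
    merged, order = {}, []
    for piece in vliw:
        if piece.origin not in merged:
            merged[piece.origin] = piece
            order.append(piece.origin)
            continue
        first = merged[piece.origin]
        merged[piece.origin] = Piece(
            piece.origin,
            piece.op if piece.op != "setc" else first.op,
            (first.value << piece.bits) | piece.value,  # concatenate the digits
            first.bits + piece.bits,
            piece.dest or first.dest,
        )
    return [merged[origin] for origin in order]


vliw = [
    Piece(7, "setc", 0x1234, 16),
    Piece(3, "add", 5, 8, "r1"),
    Piece(7, "mov", 0x5678, 16, "r3"),
]
combined = combine_in_vliw(vliw)
```

Here the set instruction and the use instruction divided from instruction 7 fall in the same VLIW and are merged back into a single `mov` carrying the whole constant, while the unrelated `add` is left as it is.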
Here, in the instruction size judgement substep, when the final size of the constant has not been determined, the instruction size judgement may be performed using an assumed size for the constant. The compiler may further include: a constant size determination step for linking a plurality of VLIW sequences and determining a final size of each constant; and an insertion step which, when the final size is greater than the assumed size, generates an instruction for storing into the storage buffer a divided constant corresponding to a difference between the final size and the assumed size and inserts the generated instruction into a corresponding VLIW sequence.
With the stated steps, inconsistency during the division and link processes due to label sizes which have not been determined during compiling and assembling can be avoided. Therefore, a compiler suitable for program development which links object modules generated in a plurality of compile units can be realized.
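The insertion step can be sketched as follows. The sketch assumes that the missing upper digits are stored by a new VLIW placed immediately before the first VLIW that holds a piece of the constant; the function name, the `setc` mnemonic, and the textual VLIW representation are hypothetical, and the actual insertion position is not specified here.

```python
def insert_for_final_size(sequence, first_piece_index, final_value,
                          final_bits, assumed_bits):
    """Return a VLIW sequence patched for a constant larger than assumed.

    sequence          -- list of VLIWs, each a list of instruction strings
    first_piece_index -- index of the first VLIW holding a piece of the constant
    """
    if final_bits <= assumed_bits:
        return [list(v) for v in sequence]  # the assumption held; nothing to insert
    # Digits beyond the assumed size, i.e. the difference between the sizes.
    extra = final_value >> assumed_bits
    patched = [list(v) for v in sequence]
    patched.insert(first_piece_index, [f"setc {extra:#x}"])
    return patched


# A label was assumed to fit in 16 bits at compile time, but linking
# resolved it to a 32-bit address.
before = [["add r1, r2"], ["setc 0x5678"], ["jmp (from cbuf)"]]
after = insert_for_final_size(before, 1, 0x12345678, 32, 16)
```

The original sequence is left unmodified, and the patched sequence gains one VLIW that stores the upper digits before the digits already scheduled.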
Here, in the instruction size judgement substep, when the final size has not been determined, the assumed size may be set to the maximum address size or constant size manageable by the processor or to the most commonly used address size or constant size.
With the stated process, inconsistency due to the assumed sizes can be avoided so that the generation of VLIWs including no-operation codes can be suppressed.
Here, the compiler may re-execute the division step after the constant size determination step, where in the instruction size judgement substep in the re-executed division step, the instruction size judgement is performed in consideration of the final size determined in the constant size determination step.
With the stated process, the final label size is taken into account during the division of a constant, so that the instruction insertion does not need to be performed and executable code with reduced code size and execution time can be generated.
Here, the compiler may re-execute the analysis step and the relocation step following the re-executed division step.
With the stated process, each constant is divided appropriately and the optimization by the parallel scheduling is repeated, so that executable code of higher code efficiency can be generated.
Here, the executable code of the present invention is a VLIW sequence for a processor which executes a plurality of instructions in parallel, where a VLIW in the VLIW sequence includes a constant to be stored into a storage buffer of the processor implicitly indicated by at least one VLIW in the VLIW sequence, and another VLIW, which follows the VLIW and is the first to refer to the storage buffer after the VLIW, includes an instruction for using the constant in the storage buffer.
In the stated code, each constant and each instruction using a constant are respectively divided into at least two shorter constants and instructions, are arranged in VLIWs, and are scheduled to be reconstructed by the constant reconstructing processor. Therefore, executable code suitable for the constant reconstructing VLIW processor, namely executable code of high code efficiency where the redundant areas in VLIWs are suppressed, can be provided.