1. Field of the Invention
Recent improvements in the processing capacity of computer systems have mainly been realized by improving the processing capacity of processors, i.e., Central Processing Units (CPUs). The processing capacity of a processor is improved, for example, by increasing Instruction Level Parallelism (ILP). Technologies such as the superscalar architecture and the Very Long Instruction Word (VLIW) architecture are known methods of increasing ILP.
Various microprocessors are provided in electronic devices such as cellular phones, printers, and digital televisions. Such devices are called embedded application devices, and such microprocessors are called embedded processors.
Recent demands for high-performance embedded application devices require the performance of embedded processors to be improved year by year. Some embedded processors have improved their performance by increasing ILP.
In the development of an embedded application device, however, both the cost and the power consumption of the device must be given high priority. A processor designed with the superscalar architecture usually requires a larger chip and consumes more electric power than a processor designed with the VLIW architecture. In contrast, a processor based on VLIW requires a larger program, and consequently a larger memory device to store the program, since “no operation” (NOP) instructions must be inserted into instruction slots to which no other instruction is assigned.
Processors designed with the variable length VLIW architecture solve this problem, as described in the specification of Japanese patent application No. 1999-281957 dated Oct. 1, 1999.
The present invention generally relates to computer programs, and more particularly, to an algorithm for verifying an arrangement of basic VLIW instructions in a language processing system used for a processor designed with the variable length VLIW architecture.
2. Description of the Related Art
FIG. 1 shows a configuration of a conventional processor based on the very long instruction word architecture. This processor will be referred to as VLIW processor.
(Architecture)
The conventional processor shown in FIG. 1 includes a memory 10, an instruction read unit 11, an instruction register 12, integer units IU0 and IU1, floating units FU0 and FU1, branch units BU0 and BU1, a general purpose register GR, a floating register FR, and a program counter PC.
The instruction read unit 11 reads the memory area storing the VLIW instruction addressed by the address stored in the program counter PC, and writes the VLIW instruction to the instruction register 12. The instruction read unit 11 also increases the address stored in the program counter PC by a number corresponding to one VLIW instruction.
The instruction register 12 stores the VLIW instruction written by the instruction read unit 11. The instruction register 12 provides the instruction to IU, FU, and BU as follows:
A basic instruction stored in an instruction slot 0 is provided to IU0. Basic instructions stored in instruction slots 1, 2, 3, 4, and 5 are provided to FU0, IU1, FU1, BU0, and BU1, respectively.
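The fetch and routing behavior described above can be sketched as follows. This is a minimal illustration, not the processor's actual implementation: the names `fetch`, `dispatch`, and `SLOT_TO_UNIT` are hypothetical, and the 24-byte VLIW instruction size (six 4-byte basic instructions) is an assumption, since the text only says the program counter advances "by a number corresponding to a VLIW instruction".

```python
# Slot-to-unit wiring as described for FIG. 1 (slot index -> functional unit).
SLOT_TO_UNIT = ("IU0", "FU0", "IU1", "FU1", "BU0", "BU1")

# Assumption: six 4-byte basic instructions per VLIW instruction.
VLIW_BYTES = 24


def fetch(memory, pc):
    """Read the VLIW instruction addressed by pc and advance the PC."""
    instruction_register = memory[pc]
    return instruction_register, pc + VLIW_BYTES


def dispatch(instruction_register):
    """Pair each basic instruction with the unit wired to its slot."""
    return dict(zip(SLOT_TO_UNIT, instruction_register))


# Hypothetical memory image holding one VLIW instruction at address 0.
memory = {0: ("ADD", "FADD", "ADD", "FADD", "NOP", "NOP")}
word, next_pc = fetch(memory, 0)
routing = dispatch(word)
```

Because the wiring is fixed, the position of a basic instruction inside the VLIW instruction alone decides which functional unit receives it.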
IU0 and IU1 perform an integer arithmetic instruction, an integer load instruction, an integer store instruction, a floating point load instruction, a floating point store instruction, and a “no operation” instruction.
When an integer arithmetic instruction is provided, the integer units retrieve input operand data from the general purpose register GR, and store output operand data, the result of the integer arithmetic, to the general purpose register GR.
When an integer load instruction is provided, the integer units IU0 and IU1 retrieve input operand data from a register, and calculate an effective address. Then, the integer units retrieve data from a memory area corresponding to the effective address, and store the data to the general purpose register GR.
When an integer store instruction is provided, the integer units retrieve input operand data from the general purpose register GR, and calculate an effective address. Then, the integer units store “store data” to a memory area corresponding to the effective address.
When a floating point load instruction is provided, the integer units retrieve input operand data from a register, and calculate an effective address. Then, the integer units retrieve data stored in a memory area corresponding to the effective address, and store the data to the floating register FR.
When a floating point store instruction is provided, the integer units retrieve input operand data from the floating register FR, and calculate an effective address. Then, the integer units store “store data” to a memory area corresponding to the effective address.
When a “no operation” instruction is provided, the integer units perform nothing.
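The six instruction kinds handled by an integer unit can be sketched as a single dispatch function. This is only an illustrative model: the mnemonics `LD`, `ST`, `FLD`, and `FST`, the tuple operand layout, and the "base register plus offset" effective address are assumptions that do not appear in the text.

```python
def integer_unit(op, gr, fr, mem):
    """Execute one basic instruction on an integer unit (illustrative model)."""
    kind = op[0]
    if kind == "ADD":        # integer arithmetic: GR operands -> GR result
        _, dst, a, b = op
        gr[dst] = gr[a] + gr[b]
    elif kind == "LD":       # integer load: memory -> GR
        _, dst, base, offset = op
        gr[dst] = mem[gr[base] + offset]
    elif kind == "ST":       # integer store: GR -> memory
        _, src, base, offset = op
        mem[gr[base] + offset] = gr[src]
    elif kind == "FLD":      # floating point load: memory -> FR
        _, dst, base, offset = op
        fr[dst] = mem[gr[base] + offset]
    elif kind == "FST":      # floating point store: FR -> memory
        _, src, base, offset = op
        mem[gr[base] + offset] = fr[src]
    elif kind == "NOP":      # "no operation": nothing is performed
        pass
    else:
        raise ValueError(f"not an integer-unit instruction: {kind}")


# Hypothetical register and memory contents.
gr = [0, 100, 7, 0]
fr = [0.0, 0.0]
mem = {104: 42, 108: 2.5}

integer_unit(("LD", 3, 1, 4), gr, fr, mem)    # gr[3] = mem[100 + 4]
integer_unit(("ADD", 0, 2, 3), gr, fr, mem)   # gr[0] = gr[2] + gr[3]
integer_unit(("ST", 0, 1, 12), gr, fr, mem)   # mem[100 + 12] = gr[0]
integer_unit(("FLD", 1, 1, 8), gr, fr, mem)   # fr[1] = mem[100 + 8]
integer_unit(("FST", 1, 1, 16), gr, fr, mem)  # mem[100 + 16] = fr[1]
```

Note that the floating point load and store instructions are handled here rather than in a floating unit, because the address calculation uses the integer data path.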
The floating units FU0 and FU1 perform a floating point arithmetic instruction and a “no operation” instruction. When a floating point arithmetic instruction is provided, the floating units retrieve input operand data from a floating register FR, and perform floating point arithmetic. Then, the floating units store output operand data, a result of the arithmetic, to a floating register FR. When a “no operation” instruction is provided, the floating units perform nothing.
The branch units BU0 and BU1 perform an unconditional branch instruction, a conditional branch instruction, and a “no operation” instruction. When an unconditional branch instruction is provided, the branch units retrieve input operand data from registers (GR, PC), calculate an address, and then store the address in the program counter PC. When a conditional branch instruction is provided, the branch units check whether a branch condition is met. If the branch condition is met, the branch units retrieve input operand data from a register (GR, PC), and calculate an address using the input operand data. The branch units further store the result, i.e., the address of the destination of the branch, in the program counter PC. When a “no operation” instruction is provided, the branch units perform nothing.
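The way a branch unit updates the program counter can be sketched as below. The mnemonics `BRA`, `JMP`, and `BEQZ`, their operand layouts, and the particular branch condition (register equals zero) are hypothetical; the text only states that the target address is calculated from GR and PC operands.

```python
def branch_unit(op, gr, pc):
    """Return the program counter value after a branch unit executes op."""
    kind = op[0]
    if kind == "NOP":        # "no operation": the PC is left unchanged
        return pc
    if kind == "BRA":        # unconditional, PC-relative target (assumed form)
        _, offset = op
        return pc + offset
    if kind == "JMP":        # unconditional, target held in a GR (assumed form)
        _, reg = op
        return gr[reg]
    if kind == "BEQZ":       # conditional: taken only when gr[reg] is zero
        _, reg, offset = op
        return pc + offset if gr[reg] == 0 else pc
    raise ValueError(f"not a branch-unit instruction: {kind}")


# Hypothetical general purpose register contents.
gr = [0, 5, 240]
taken = branch_unit(("BEQZ", 0, -24), gr, 96)      # condition met
not_taken = branch_unit(("BEQZ", 1, -24), gr, 96)  # condition not met
```

In this model a conditional branch whose condition is not met behaves like a “no operation” instruction with respect to the program counter.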
IU, FU, and BU are, hereinafter, called functional units. A functional unit performs a basic instruction provided by an instruction register.
(Operation)
Operations of a VLIW processor will be described here.
A process in which a VLIW processor shown in FIG. 1 executes a program shown in FIG. 2, for example, will be described with reference to FIG. 3. In these figures, “ADD” is an integer arithmetic instruction meaning an addition, “FADD” is a floating point arithmetic instruction meaning an addition, and “NOP” is a “no operation” instruction.
(Time 1)
(A) A VLIW instruction 1 is stored in a memory area of the memory 10 as shown in FIG. 2. Using the instruction address stored in PC, the instruction read unit 11 retrieves the VLIW instruction 1 from the memory 10, and stores the VLIW instruction 1 in the instruction register 12. The basic instructions included in the VLIW instruction 1 are stored in the instruction slots indicated as Time 1 in FIG. 3.
(B) The functional units execute the instructions provided. An “ADD” instruction stored in the instruction slot 0 is executed by IU0. A “FADD” instruction stored in the instruction slot 1 is executed by FU0. An “ADD” instruction stored in the instruction slot 2 is executed by IU1. A “FADD” instruction stored in the instruction slot 3 is executed by FU1. A “NOP” instruction stored in the instruction slot 4 is executed by BU0. Another “NOP” instruction stored in the instruction slot 5 is executed by BU1.
The execution of the VLIW instruction 1 finishes when the last of its basic instructions has been executed by a functional unit.
(Time 2)
(A) A VLIW instruction 2 is stored in a memory area of the memory 10 as shown in FIG. 2. Using the instruction address stored in PC, the instruction read unit 11 retrieves the VLIW instruction 2 from the memory 10, and stores the VLIW instruction 2 in the instruction register 12. The basic instructions included in the VLIW instruction 2 are stored in the instruction slots indicated as Time 2 in FIG. 3.
(B) The functional units execute the instructions provided. An “ADD” instruction stored in the instruction slot 0 is executed by IU0. A “NOP” instruction stored in the instruction slot 1 is executed by FU0. A “NOP” instruction stored in the instruction slot 2 is executed by IU1. A “NOP” instruction stored in the instruction slot 3 is executed by FU1. A “NOP” instruction stored in the instruction slot 4 is executed by BU0. Another “NOP” instruction stored in the instruction slot 5 is executed by BU1.
The execution of the VLIW instruction 2 finishes when the last of its basic instructions has been executed by a functional unit.
(Time 3)
(A) A VLIW instruction 3 is stored in a memory area of the memory 10 as shown in FIG. 2. Using the instruction address stored in PC, the instruction read unit 11 retrieves the VLIW instruction 3 from the memory 10, and stores the VLIW instruction 3 in the instruction register 12. The basic instructions included in the VLIW instruction 3 are stored in the instruction slots indicated as Time 3 in FIG. 3.
(B) The functional units execute the instructions provided. A “NOP” instruction stored in the instruction slot 0 is executed by IU0. A “FADD” instruction stored in the instruction slot 1 is executed by FU0. A “NOP” instruction stored in the instruction slot 2 is executed by IU1. A “NOP” instruction stored in the instruction slot 3 is executed by FU1. A “NOP” instruction stored in the instruction slot 4 is executed by BU0. Another “NOP” instruction stored in the instruction slot 5 is executed by BU1.
The execution of the VLIW instruction 3 finishes when all of its basic instructions have been executed by the functional units.
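The three time steps walked through above can be replayed with a short script. The slot contents are taken from the description of Times 1 to 3; the names `PROGRAM`, `UNITS`, and `trace` are hypothetical. The script also counts how many instruction slots hold a “no operation” instruction, which illustrates the program size overhead of a fixed length VLIW instruction.

```python
# Slot order follows FIG. 1: IU0, FU0, IU1, FU1, BU0, BU1.
UNITS = ("IU0", "FU0", "IU1", "FU1", "BU0", "BU1")

# The three VLIW instructions of the example program, as described above.
PROGRAM = [
    ("ADD", "FADD", "ADD", "FADD", "NOP", "NOP"),  # Time 1
    ("ADD", "NOP", "NOP", "NOP", "NOP", "NOP"),    # Time 2
    ("NOP", "FADD", "NOP", "NOP", "NOP", "NOP"),   # Time 3
]


def trace(program):
    """Return one line per time step showing which unit executes what."""
    lines = []
    for time, word in enumerate(program, start=1):
        issued = ", ".join(f"{unit}:{op}" for unit, op in zip(UNITS, word))
        lines.append(f"Time {time}: {issued}")
    return lines


total_slots = sum(len(word) for word in PROGRAM)
nop_slots = sum(op == "NOP" for word in PROGRAM for op in word)

for line in trace(PROGRAM):
    print(line)
print(f"{nop_slots} of {total_slots} slots hold NOP")
```

For this program, 12 of the 18 instruction slots hold a “no operation” instruction, so only one third of the stored program performs useful work.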
In a VLIW processor, each instruction slot in the instruction register 12, in which a VLIW instruction retrieved by the instruction read unit 11 is stored, corresponds one-to-one to the functional unit that executes the basic instruction in that slot. In other words, since an integer arithmetic instruction, an integer load instruction, an integer store instruction, a floating point load instruction, and a floating point store instruction are executed only by the integer units IU0 and IU1, these instructions must be stored in either the instruction slot 0 or the instruction slot 2.
Since a floating point arithmetic instruction is executed only by FU0 or FU1, this instruction must be stored in either the instruction slot 1 or the instruction slot 3.
Because a conditional branch instruction and an unconditional branch instruction are executed only by the branch units BU0 and BU1, these instructions must be stored in the instruction slot 4 or the instruction slot 5. Due to this constraint, a language processing system for a VLIW processor must verify the correspondence between each basic instruction and its instruction slot. A language processing system for a VLIW processor, such as an assembler or a compiler, includes a VLIW verification step which verifies whether an arrangement of basic instructions is executable by the VLIW processor. Only executable VLIW instructions are stored in the memory 10.
(Assembler)
FIG. 4 is a flow chart of an assembler for a VLIW processor as an example of prior art. The assembler includes a word analysis step S11, an instruction code generation step S12, a VLIW verification step S13, and an object generation step S14.
In the word analysis step S11, source code text is retrieved, from the beginning sequentially, from a source code file of an assembler program, and words and phrases in the retrieved source code text are analyzed. In the instruction code generation step S12, analyzed words and phrases are converted into instruction codes. In the VLIW verification step S13, it is verified whether a VLIW instruction can be provided through an instruction issuance unit to an instruction execution unit of the processor. In the object generation step S14, issuable VLIW instructions are converted into an object format, and written out to an object program file.
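The four assembler steps can be sketched as a small pipeline. This is a toy model under several assumptions: the source syntax (one VLIW instruction per line, mnemonics separated by spaces), the one-byte opcodes in `OPCODES`, and the object format (a flat byte string) are all hypothetical, and `verify` is only a placeholder that tests the slot count rather than the full verification of step S13.

```python
OPCODES = {"NOP": 0, "ADD": 1, "FADD": 2}  # hypothetical one-byte encodings
NUM_SLOTS = 6


def word_analysis(source):
    """S11: read the source text line by line and split it into mnemonics."""
    return [line.split() for line in source.splitlines() if line.strip()]


def generate_codes(words):
    """S12: convert the analyzed mnemonics into instruction codes."""
    return [[OPCODES[mnemonic] for mnemonic in line] for line in words]


def verify(vliw):
    """S13: placeholder check; only the number of slots is tested here."""
    return len(vliw) == NUM_SLOTS


def generate_object(vliws):
    """S14: serialize the verified VLIW instructions into an object image."""
    return bytes(code for vliw in vliws for code in vliw)


def assemble(source):
    """Run steps S11 to S14 in order and return the object image."""
    vliws = generate_codes(word_analysis(source))
    for number, vliw in enumerate(vliws, start=1):
        if not verify(vliw):
            raise ValueError(f"VLIW instruction {number} is not issuable")
    return generate_object(vliws)


SOURCE = """ADD FADD ADD FADD NOP NOP
ADD NOP NOP NOP NOP NOP
"""
obj = assemble(SOURCE)
```

The point of the structure is that object generation is reached only after every VLIW instruction has passed verification, so a non-issuable instruction never reaches the object program file.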
FIG. 5 is a flow chart of the VLIW verification step S13. The VLIW verification step S13 includes an instruction slot configuration verification step S13-1 and a register conflict verification step S13-2.
The instruction slot configuration verification step S13-1 verifies whether each basic instruction of a VLIW instruction is assigned to an instruction slot which can execute the basic instruction. FIG. 6 is a flow chart of the instruction slot configuration verification step S13-1.
The register conflict verification step S13-2 verifies whether two or more basic instructions of a VLIW instruction store data in the same register at the same time. An algorithm used in the instruction slot configuration verification step S13-1 which verifies whether basic instructions of a VLIW instruction are issuable is as follows.
In the step S22, basic instructions are taken out from the VLIW instruction first. In the next step S23, an instruction slot at which a basic instruction is assigned is identified. In the next step S24, an instruction slot at which the basic instruction is executable is checked with reference to an assignable instruction slot table. In the step S25, whether the instruction slot at which the basic instruction is assigned (S23) is one of the instruction slots at which the basic instruction is executable (S24) is checked. The steps S22-S27 are repeated until all instruction slots are checked (Step S21).
FIG. 7 is the assignable instruction slot table which is referred to at the step S24. The assignable instruction slot table indicates, for each basic instruction available for a VLIW processor, which instruction slot is assignable and which is not.
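The two checks of the VLIW verification step S13 can be sketched as follows. The table contents are not reproduced from FIG. 7; they are assumed from the slot wiring described for FIG. 1 (integer units on slots 0 and 2, floating units on slots 1 and 3, branch units on slots 4 and 5). The mnemonics other than `ADD`, `FADD`, and `NOP`, and the representation of a basic instruction as a `(mnemonic, destination register)` pair, are hypothetical.

```python
INTEGER_SLOTS = {0, 2}
FLOAT_SLOTS = {1, 3}
BRANCH_SLOTS = {4, 5}
ALL_SLOTS = {0, 1, 2, 3, 4, 5}

# Assignable instruction slot table: mnemonic -> slots that may hold it.
ASSIGNABLE_SLOTS = {
    "ADD": INTEGER_SLOTS, "LD": INTEGER_SLOTS, "ST": INTEGER_SLOTS,
    "FLD": INTEGER_SLOTS, "FST": INTEGER_SLOTS,
    "FADD": FLOAT_SLOTS,
    "BRA": BRANCH_SLOTS, "BEQZ": BRANCH_SLOTS,
    "NOP": ALL_SLOTS,
}


def verify_slot_configuration(vliw):
    """Step S13-1: return the (slot, mnemonic) pairs that are misplaced."""
    errors = []
    for slot, (mnemonic, _dest) in enumerate(vliw):      # S22, S23
        allowed = ASSIGNABLE_SLOTS.get(mnemonic, set())  # S24: table lookup
        if slot not in allowed:                          # S25: compare
            errors.append((slot, mnemonic))
    return errors


def verify_register_conflict(vliw):
    """Step S13-2: return registers written by two or more instructions."""
    seen, conflicts = set(), set()
    for _mnemonic, dest in vliw:
        if dest is None:          # e.g. NOP writes no register
            continue
        if dest in seen:
            conflicts.add(dest)
        seen.add(dest)
    return conflicts


good = [("ADD", "GR1"), ("FADD", "FR1"), ("ADD", "GR2"),
        ("FADD", "FR2"), ("NOP", None), ("NOP", None)]
bad = [("FADD", "FR1"), ("ADD", "GR1"), ("ADD", "GR1"),
       ("NOP", None), ("NOP", None), ("ADD", "GR3")]
```

Here `good` passes both checks, while `bad` has three misplaced basic instructions (slots 0, 1, and 5) and two writes to `GR1` in the same VLIW instruction.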
(Compiler)
FIG. 8 is a flow chart of a compiler for a VLIW processor as an example of prior art. As shown in the flow chart, the compiler includes a word analysis step S31, a syntax analysis step S32, a semantic analysis step S33, a VLIW formation step S34, and an assembly language description output step S35.
The word analysis step S31 reads source code text sequentially, from the beginning, out of a source code file written in a high level language, and analyzes the words and phrases of the source code text. The syntax analysis step S32 analyzes the logical structure of the program in accordance with syntax rules. The semantic analysis step S33 analyzes the meaning of each component of the program, and converts the source code into intermediate language codes. The VLIW formation step S34 converts the intermediate language codes into VLIW instructions, and is identical to the VLIW verification step S13 of the assembler. The assembly language description output step S35 outputs the VLIW instructions expressed in the assembly language.
FIG. 9 is a flow chart of the VLIW formation step S34 of the compiler. The VLIW formation step S34 uses the following algorithm. Step S41 checks whether a basic instruction can be taken out of the intermediate language expression. If YES, step S42 follows; if NO, step S48 is performed. In step S42, a basic instruction is taken out. Step S43 checks whether the basic instruction can be assigned to an instruction assignment table. If YES, step S45 follows; if NO, step S46 is performed.
Step S45 assigns the basic instruction to the instruction assignment table, and step S42 follows. If the result of step S44 is NO, step S46 outputs the set of basic instructions stored in the instruction assignment table as a VLIW instruction. Step S47 then clears the instruction assignment table, and step S43 follows.
If the result of step S41 is NO, the set of basic instructions stored in the instruction assignment table is output as a VLIW instruction.
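The VLIW formation loop can be sketched as a greedy packer. This is a simplified reading of the flow chart, not a reproduction of it: "can be assigned" is interpreted here as "a free slot of a suitable kind exists", data dependencies between basic instructions are ignored, empty slots are filled with NOP on output, and the slot table contents are assumed from the wiring of FIG. 1. The names `form_vliw`, `find_free_slot`, and `flush` are hypothetical.

```python
NUM_SLOTS = 6

# Assumed assignable slots per mnemonic, in the order they are tried.
ASSIGNABLE_SLOTS = {"ADD": (0, 2), "FADD": (1, 3), "BRA": (4, 5)}


def find_free_slot(table, mnemonic):
    """Return the first free slot that may hold mnemonic, or None."""
    for slot in ASSIGNABLE_SLOTS[mnemonic]:
        if table[slot] is None:
            return slot
    return None


def flush(table):
    """Turn the assignment table into a VLIW instruction, padding with NOP."""
    return tuple(entry or "NOP" for entry in table)


def form_vliw(basic_instructions):
    """Pack basic instructions, in order, into VLIW instructions."""
    output = []
    table = [None] * NUM_SLOTS
    for mnemonic in basic_instructions:         # S41, S42: take out next
        slot = find_free_slot(table, mnemonic)  # S43: assignable?
        if slot is None:
            output.append(flush(table))         # S46: output current table
            table = [None] * NUM_SLOTS          # S47: clear the table
            slot = find_free_slot(table, mnemonic)
        table[slot] = mnemonic                  # S45: assign
    if any(entry is not None for entry in table):
        output.append(flush(table))             # S48: output the remainder
    return output


packed = form_vliw(["ADD", "FADD", "ADD", "FADD", "ADD", "FADD"])
```

With six alternating `ADD` and `FADD` instructions, the fifth instruction finds both integer slots occupied, so the first VLIW instruction is output and a second one is started.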
In an embedded processor based on the variable length very long instruction word architecture described in Japanese patent application No. 1999-281957 dated Oct. 1, 1999, the instruction slots, which are the elements of a VLIW instruction, and the functional units have either a 1-to-many relationship or a many-to-many relationship. Accordingly, a language processing system must verify whether a set of basic instructions forming a VLIW instruction is executable by the processor.
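To illustrate why a per-slot table lookup no longer suffices under such a relationship, consider the following sketch. It is not the method of the cited application; it assumes a hypothetical configuration in which any instruction slot can feed any functional unit of the matching kind, with two units of each kind, so that executability depends on the set of basic instructions as a whole rather than on their positions. All names are hypothetical.

```python
from collections import Counter

# Hypothetical mapping from mnemonic to the kind of unit that executes it.
UNIT_KIND = {"ADD": "IU", "LD": "IU", "FADD": "FU", "BRA": "BU"}

# Assumed number of functional units of each kind.
UNITS_AVAILABLE = {"IU": 2, "FU": 2, "BU": 2}


def is_executable(basic_instructions):
    """Check that no kind of unit is asked to execute more than it has."""
    demand = Counter(UNIT_KIND[m] for m in basic_instructions)
    return all(demand[kind] <= UNITS_AVAILABLE[kind] for kind in demand)


ok = is_executable(["FADD", "ADD", "ADD"])     # order does not matter
too_many = is_executable(["ADD", "ADD", "ADD"])  # three integer operations
```

In this model `ok` is true and `too_many` is false: the same mnemonic may or may not be acceptable depending on what else the VLIW instruction contains.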
Since embedded processors are used in a wide range of applications, the performance requirements for embedded processors vary widely. The variable length VLIW architecture described in the Japanese patent application mentioned above realizes processors with different instruction lengths, and thereby satisfies such requirements. Processors with a short instruction length are applicable to low performance applications, and processors with a long instruction length are applicable to high performance applications. It should be noted, however, that making a separate language processing system for each processor having a different instruction length is not economical.