1. Field of the Invention
The present invention relates to an optimization apparatus for optimizing instruction sequences that have been converted into machine language or assembler language, and to a compiler for converting a source program in high-level language into an instruction sequence written in machine language or assembler language.
2. Description of the Prior Art
Optimization at an intermediate code level is performed after writing a source program and converting this into intermediate code. By performing optimization, the code size and/or execution time of the final program can be suitably improved. However, regardless of whether the program generated after optimization is composed of assembler instruction sequences (hereinafter simply called an "assembler program") or machine language sequences (hereinafter simply called a "machine language program"), such programs often include redundant code or instruction sequences that cause execution delays.
When improvements to code size and/or execution time are strongly desired, optimization processes such as instruction scheduling, the deletion of redundant instructions, and copy propagation are performed on the assembler code or machine code generated by a compiler.
Optimization at assembler language level or machine language level is achieved by instruction scheduling or by deleting redundant transfer instructions using equivalence groups. Note that the following explanation focuses on the case where optimization is performed at assembler language level.
Instruction Scheduling
The following is a description of instruction scheduling as a first conventional example of optimization at assembler language level.
In recent years, pipeline architecture has been increasingly used in microprocessors to speed up processing. To achieve the full potential of pipeline architecture, the pipeline needs to be continuously filled with instructions.
Depending on the structure of the pipeline, certain instruction sequences can produce gaps in the pipeline. As one example, consider a 5-stage pipeline single scalar machine whose pipeline is composed of IF (instruction fetch), DEC (instruction decode), EX (execute), MEM (memory operation), and WB (register write) stages. On such a machine, an instruction cannot immediately refer to a value that the preceding instruction has just loaded from the memory (hereinafter referred to as a "load-refer sequence"). When instructions are arranged in this order, a gap will appear in the pipeline, causing a delay. To avoid such delays, a compiler whose target is a pipeline-architecture machine performs an optimization process called instruction scheduling, which separates the load-refer sequence and so allows the pipeline architecture to be used to its full potential.
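The load-refer condition described above can be checked mechanically. The following Python sketch is illustrative only: the tuple representation of instructions, the helper names `registers_read`, `is_load`, and `load_refer_positions`, and the three-instruction fragment are assumptions made for this example, and the check assumes that only a directly adjacent reference causes the delay.

```python
# Instructions are modeled as (opcode, source, destination) tuples, following
# the "op src,dst" operand order used in the text. This model is hypothetical.

def registers_read(instr):
    """Operands whose values the instruction reads (immediates excluded)."""
    op, src, dst = instr
    reads = set() if src.isdigit() else {src}
    if op != "mov":          # e.g. "add src,dst" also reads dst
        reads.add(dst)
    return reads

def is_load(instr):
    """A transfer from a memory operand such as (SP,0) into a register."""
    op, src, dst = instr
    return op == "mov" and src.startswith("(") and not dst.startswith("(")

def load_refer_positions(program):
    """Indices i where instruction i+1 refers to the value loaded by instruction i."""
    return [
        i for i in range(len(program) - 1)
        if is_load(program[i]) and program[i][2] in registers_read(program[i + 1])
    ]

# Hypothetical fragment: a load directly followed by a reference, and the
# same instructions with an unrelated instruction placed in between.
before = [("mov", "(SP,0)", "D1"), ("add", "D1", "D1"), ("add", "1", "D0")]
after = [("mov", "(SP,0)", "D1"), ("add", "1", "D0"), ("add", "D1", "D1")]

print(load_refer_positions(before))  # [0]
print(load_refer_positions(after))   # []
```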
Instruction scheduling is a process of reediting the arrangement of instructions to suit a pipeline architecture. The arrangement of instructions especially refers to the relations between a given instruction and its preceding and succeeding instructions, so that reediting involves the interchanging of certain pairs of instructions within a program.
Scheduling may be performed in two different ways, first by considering the pipeline structure of the target machine to avoid pipeline hazards, and secondly by efficiently supplying instructions to a parallel conversion unit. Since the degree to which the pipeline can be filled depends on the order in which instructions are supplied, the full potential of the pipeline may be realized by rearranging the order of the instructions.
It should be noted here that the interchanging of instructions needs to be performed very carefully. Should instructions be simply interchanged without regard for the consequences, there is a real risk of a breakdown in the algorithm of the program. To avoid this, instructions in the program need to be classified into those which cannot be interchanged (hereinafter "inviolable") and those which can.
Inviolable instructions are a pair of instructions that cannot be interchanged. To establish which pairs of instructions are inviolable, instructions that cannot be interchanged are detected and directed links are established between them.
Definition-reference links, reference-definition links, and definition-definition links are patterns of the directed links that are conventionally formed between inviolable instructions. These are described in more detail below.
Definition-reference Links
Definition-reference links are directed links which show that the order of an instruction defining a resource and a later instruction referring to the resource is inviolable. One example is the following pair of instructions.
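Instruction sequence for definition-reference links:
(1) mov 100,D0
(2) add D0,D1

Instruction sequence for reference-definition links:
(1) mov 100,D0
(2) add D0,D1
(3) mov 200,D0
(4) add D0,D1

Instruction sequence for definition-definition links:
(1) mov 100,D0
(2) mov 200,D0
(3) add D0,D2

Source-level statement sequence:
(1) a=10
(2) p=&a
(3) *p=20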
In the above instruction sequence, the data flow is dependent on the register D0. As a result, the interchanging of instructions will result in the breakdown of the data flow. Accordingly, when instruction scheduling is performed, directed links clearly show the inviolable relation between the instruction that defines the resource and the instruction that refers to it.
Reference-definition Links
Reference-definition links are directed links that show the inviolability of the relation between an instruction that refers to a resource and an instruction that redefines the resource. The following is an example instruction sequence that will be used to explain why reference-definition links also need to be examined when rearranging the instructions.
In the above instruction sequence, the data flow in instructions (1)-(2) is dependent on the register D0. The data flow in instructions (3)-(4) is similarly dependent on the register D0. Suppose here that the instruction sequence is rearranged into the order (1)-(3)-(2)-(4). In this order, the definition-reference order is maintained as described above, although if the machine language program is executed in this state, 200 will be added to the value in the register D1, changing the meaning of the machine language program. Accordingly, the dependence on the register D0 in instructions (2)-(3) is preserved as a reference-definition link, so that a clear indication of the inviolability of these instructions is given.
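The effect of the rearrangement (1)-(3)-(2)-(4) can be confirmed by executing both orders on a toy register model. The sketch below is an assumption-laden illustration: the function `run`, the tuple encoding of instructions, and the choice of 0 as the initial value of every register are all introduced here and are not part of the conventional apparatus.

```python
def run(program, registers=None):
    """Execute (opcode, source, destination) tuples on a toy register file.

    Unset registers are assumed to hold 0 (an assumption of this sketch).
    """
    regs = dict(registers or {})
    for op, src, dst in program:
        value = int(src) if src.isdigit() else regs.get(src, 0)
        if op == "mov":
            regs[dst] = value
        elif op == "add":
            regs[dst] = regs.get(dst, 0) + value
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return regs

i1 = ("mov", "100", "D0")
i2 = ("add", "D0", "D1")
i3 = ("mov", "200", "D0")
i4 = ("add", "D0", "D1")

original = [i1, i2, i3, i4]
rearranged = [i1, i3, i2, i4]   # instruction (3) moved ahead of instruction (2)

print(run(original)["D1"])      # 300: 100 is added, then 200
print(run(rearranged)["D1"])    # 400: 200 is added twice
```

Both orders keep each definition ahead of its reference, yet the results differ, which is why the pair (2)-(3) must also be recorded as inviolable.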
Definition-definition Links
Definition-definition links are directed links that show the inviolability of the order of an instruction that defines a given resource and another instruction that redefines the resource. The following is an example instruction sequence that will be used to explain why definition-definition links also need to be examined when rearranging the instructions.
In the above instruction sequence, the data flow in instructions (2)-(3) is dependent on the register D0. As a result, a definition-reference link is set between instructions (2) and (3). In this example, instruction (1) is also a definition of register D0. Supposing here that the instruction sequence is rearranged to become (2)-(1)-(3), the execution of the rearranged instruction sequence will result in 100 being added to register D2, which changes the meaning of the machine language program. To avoid such erroneous rearranging of the program, the dependence on the register D0 in instructions (1)-(2) is preserved as a definition-definition link in the dependence graph.
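The three link patterns can be derived from the resources that each instruction defines and refers to. The following is a minimal sketch under stated assumptions: the names `defs_and_uses` and `dependence_links` are hypothetical, instructions are encoded as (opcode, source, destination) tuples, and every earlier/later pair is compared, so the result is more conservative than a graph that omits links already implied by intervening redefinitions.

```python
def defs_and_uses(instr):
    """Return (defined resources, referenced resources) for one instruction."""
    op, src, dst = instr
    uses = set() if src.isdigit() else {src}   # immediates are not resources
    if op != "mov":                            # "add src,dst" also refers to dst
        uses.add(dst)
    return {dst}, uses

def dependence_links(program):
    """List (earlier, later, kind) links using 1-based instruction numbers."""
    links = []
    for j, later in enumerate(program):
        defs_j, uses_j = defs_and_uses(later)
        for i in range(j):
            defs_i, uses_i = defs_and_uses(program[i])
            if defs_i & uses_j:
                links.append((i + 1, j + 1, "definition-reference"))
            if uses_i & defs_j:
                links.append((i + 1, j + 1, "reference-definition"))
            if defs_i & defs_j:
                links.append((i + 1, j + 1, "definition-definition"))
    return links

# The two sequences discussed in the text.
ref_def_example = [("mov", "100", "D0"), ("add", "D0", "D1"),
                   ("mov", "200", "D0"), ("add", "D0", "D1")]
def_def_example = [("mov", "100", "D0"), ("mov", "200", "D0"),
                   ("add", "D0", "D2")]

for link in dependence_links(def_def_example):
    print(link)
```

In this sketch a rearrangement is permitted only if, for every link, the earlier instruction still precedes the later one.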
The following is an explanation of conventional instruction scheduling by way of an example program. The construction of a conventional compiler is shown in FIG. 5. The following example deals with the processing of the program shown in FIG. 1A. The program is first inputted into the analyzing unit 81, is analyzed, and is then converted into intermediate code. The intermediate code at this stage is shown in FIG. 1B. Next, the resource assigning unit 82 assigns the variables in the intermediate code to registers or memory. In this example, the variable i is assigned to the register D0, while the variable k is assigned to the memory address (SP,0). Based on this assignment, the assembler instruction generation unit 84 then generates the assembler program shown in FIG. 1C. As shown in FIG. 1C, the load instruction "mov (SP,0),D1", which loads a value from memory, is directly followed by the instruction "add D1,D1", which refers to the loaded value. As a result, this sequence will cause a delay (load-refer).
This sequence is next given to the instruction scheduling unit 85. This instruction scheduling unit 85 is composed of the dependence graph generation unit 86 and the instruction rearranging unit 87. Assembler instructions that are given to the instruction scheduling unit 85 are first inputted into the dependence graph generation unit 86, which generates a dependence graph corresponding to the inputted assembler instructions. The dependence graph shows the resource dependence between instructions and so defines the execution order of instructions. When the two instructions A and B are shown as being joined "A→B" in the dependence graph, this means that the instruction A needs to be executed before the instruction B. The dependence graph generated by the dependence graph generation unit 86 in the present example is shown in FIG. 1D.
The position where a delay is caused in FIG. 1D is shown by the cross. On completing the dependence graph, the dependence graph generation unit 86 inputs it into the instruction rearranging unit 87. The instruction rearranging unit 87 then heuristically rearranges the instructions in the program to make the best possible use of the pipeline of the target machine, while not violating the dependence graph. The assembler language program that has been rearranged by the instruction rearranging unit 87 is shown in FIG. 1E. In comparison with the program shown in FIG. 1C, the program in FIG. 1E has the instruction "add 1,D0" located between the load and reference instructions, and this separation of the load and reference instructions prevents the delay (as shown by the circle in FIG. 1E). The code composed of these rearranged instructions is then inputted into the code output unit 88. The code output unit 88 outputs a file containing the inputted instructions as a machine language or assembler language program.
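The rearrangement step can be sketched as a greedy list scheduler. This is not the heuristic actually used by the instruction rearranging unit 87, which the text does not specify; it is a minimal illustration under the following assumptions: instructions are (opcode, source, destination) tuples, the names `reads_writes`, `must_precede`, `stalls`, and `schedule` are hypothetical, memory operands are compared by name only, and the three-instruction program is built from the instructions quoted in the text rather than copied from FIG. 1C.

```python
def reads_writes(instr):
    """Return (resources read, resources written) for one instruction."""
    op, src, dst = instr
    reads = set() if src.isdigit() else {src}
    if op != "mov":                  # "add src,dst" also reads dst
        reads.add(dst)
    return reads, {dst}

def must_precede(first, second):
    """True if any of the three link patterns holds between the two instructions."""
    r1, w1 = reads_writes(first)
    r2, w2 = reads_writes(second)
    return bool(w1 & r2 or r1 & w2 or w1 & w2)

def stalls(prev, nxt):
    """True if nxt refers to a register that prev has just loaded from memory."""
    op, src, dst = prev
    is_load = op == "mov" and src.startswith("(") and not dst.startswith("(")
    return is_load and dst in reads_writes(nxt)[0]

def schedule(program):
    """Greedily emit ready instructions, preferring ones that avoid a load-refer delay."""
    n = len(program)
    preds = {j: {i for i in range(j) if must_precede(program[i], program[j])}
             for j in range(n)}
    order = []
    while len(order) < n:
        done = set(order)
        ready = [k for k in range(n) if k not in done and preds[k] <= done]
        safe = [k for k in ready
                if not order or not stalls(program[order[-1]], program[k])]
        order.append(safe[0] if safe else ready[0])
    return [program[k] for k in order]

program = [("mov", "(SP,0)", "D1"), ("add", "D1", "D1"), ("add", "1", "D0")]
for op, src, dst in schedule(program):
    print(f"{op} {src},{dst}")
```

For this input the sketch places "add 1,D0" between the load and the reference, which corresponds to the arrangement described for FIG. 1E.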
Removal of Redundant Transfer Instructions
The following is an explanation of the deletion of redundant transfer instructions as the second conventional example of optimization at assembler program level.
The expression "redundant transfer instructions" here refers to the transfer instructions that go to the trouble of transferring a value even though equivalency is already established between the resources involved in the transfer.
An "equivalent relation" shows that a resource indicated as the destination of a transfer instruction has the same stored value as a resource indicated as the source of a transfer instruction once the transfer instruction is executed.
The equivalent relations which are valid for each instruction are expressed using equivalence groups. An equivalence group is a group of resources that exhibit an equivalent relation with each other. More specifically, these groups are expressed using register names and addressing codes that specify access addresses in memory.
FIGS. 2A and 2B show an optimization process which uses equivalence groups. Here, FIG. 2B shows the equivalence groups that are present just before the execution of each instruction in the example program shown in FIG. 2A.
As shown in FIG. 2B, the equivalence group {(SP,4),D1} is established just before the execution of the instruction on the second line of the example program. This means that the stored value of the register D1 is equal to the value at the memory address (SP,4). Meanwhile, the equivalence groups {(SP,4),D1} and {3,D0} are established just before the execution of the instruction on the fifth line, showing that the stored value of the register D1 is equal to the value at the memory address (SP,4) and that the stored value of the register D0 is equal to the immediate 3.
Of particular note in FIG. 2B is that an equivalent relation is established between the stored value of the register D1 and the memory address (SP,4) after the execution of the instruction on the fourth line and before the execution of the instruction on the fifth line. In spite of this, the instruction on the fifth line is a transfer instruction that transfers the value at the memory address (SP,4) into the register D1. Accordingly, this transfer from the memory address (SP,4) into the register D1 is redundant, and so can be deleted. The result of this deletion is shown in FIG. 2C.
The conventional optimization methods performed at assembler language level or machine language level have, however, been subject to many restrictions due to the presence of definition instructions that use indirect addressing.
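The deletion described above can be sketched as a single forward pass that maintains the equivalence groups. The code below is a simplified illustration, not the conventional apparatus itself: the function name `remove_redundant_transfers`, the tuple encoding of instructions, and the five-line program are assumptions (the program is invented so that the groups {(SP,4),D1} and {3,D0} hold before its fifth line, and it is not a copy of FIG. 2A). The sketch also ignores branches and changes to registers used inside memory operands.

```python
def remove_redundant_transfers(program):
    """Drop "mov src,dst" when src and dst already belong to one equivalence group."""
    groups = []     # each group is a set of resources holding the same value
    kept = []
    for op, src, dst in program:
        if op == "mov" and any(src in g and dst in g for g in groups):
            continue                              # redundant transfer: delete it
        # dst is redefined, so it leaves whatever group it was in.
        groups = [g - {dst} for g in groups]
        groups = [g for g in groups if len(g) > 1]
        if op == "mov":
            for g in groups:
                if src in g:
                    g.add(dst)                    # dst joins the group of src
                    break
            else:
                groups.append({src, dst})         # a new equivalent relation
        kept.append((op, src, dst))
    return kept

program = [
    ("mov", "(SP,4)", "D1"),
    ("mov", "3", "D0"),
    ("add", "D0", "D2"),
    ("add", "D1", "D2"),
    ("mov", "(SP,4)", "D1"),   # D1 already equals the value at (SP,4)
]
optimized = remove_redundant_transfers(program)
print(len(program), len(optimized))   # 5 4
```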
A first restriction with conventional instruction scheduling is that the movement of an instruction across a definition or reference instruction that uses indirect addressing is prohibited, thereby restricting the freedom with which instruction scheduling can be performed. The reason such movement is prohibited is explained below. With a definition instruction that uses indirect addressing, the memory address in which a value should be written cannot be clearly ascertained from the code. If a memory access instruction is positioned before or after an instruction which uses indirect addressing, there is the possibility that the indirect addressing instruction and the other memory access instruction will access the same memory address. Even if the probability of this actually happening is small, any optimization that involves moving instructions across indirect addressing instructions must be completely avoided.
FIG. 3A shows an example of an instruction sequence before instruction scheduling is performed. In FIG. 3A, the instruction "mov D0,(A0)" on the second line is a memory access instruction that defines a value at a memory address that is indicated through indirect addressing (such instructions also being known as "memory definition instructions"). The memory address affected by this instruction is determined from the stored value in the address register A0. However, it is impossible to determine what value is stored in this address register A0 from the example program shown in FIG. 3A. When it is unclear into what memory address a value should be written by an indirect addressing definition instruction, all memory access instructions starting from an indirect addressing definition instruction need to be interpreted as having an inviolable relationship with this indirect addressing definition instruction.
In the example program of FIG. 3A, the instructions on the third, fourth, sixth and seventh lines all access the stack region of the memory, and since the instruction on the second line is an indirect addressing definition instruction, there is the possibility that this definition instruction will access the same memory address as one of these following instructions. As a result, the indirect addressing definition instruction on the second line is interpreted as having an inviolable relationship with the instructions on the third, fourth, sixth, and seventh lines.
FIG. 3B shows an example dependence graph. In this dependence graph, directed links are established between instructions where there is an inviolable relation. These directed links are formed between the second and third lines, the second and fourth lines, the second and sixth lines, and the second and seventh lines. If, in this way, an indirect addressing definition instruction has directed links with as many as four instructions, this represents a great restriction to the freedom with which the instructions can be rearranged. In the instruction sequence shown in FIG. 3A, even though a hazard is present between the instructions on the fourth and fifth lines, the directed links shown in FIG. 3B show that the arrangement of instructions cannot be freely adjusted, preventing the removal of the hazard.
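The conservative treatment of an indirect addressing definition instruction can be sketched as follows. The seven-line program is hypothetical: only "mov D0,(A0)" on the second line, the stack accesses on the third, fourth, sixth, and seventh lines, and the hazard between the fourth and fifth lines are taken from the text, and the remaining operands are invented. The helper names `is_memory`, `is_indirect`, and `indirect_definition_links` are likewise assumptions of this sketch.

```python
def is_memory(operand):
    """Memory operands are written in parentheses, e.g. (SP,4) or (A0)."""
    return operand.startswith("(")

def is_indirect(operand):
    """A memory operand not addressed relative to SP, e.g. (A0)."""
    return is_memory(operand) and not operand.startswith("(SP,")

def indirect_definition_links(program):
    """Link each indirect definition to every later memory access (1-based lines)."""
    links = []
    for i, (_, _, dst_i) in enumerate(program):
        if not is_indirect(dst_i):
            continue
        for j in range(i + 1, len(program)):
            _, src_j, dst_j = program[j]
            if is_memory(src_j) or is_memory(dst_j):
                links.append((i + 1, j + 1))
    return links

program = [
    ("mov", "1", "D0"),
    ("mov", "D0", "(A0)"),     # indirect addressing definition instruction
    ("mov", "(SP,4)", "D1"),
    ("mov", "(SP,8)", "D2"),
    ("add", "D2", "D1"),       # refers to the value loaded on the fourth line
    ("mov", "D1", "(SP,4)"),
    ("mov", "(SP,12)", "D3"),
]
print(indirect_definition_links(program))  # [(2, 3), (2, 4), (2, 6), (2, 7)]
```

Because each of these links must be respected during rearrangement, none of the stack accesses can be moved ahead of the second line, even though none of them need refer to the address held in A0.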
A second problem relates to the deletion of redundant transfer instructions using equivalent relations. Since equivalence groups are destroyed before and after definition instructions that use indirect addressing, there are cases when it is not possible to delete redundant instructions present in the program.
FIG. 4B shows the result of optimization of the program example shown in FIG. 4A when analyzing equivalent relations. In FIG. 4B, equivalence groups are destroyed by the fifth line due to the presence of the indirect addressing definition instruction on the fourth line. In the indirect addressing definition instruction "mov D1,(A0)", the memory address (A0) is determined as the address indicated by the address register A0. However, it is impossible to determine what value is stored in this address register A0 from the example program shown in FIG. 4A. When it is unclear into what memory address a value should be written by an indirect addressing definition instruction, all memory resources in the equivalence groups preceding the indirect addressing memory access instruction need to be removed.
In the present example, a transfer instruction that transfers a value from the memory address (SP,4) to the data register D1 is present on the fifth line. Since the equivalence group that includes the address (SP,4) and the data register D1 is destroyed because of the indirect addressing definition instruction on the fourth line, this redundant transfer instruction on the fifth line cannot be deleted.
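The destruction of equivalence groups at an indirect addressing definition instruction can be illustrated with the following sketch. The function names `kill_memory_members` and `is_redundant` and the starting groups are assumptions made for this example; FIG. 4B itself is not reproduced here, and the sketch models only the removal of memory resources described above.

```python
def kill_memory_members(groups):
    """Remove every memory resource from every group; drop groups left with one member."""
    survivors = [{r for r in g if not r.startswith("(")} for g in groups]
    return [g for g in survivors if len(g) > 1]

def is_redundant(src, dst, groups):
    """A transfer is redundant only if src and dst share an equivalence group."""
    return any(src in g and dst in g for g in groups)

# Hypothetical groups holding just before "mov D1,(A0)" on the fourth line.
before_store = [{"(SP,4)", "D1"}, {"3", "D0"}]
after_store = kill_memory_members(before_store)

print(is_redundant("(SP,4)", "D1", before_store))  # True
print(is_redundant("(SP,4)", "D1", after_store))   # False
```

Once (SP,4) has been removed from its group, the transfer "mov (SP,4),D1" on the fifth line can no longer be recognized as redundant, even if A0 never actually points at (SP,4).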