Compilers are programs that translate computer programs written in source languages, such as FORTRAN, Pascal, C, and C++, into equivalent compiled programs consisting of assembly-language or machine-code instructions. In order to be useful and commercially acceptable, a compiler must correctly translate source programs into compiled programs and must generate compact, space-efficient compiled programs that execute efficiently. A correct translation produces a compiled program that is semantically equivalent to the source-language program from which it was translated: when executed, the correctly compiled program behaves exactly as defined by the source-language program for any possible input. For a given source-language program, there is a virtually limitless number of semantically correct translations, ranging in size from some minimum number of assembly-language or machine-code instructions up to the largest number of instructions that can practically be stored within a computer system. It is desirable, for many reasons, for a compiler to select, from among these possible translations, a compiled program having a size, in instructions, reasonably close to the minimum possible size. Similarly, when executed with some defined input, the possible translations exhibit execution times ranging from some minimum execution time up to theoretical execution times that may exceed the practical lifetime of a computer system. It is desirable for a compiler to select a translation that, over the range of inputs practically expected for the program, exhibits execution times reasonably close to the minimum possible execution time for the program.
The three above-described characteristics--correctness, space efficiency, and execution efficiency--are intimately related. A program that executes correctly for the majority of inputs, but incorrectly for certain special, infrequently expected inputs, may execute considerably faster, in general, than a completely correct program: by omitting the instructions and conditional branches required to detect and handle the infrequently expected inputs, it may be possible to produce an incorrect program that has fewer instructions along the critical control paths within the program. Conversely, a larger, semantically correct program may execute faster than a smaller, semantically correct program. For example, space-efficient programs often contain loop structures in which a certain instruction or group of instructions is repeatedly executed while a loop control variable is incremented or decremented on each iteration. If, instead, the instructions within a loop are explicitly copied within the program--a transformation known as loop unrolling--the incrementing, decrementing, and testing of the loop control variable may be avoided, thus decreasing the number of instructions executed and correspondingly decreasing the execution time of the larger program. However, larger, space-inefficient programs may incur considerable, and often more than offsetting, inefficiencies in the storage and copying of instructions, within and between different internal components of the computer system, during execution.
In general, a compiler first correctly translates a source-language program into an intermediate program in a step called intermediate code generation. Then, a compiler invokes a wide array of different optimization techniques and strategies in order to produce a compiled program that retains the correctness of the intermediate code program, but that is both space-efficient and efficient in execution. Optimization techniques can detect and remove unnecessary instructions, can schedule groups of instructions for simultaneous execution on machines that support concurrent execution of more than one instruction, can transform less efficient algorithms to more efficient algorithms, and can rearrange the sequence of execution of instructions in order to minimize the overall execution time of the program. In some cases, rearranging the execution sequence may produce a program that is not semantically equivalent to the intermediate code program, but that is more efficient, in general, in execution. In such cases, additional instructions may be introduced in order to detect and correct the errors introduced by changing the sequence of instructions.
Two types of resequencing optimizations that may introduce run-time errors have attracted theoretical interest. The first type of resequencing optimization is known as control speculation. Control speculation results from moving an instruction whose execution depends on a preceding conditional branch instruction to a position in the program preceding that conditional branch instruction. The relocated instruction may then be executed in cases in which, had it not been moved, it would not have been executed. Under certain conditions, to be discussed below, control speculation may result in run-time exceptions that do not occur during the execution of the non-optimized, semantically correct program.
The second type of resequencing optimization is known as data speculation. Data speculation occurs when an instruction that loads a value into a register from memory, and that follows an instruction that stores a value into memory, is moved to a position in the program preceding the store instruction. When the load and store instructions operate on the same memory location, the relocated load instruction, because it executes prior to the store instruction, may load a different value in the optimized program than the value loaded in the non-optimized program.
A detailed discussion of the theory and implementation of compilers can be found in a number of textbooks, including Compilers: Principles, Techniques, and Tools, Aho, Sethi, and Ullman, Addison-Wesley Publishing Company, 1988, and Advanced Compiler Design & Implementation, Muchnick, Morgan Kaufmann Publishers, 1997. Various aspects of control and data speculation are described in the following references: (1) "Sentinel Scheduling with Recovery Blocks," D. I. August, B. L. Deitrich, and S. A. Mahlke, Tech. Rep. CRHC-95-05, Center for Reliable and High-Performance Computing, University of Illinois, Urbana, Ill., February 1995; (2) "Speculative Execution Exception Recovery Using Write-Back Suppression," R. A. Bringmann, S. A. Mahlke, R. E. Hank, J. C. Gyllenhaal, and W. W. Hwu, Proc. of the 26th Annual Int'l Symp. on Microarchitecture, December 1993; (3) "Three Architectural Models for Compiler-controlled Speculative Execution," P. Chang, N. Waters, S. A. Mahlke, W. Y. Chen, and W. M. Hwu, IEEE Trans. on Computers, Vol. 44, No. 4, pp. 481-494, April 1995; (4) "Data Preload for Superscalar and VLIW Processors," W. Y. Chen, Ph.D. Thesis, University of Illinois, Urbana, Ill., 1993; (5) "HPL PlayDoh Architecture Specification: Version 1.0," V. Kathail, M. Schlansker, B. Rau, Hewlett-Packard Laboratories Technical Report, HPL-93-80, February 1994; (6) "Exploiting Instruction Level Parallelism in the Presence of Conditional Branches," S. A. Mahlke, Ph.D. Thesis, University of Illinois, Urbana, Ill., 1996; and (7) "Sentinel Scheduling for VLIW and Superscalar Processors," S. A. Mahlke, W. Y. Chen, W. W. Hwu, B. R. Rau, and M. S. Schlansker, Proc. of the 5th Annual Int'l Conf. on Architectural Support for Programming Languages and Operating Systems, pp. 238-247, 1992.
Although control speculation and data speculation have been recognized as separate optimization techniques, a framework for incorporating both together in a single technique within a compiler has been lacking. Instead, control speculation has been treated separately from data speculation. This separate treatment results in separate optimization phases within the compiler, leading both to inefficiencies in compilation and to potentially incorrect execution of the optimized program in special cases where control and data speculation directly interfere with one another. Furthermore, although opportunities for control and data speculation are currently identifiable, techniques for handling the potentially incorrect behavior of optimized programs that include control and data speculation have been unsatisfactory. For these reasons, a need has been recognized in the area of compiler optimization for a framework that incorporates control and data speculation together, in a unified manner, as an optimization strategy.