The present invention concerns compiler optimization and pertains particularly to the use of random test generation to test code generated by an optimizing compilation system.
Programs are generally written in a high level programming language. This high level language, often referred to as source code, is translated by a compiler program into an assembly language. The binary form of the assembly language, called object code, is the form of the code actually executed by a computer. The object code is generally first produced in object code modules. If there is more than one object code module, the object code modules are linked together by a linker. For the purpose of the present application, the term "compile" includes both the process of producing the object code modules and linking the object code modules together.
In order to increase performance of object code when executed by a target computer, compiled code is frequently optimized. As compiler optimizations have been becoming more complex and aggressive, testing these optimizations has become increasingly difficult.
Today's state-of-the-art compilers perform a variety of optimizations to improve the performance of compiled code. See, for example, Aho, A. V., Sethi, R. and Ullman, J. D., Compiler Principles, Techniques and Tools, Addison-Wesley (1986). Because of industry pressure for better performing code and faster compilation, there has been a steady growth in the number and complexity of optimizations used by compilers. For example, to avoid the phase ordering problem, multiple optimizations are merged together or a single optimization is implemented in several phases. Optimizations are using more sophisticated analyses, such as register pressure estimates, to identify when an optimization is beneficial. To cut down on compile time, complex data structures are reused across optimizations. Incremental updates are performed to keep this data up to date. Each of these increases in compiler complexity impose an additional risk of defects in the compiler.
Quality is paramount for a production compiler. Bugs in the optimizer often result in incorrectly running compiled code, which is very difficult for an end-user to debug. Also, users of compiler systems respond to compiler bugs by either turning off optimization altogether or moving to a different compiler or platform.
To ensure quality in a production compiler, extensive software design and testing processes are strictly adhered to, including design documents, code reviews, black-box testing, white-box testing, and integration testing. See for example, Beizer, B, Software Testing Techniques (2nd edition); Van Nostrand Reinhold (1990), Boujarwah, A and Saleh, K, "Compiler Test Suite: Evaluation and Use in an Automated Test Environment", Information and Software Technology, Vol. 36, No. 10, (1994), pp. 607-614; and Myers, G, The Art of Software Testing, John Wiley (1979). Despite all this, significant numbers of defects still occur in optimization. Many of these defects are due to subtle interactions between different optimizations or from overlooked corner cases. Writing an exhaustive set of tests that covers all possible cases and interactions between optimizations is impossible given the infinite number of legal source programs as well as the combinatorial number of possible interactions between optimizations.
Automatic test generation is used extensively for testing hardware. Automatic test generators have also been developed to test various kinds of software. For a review of automatic test generation for compilers, see Burgess, C. J., "The Automated Generation of Test Cases for Compilers," Software Testing, Verification, and Reliability, Vol. 4, No. 2, (1994), pp. 81-94. Most of the earliest work on automatic test generation concentrated on testing the parser and code generator of compilers. See, Bazzichi, F., and Spadafora, I., "An Automatic Generator for Compiler Testing", IEEE Transactions on Software Engineering, Vol. SE-8, No. 4, (1982), pp. 343-353; Celentano, A., Crespi-Reghizzo, S., Della-Vigna, P., Ghezzi, C., Granata, G., and Savoretti, F., "Automatic Generation of Test Cases," Software Practice and Experience, Vol. 10, (1980), pp. 897-918; Hanford, K. V., "Automatic Generation of Test Cases," IBM Systems Journal, Vol. 9, No. 4, (1970), pp. 242-257; and, Purdom, P., "A Sentence Generator for Testing Parsers," BIT, Vol. 12, (1972), pp. 366-375. Such test generators produce random source code from a formal grammar of the source language. The compiled code generated from these random source programs cannot be executed.
Bird, D. L., and Munoz, C. U., "Automatic Generation of Random Self-Checking Test Cases", IBM Systems Journal, Vol. 22, (1983), pp. 229-245 disclose some of the earliest work performed in field of automatic generation of executable random tests for compilers. This test generator consists of a loop that randomly selects the next kind of statement to be generated, then invokes a subroutine to produce that statement. Self-checks are inserted into the randomly generated program to ensure that it is producing correct results. Because of this, the test generator must predict the values produced by executing the program. This required restrictions to be made on the control flow behavior of the program. More specifically, iterations of a loop cannot use values produced by other iterations.
A language-independent automatic generator of executable programs was disclosed in Burgess, C. J., "Towards the Automatic Generation of Executable Programs to Test a Pascal Compiler," Barnes, D. and Brown, P. (eds), Software Engineering '86, IEEE Computing Series Vol. 6, Peter Peregrinus Ltd., (1986), pp. 304-316. This test generator is driven by a grammar annotated with synthesized and inherited attributes to ensure semantic correctness. The generator predicts the correct values produced by their generated programs and inserts self checks to ensure correctness. To allow the correct prediction of values in a program, all loops are constrained to execute only one iteration.
In Burgess, C. J., and Saidi, M., "The automatic generation of test cases for optimizing Fortran compilers", Information and Software Technology Vol. 38 No. 2, (1996), pp. 111-119, a previous test generator was extended to specifically test optimizations. The enhanced test generator delays performing self-check tests on a variable as long as possible, since these self-checks interfere with optimization. This paper also discloses a modified test generator which explicitly produces common subexpressions and induction variables in a randomly generated program.
Seaman, R. P, "Testing high level language compilers", IEEE Computer System and Technology Conference, (1974), pp. 6-14, discloses an algorithm which compiles a randomly generated program with two different compilers, executes both versions of the program and compares the results. If the results are different, there must be a bug in one of the compilers.