Software verification, an essential part of software development, aims to ensure that the software produced is free of errors. Software verification is generally accomplished with “Black-Box Testing.” Black-Box Testing consists of activating a software application with a set of inputs, or test-cases, and comparing the output produced by the software application to the expected output. To ensure that a software application is error free, a set of test-cases is created and the software application under test is activated with each of them. Large scale software applications tend to require a vast number of test-cases, because the inputs can be permuted in an effectively unlimited number of ways. It is often impractical to create all possible inputs and permutations of inputs manually, and therefore automatic verification is a preferred approach for testing large scale software applications.
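The comparison of actual output to expected output described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `add_under_test` and the test-case list `CASES` are hypothetical stand-ins for a real application and its test suite.

```python
from typing import Callable, Iterable, List, Tuple

TestCase = Tuple[Tuple[int, int], int]  # ((inputs), expected output)


def run_black_box(program: Callable[[int, int], int],
                  cases: Iterable[TestCase]) -> List[str]:
    """Activate `program` with each test-case and collect every mismatch
    between the actual output and the expected output."""
    failures = []
    for inputs, expected in cases:
        actual = program(*inputs)
        if actual != expected:
            failures.append(
                f"inputs={inputs}: expected {expected}, got {actual}")
    return failures


def add_under_test(a: int, b: int) -> int:
    """Hypothetical application under test (an assumption for this sketch)."""
    return a + b


# Hypothetical test-cases: each pairs a set of inputs with its expected output.
CASES = [((1, 2), 3), ((0, 0), 0), ((-4, 4), 0)]

failures = run_black_box(add_under_test, CASES)
print("all test-cases passed" if not failures else "\n".join(failures))
```

The harness never inspects how `add_under_test` is implemented; it sees only inputs and outputs, which is what makes the test “black-box.”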
A compiler is a computer program that translates a high level programming language, such as C++ or Java, into a lower level language, such as machine code, assembly, or an intermediate bytecode form that can run on a virtual machine. A compiler checks the input program files for errors before translating them. When a compiler discovers an error in an input program file, the compiler may report both the specific error and the fact that an error occurred during the compilation. When a compiler successfully translates the input file, an executable is generated.
A compiler is verified by compiling a set of programs, or test-cases, with the compiler and comparing the output from the compiler with the expected output. Theoretically, a complete verification of a compiler requires the set of all possible test-cases. However, the number of computer programs that can be compiled is infinite, so a complete verification of a compiler is unachievable. A practical approach for compiler verification is to create output producing test-cases that contain all the words in the high level programming language and all the possible combinations of those words or statements. Verifying all the words, elements, and statements of a programming language, where a statement is defined as a combination of elements, is referred to as checking the “syntax.” Verifying a compiler's ability to handle the syntax of a programming language is insufficient on its own, because the compiler must also handle the semantic constraints of the programming language.
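As a rough illustration of building test-cases from combinations of statements, the following Python sketch enumerates every ordered sequence of a fixed length drawn from a small statement set. The statement set `STATEMENTS` and the helper `syntax_cases` are hypothetical and are not taken from any particular verification tool.

```python
from itertools import product

# Hypothetical toy statement set (an assumption for this sketch).
STATEMENTS = ["int x;", "x=5;", "x=x/2;"]


def syntax_cases(statements, length):
    """Yield every ordered sequence of `length` statements as one program."""
    for combo in product(statements, repeat=length):
        yield "\n".join(combo)


cases = list(syntax_cases(STATEMENTS, 2))
print(len(cases))  # 3 statements, sequences of length 2: 3 ** 2 = 9
```

The count grows as the number of statements raised to the sequence length, which is why exhaustive enumeration quickly becomes impractical. Some of the generated sequences, such as “x=5;” followed by “int x;”, are built from well-formed statements yet use a variable before it is declared, which a purely syntactic enumeration cannot distinguish.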
The semantics of a programming language take into consideration the context of a statement or element, that is, the surrounding statements or environment in which it appears in the program. For example, an action or statement may be correct within a loop construct but incorrect within a conditional “if” statement construct. Semantics are an important part of a high level programming language because the statements in a program depend upon each other. For example, in some high level programming languages, the statement “x=5;” cannot be written in the program without first declaring “int x;”. Such checks, which can be made without running the program, are called “Static Semantics” checks. In some programming languages, the statement “x=y/z” is illegal when ‘z’ is equal to zero. Checks of this type are “Dynamic Semantics” checks because they involve the evaluation rules of the language.
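The difference between the two kinds of check can be sketched with a toy language that has only the statement forms used in the examples above. Everything in this Python sketch is an assumption made for illustration: the two regular expressions, the helper names `static_check` and `run`, and the choice to initialize declared variables to zero.

```python
import re

# Toy grammar (hypothetical): "int x;" and "x=a;" or "x=a/b;"
DECL = re.compile(r"^int\s+(\w+);$")
ASSIGN = re.compile(r"^(\w+)\s*=\s*(\w+)(?:\s*/\s*(\w+))?;$")


def static_check(lines):
    """Static semantics: report variables used before they are declared.
    No statement is evaluated."""
    declared, errors = set(), []
    for n, line in enumerate(lines, 1):
        line = line.strip()
        if m := DECL.match(line):
            declared.add(m.group(1))
        elif m := ASSIGN.match(line):
            for name in m.groups():
                if name and not name.isdigit() and name not in declared:
                    errors.append(f"line {n}: '{name}' not declared")
    return errors


def run(lines):
    """Dynamic semantics: evaluate the program; a zero divisor is only
    detectable here, once the value of the divisor is known."""
    env = {}

    def value(token):
        return int(token) if token.isdigit() else env[token]

    for n, line in enumerate(lines, 1):
        line = line.strip()
        if m := DECL.match(line):
            env[m.group(1)] = 0  # assumption: declared variables start at 0
        elif m := ASSIGN.match(line):
            target, left, right = m.groups()
            if right is None:
                env[target] = value(left)
            else:
                divisor = value(right)
                if divisor == 0:
                    raise ZeroDivisionError(f"line {n}: division by zero")
                env[target] = value(left) // divisor
    return env


print(static_check(["x=5;"]))            # flagged without running anything
print(static_check(["int x;", "x=5;"]))  # no static errors
```

A program such as “int x; int y; int z; y=8; x=y/z;” passes `static_check` in this sketch, yet `run` raises an error on its last line because ‘z’ is zero when the division is evaluated.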
Compiler verification is done by compiling and running test-cases whose run results have been predicted, and then comparing the predicted results to the actual run results. Compiler verification must include both legal and illegal test-cases in order to verify completely the ability of the compiler to handle the syntax and semantics of the high level programming language. Illegal test-cases may be input to the compiler in order to verify the error-handling ability of the compiler. Legal test-cases may be input to the compiler to test the compiler's ability to generate an executable. As a result, complete compiler verification may require tens of thousands of test-cases to cover the relevant scenarios.
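A minimal sketch of running legal and illegal test-cases follows. As an assumption made purely for illustration, Python's built-in `compile()` function stands in for the compiler under test; the helper `verify` and the two sample test-cases are hypothetical.

```python
import contextlib
import io


def verify(source, expect_error=False, expected_output=""):
    """Return True when the compiler behaves as predicted for one test-case.

    Illegal test-case: compilation must be rejected with an error.
    Legal test-case: compilation must succeed and the run result must
    equal the predicted output.
    """
    try:
        code = compile(source, "<test-case>", "exec")
    except SyntaxError:
        return expect_error          # an error is correct only if predicted
    if expect_error:
        return False                 # the compiler accepted an illegal program
    captured = io.StringIO()
    with contextlib.redirect_stdout(captured):
        exec(code, {})               # run the compiled test-case
    return captured.getvalue() == expected_output


legal_ok = verify("x = 5\nprint(x * 2)\n", expected_output="10\n")
illegal_ok = verify("x = = 5\n", expect_error=True)
print(legal_ok, illegal_ok)
```

Each test-case carries its own prediction, either an expected error or an expected run result, so a suite of any size can be checked by the same loop.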
One approach to compiler verification is the manual creation of program files and of an expected output file corresponding to each program file. Approaches involving manual creation of test-cases may not handle all the relevant cases, and they make it difficult to keep up with updates of the programming language because the programs and expected output files must be altered by hand. The large number of possible test-cases makes it impossible for a human to cover all the scenarios.
Automating compiler verification requires generating test-cases automatically. To test the semantics of the programming language, the generated test-cases must be meaningful programs, with algorithms that mimic those a programmer would write in the high level language, and producing such programs requires an extremely high level of Artificial Intelligence (AI). The complexity involved in developing such an AI solution makes this approach infeasible. Yet testing a compiler without semantic checks on meaningful test-cases does not allow for an accurate verification of the compiler. An ability to automate compiler verification would therefore significantly increase verification quality while reducing the time that verification consumes.