As computer programs increase in size and complexity, there is typically a corresponding increase in the number of syntactic errors introduced into the source code of these programs. Additionally, the increased size and complexity of today's computer programs make detection and isolation of these syntactic errors a much more difficult task.
The IEEE standard definition of an error is a mistake made by a developer. An error may lead to one or more software mutations (also known as faults). Mutations are located in the source code of a computer program. A mutation is a difference between an incorrect program and a corresponding correct program. The mutation may be localized in one statement or may be textually dispersed across several locations in the computer program. Similarly, the mutation may be repairable in many ways, each leading to a correct but different program. See: Offutt, A. and Hayes, J., A Semantic Model of Program Faults, Proceedings of the 1996 International Symposium on Software Testing and Analysis, May 1996.
The above definition of a mutation refers to the syntactic nature of a mutation. If the mutation is being inserted into the computer program, then the syntactic nature of the mutation is described by the corresponding changes to the computer program. If the mutation occurs naturally in the program, then the syntactic nature of the mutation is described by the number of changes needed to correct the program. Examples of syntactic characterizations of mutations include using an incorrect variable name, or neglecting to check whether a called function fails. Such mutations are often caused by programmer mistakes, such as typographical errors.
A mutation can also be characterized semantically. Each computer program P can be viewed as having a specification S that defines sets D (an input domain) and R (an output range), and a mapping from D to R: $S: D \to R$.
A semantic characterization of a mutation views the faulty computer program as containing a computation that produces incorrect output over some subset of the input domain. That is, the mapping of inputs to outputs, $P: D \to R$, is incorrect, $P(d) \neq S(d)$, for some subset of D.
The characterization of a mutation as "semantic" or "syntactic" proves quite useful when considering the size of a mutation. For a syntactically small mutation, one token or one statement may be incorrect. For a semantically small mutation, P's behavior is incorrect on only a very small subset of D. A mutation that is syntactically small can be very large semantically, because the syntactic change can affect arbitrarily many inputs. Conversely, a major syntactic mutation in P may affect only a few inputs, resulting in a small semantic mutation. Finally, there are cases where a small semantic mutation can be modeled as small syntactic mutations, and where small syntactic mutations result in small semantic mutations.
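The distinction between syntactic and semantic size can be sketched by measuring, over a finite input domain, how many inputs a mutated program gets wrong. The example below is purely illustrative: the threshold function, the three mutants, the domain, and the helper names are all assumptions made for this sketch and do not come from the cited reference.

```python
from typing import Callable, Iterable, List

# Hypothetical finite input domain D (101 integers).
DOMAIN = range(-50, 51)

def spec(x: int) -> bool:
    """Assumed reference behaviour S: is x at least 10?"""
    return x >= 10

def mutant_small_small(x: int) -> bool:
    """One-token change (>= becomes >): wrong only at the boundary."""
    return x > 10

def mutant_small_large(x: int) -> bool:
    """One-token change (>= becomes <=): wrong almost everywhere."""
    return x <= 10

def mutant_large_small(x: int) -> bool:
    """Multi-statement rewrite that misbehaves on a single input."""
    if x == 37:
        return False
    result = not (x < 10)
    return result

def failing_inputs(program: Callable[[int], bool],
                   domain: Iterable[int] = DOMAIN) -> List[int]:
    """Return the inputs d in D for which program(d) != spec(d)."""
    return [d for d in domain if program(d) != spec(d)]

def semantic_size(program: Callable[[int], bool]) -> float:
    """Fraction of the domain on which the program disagrees with spec."""
    return len(failing_inputs(program)) / len(DOMAIN)

for mutant in (mutant_small_small, mutant_small_large, mutant_large_small):
    print(f"{mutant.__name__}: semantic size {semantic_size(mutant):.3f}")
```

Under these assumptions, the two one-token mutants have the same syntactic size but very different semantic sizes, while the multi-statement rewrite is syntactically larger yet fails on a single input.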
There are significant behavioral differences between syntactically small/semantically large mutations and syntactically small/semantically small mutations. Syntactically small/semantically large mutations are of little value during the functional testing and verification process, because they are readily detected by almost any test case that reaches the mutated statement. These mutations are also subject to a high degree of overlap. That is, a test case that kills one syntactically small/semantically large mutation will almost always kill many other syntactically small/semantically large mutations. Conversely, the subtle, syntactically small/semantically small mutations are much harder to detect during normal functional testing and verification. Thus, a test suite that detects these subtle mutations will be of higher quality.
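The overlap described above can be illustrated with a small kill table: for each test input, list the mutants whose output differs from that of the original program. The program, the mutants, and the test inputs below are hypothetical and chosen only to make the effect visible; this is a sketch, not the method of any particular tool.

```python
from typing import Callable, Dict, Set, Tuple

def original(a: int, b: int) -> int:
    """Assumed reference program: the larger of two integers."""
    return a if a >= b else b

# Hypothetical small mutants of `original`.
MUTANTS: Dict[str, Callable[[int, int], int]] = {
    # Semantically large: wrong whenever a != b.
    "negate_cond": lambda a, b: a if a < b else b,
    "swap_branches": lambda a, b: b if a >= b else a,
    # Semantically large: wrong whenever b > a.
    "const_result": lambda a, b: a,
    # Semantically small: wrong only when b == a + 1.
    "off_by_one": lambda a, b: a if a >= b - 1 else b,
}

def killed_by(test: Tuple[int, int]) -> Set[str]:
    """Names of the mutants whose output differs from the original on `test`."""
    expected = original(*test)
    return {name for name, mutant in MUTANTS.items() if mutant(*test) != expected}

# An arbitrary test, a second arbitrary test, a tie, and a boundary test.
TESTS = [(3, 7), (9, 2), (5, 5), (4, 5)]

for test in TESTS:
    print(test, sorted(killed_by(test)))
```

In this sketch, an arbitrary test such as (3, 7) kills all three semantically large mutants at once, while the semantically small mutant survives every test except the boundary case (4, 5).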
Prior art software functional testing and verification systems have focused on mutations that are small syntactically, without regard to semantic size. As mentioned in the previous paragraph, mutations that are small syntactically but large semantically are easily detected by a very simple set of test vectors; they add difficulty to the functional testing and verification process without increasing the testing value of the resulting regression test cases. Syntactically small faults (mutants) typically have a large semantic size and, consequently, fail to increase the value of the resulting test cases.
In view of the above, there is a need for a system for detecting and discarding syntactically small faults having a large semantic size, and for integrating this concept into a functional simulation and verification system to determine the likelihood that undetected functional bugs exist in a software program and whether a test suite properly tests the computer program.