Testing is the dominant method for finding bugs in computer software and hardware. When combined with methods to measure the amount of coverage achieved, it is also the dominant method for assessing when the software or hardware concerned is good enough for release. Testing to high coverage is enormously expensive. For example, more than half the development costs in avionics systems are spent on verification and validation activities, and testing is a substantial part of verification and validation. In hardware and software companies, more than half the entire technical staff may be devoted to testing.
Performing tests and evaluating test outcomes can be automated to a considerable degree, but generating test cases is still largely a time-consuming manual process. The quality and coverage of the tests generated depend entirely on the skill and diligence of those performing the task. Coverage is a measure of how thoroughly a system has been tested. Coverage can be defined with respect to the structure of the system under test (SUT) (e.g., requiring that every control point or every branch in the software is visited by at least one test), with respect to the structure of the model or design from which the SUT was developed, or with respect to the properties that the SUT is expected to satisfy (e.g., those properties documented in its requirements specification).
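By way of illustration only, structural coverage of the branch kind described above can be sketched as the fraction of branch outcomes exercised by a set of tests. The function `classify`, the branch labels, and the hand-written instrumentation are hypothetical and are not drawn from any particular SUT or coverage tool.

```python
# Hypothetical toy SUT with two decisions (four branch outcomes),
# instrumented by hand for illustration.
def classify(x, hits):
    """Return a label for x and record each branch outcome taken in `hits`."""
    if x < 0:
        hits.add("x<0:true")
        return "negative"
    hits.add("x<0:false")
    if x == 0:
        hits.add("x==0:true")
        return "zero"
    hits.add("x==0:false")
    return "positive"


# Every branch outcome that a fully covering test set must exercise.
ALL_BRANCHES = {"x<0:true", "x<0:false", "x==0:true", "x==0:false"}


def branch_coverage(test_inputs):
    """Fraction of branch outcomes visited by at least one test input."""
    hits = set()
    for x in test_inputs:
        classify(x, hits)
    return len(hits & ALL_BRANCHES) / len(ALL_BRANCHES)


print(branch_coverage([-1]))        # one outcome of four is exercised
print(branch_coverage([-1, 0, 5]))  # all four outcomes are exercised
```

In this sketch a single test leaves most branch outcomes unvisited, whereas three well-chosen inputs reach full branch coverage; choosing such inputs is the part that generally remains manual.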
Current attempts to develop automatic test case generation involve describing the target of each test by means of a property (e.g., “reach control point X in the SUT”), and then solving the resulting constraint satisfaction problem to find inputs to the SUT that will drive it through an execution that satisfies the property concerned. A popular way to solve the constraint satisfaction problem is by means of a model checker: the model checker is asked to check the negation of the property concerned (e.g., “the SUT never reaches control point X”) in some representation of the SUT or its design or specification, and will produce a counterexample (e.g., a trace of state transitions in the SUT that reaches control point X from some initial state) that is equivalent to the desired test case. Guided by the coverage desired, different test targets are identified and a separate test is generated for each one. FIG. 1 illustrates a generally understood representation of test generation for a SUT. Because each test is generated separately, each of them restarts the SUT (which can make the test expensive to perform), and the set of tests generated by this approach contains much redundancy (e.g., many tests start the same way). This is inefficient both in generating tests and in executing them. Furthermore, the model checker or other method may be unable to solve the constraint satisfaction problems for targets whose tests require many steps from an initial state.
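The counterexample-based approach described above can be sketched as follows, under stated assumptions. A breadth-first search over an explicit state graph stands in for the model checker, and the state machine `MODEL`, its state names, and its input names are hypothetical. A production model checker would typically operate on a symbolic or bounded representation rather than on an explicit graph.

```python
from collections import deque

# Hypothetical model of an SUT: state -> {input: next_state}.
MODEL = {
    "idle":    {"power": "ready"},
    "ready":   {"start": "running", "config": "setup"},
    "setup":   {"save": "ready"},
    "running": {"pause": "paused", "stop": "ready"},
    "paused":  {"resume": "running"},
}


def counterexample(model, initial, target):
    """Search for a shortest input sequence refuting 'target is never reached'.

    Returns the list of inputs (the test case), or None if the target is
    unreachable, i.e. the negated property actually holds.
    """
    queue = deque([(initial, [])])
    seen = {initial}
    while queue:
        state, inputs = queue.popleft()
        if state == target:
            return inputs
        for inp, nxt in model.get(state, {}).items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, inputs + [inp]))
    return None


def generate_tests(model, initial, targets):
    """One independent test per target, each starting from the initial state."""
    return {t: counterexample(model, initial, t) for t in targets}


tests = generate_tests(MODEL, "idle", ["setup", "running", "paused"])
total_steps = sum(len(t) for t in tests.values())
print(tests)
print(total_steps)
```

In this sketch every generated test begins with the same `power` input, and the test for `paused` repeats the whole of the test for `running` before adding one step, which illustrates the redundancy that arises when each target is solved separately from the initial state.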
A variant on this approach to automatic test case generation overcomes some of the limitations of model checking and constraint satisfaction, but stops short of addressing the need to generate irredundant test sets. (See Beyer et al., Generating Tests from Counterexamples. In 26th International Conference on Software Engineering, Edinburgh, Scotland, May 2004; IEEE Computer Society).
Yet another approach advocates building an abstract model and performing a so-called “Chinese postman's tour” of it, thereby generating a large, sweeping test case and an efficient test set. (See Grieskamp et al., Generating finite state machines from abstract state machines. In International Symposium on Software Testing and Analysis (ISSTA), pages 112-122, Association for Computing Machinery, Rome, Italy, July 2002). Because they are restricted to explicit-state model checking, these tour-based approaches are unsuited to achieving coverage goals of the kind used in avionics and other critical embedded systems (e.g., MC/DC; see K. Hayhurst, D. Veerhusen, J. Chilenski, and L. Rierson. A Practical Tutorial on Modified Condition/Decision Coverage. NASA Technical Memorandum TM-2001-210876, NASA Langley Research Center, Hampton, Va., May 2001.) and are suitable only for validation of consumer products.
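The idea of a single tour that exercises every transition of an explicit-state model can be sketched as below. This is a simplified greedy walk that repeatedly moves to the nearest transition not yet covered; it is not an optimal Chinese postman tour, it is not necessarily closed, and it is not the algorithm of the cited reference. The state machine `MODEL` and its names are hypothetical.

```python
from collections import deque

# Hypothetical explicit-state model: state -> {input: next_state}.
MODEL = {
    "idle":    {"power": "ready"},
    "ready":   {"start": "running", "config": "setup"},
    "setup":   {"save": "ready"},
    "running": {"pause": "paused", "stop": "ready"},
    "paused":  {"resume": "running"},
}


def path_to_uncovered(model, start, covered):
    """Breadth-first search from `start` to the nearest uncovered transition.

    Returns a list of (state, input, next_state) steps whose last step is
    uncovered, or None if no uncovered transition is reachable.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        for inp, nxt in model.get(state, {}).items():
            step = (state, inp, nxt)
            if step not in covered:
                return path + [step]
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [step]))
    return None


def transition_tour(model, initial):
    """Greedy single test that keeps walking until nothing new is reachable."""
    covered, tour, state = set(), [], initial
    while True:
        path = path_to_uncovered(model, state, covered)
        if path is None:
            return tour
        for step in path:
            covered.add(step)
            tour.append(step[1])
        state = path[-1][2]


tour = transition_tour(MODEL, "idle")
print(tour)
```

Because the walk continues from wherever the previous step ended rather than restarting the model, a single input sequence can cover every transition of this small example; the approach relies on enumerating states and transitions explicitly.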
What is needed is an efficient method for the automated generation of test cases that achieves high coverage with a minimal number of tests. What is also needed is a method for automated test generation providing rapid generation of tests and providing a high level of coverage within the time and memory budget available.