Exhaustive testing of commercially written program code that has thousands of lines and a multitude of execution paths is very challenging, particularly when a large number of processes execute concurrently.
Complex programs are commonly tested by way of non-exhaustive testing methods, such as simulation-based verification. Simulation-based verification techniques are generally based on a heuristic approach that is effective in detecting most errors in typical programs, but does not guarantee discovery of all errors (i.e., bugs). Simulation-based verification is traditionally used to verify the correctness of a program with respect to a set of input values. Different input values can induce different execution paths for the program under test. Moreover, in concurrent programs, the same set of input values can induce several different execution paths, due to scheduling decisions.
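By way of illustration only, the effect of scheduling decisions on a fixed set of input values can be sketched as follows. The sketch assumes two hypothetical threads, labeled A and B, each performing a non-atomic increment (a read step followed by a write step) on a shared counter. The function names `interleavings` and `run` are illustrative and are not taken from any particular system.

```python
from itertools import combinations


def interleavings(steps_a, steps_b):
    """Yield every order in which the steps of threads A and B can be scheduled."""
    total = steps_a + steps_b
    for a_slots in combinations(range(total), steps_a):
        slots = set(a_slots)
        yield tuple("A" if i in slots else "B" for i in range(total))


def run(schedule, initial):
    """Execute a read-then-write increment in each thread under a fixed schedule."""
    shared = initial
    local = {}
    for thread in schedule:
        if thread not in local:
            local[thread] = shared       # first step of the thread: read
        else:
            shared = local[thread] + 1   # second step of the thread: write
    return shared


# Group the schedules by the final counter value they produce for the same input (0).
outcomes = {}
for schedule in interleavings(2, 2):
    outcomes.setdefault(run(schedule, 0), []).append("".join(schedule))

for value in sorted(outcomes):
    print(value, outcomes[value])
```

Under these assumptions, the six possible schedules do not all yield the same result: only the two schedules in which one thread completes before the other begins produce the expected value of 2, while the remaining four lose an update.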
Execution of a program along a certain path is considered to be correct if the program behaves as required (i.e., produces the expected output) for all possible input values for that path. Ideally, a complete verification would test all execution paths in a program for all possible input values for each path, and for all possible scheduling decisions. This approach is very time consuming and expensive, if not impractical.
A more practical testing option focuses on discovering bugs that manifest themselves when the computing system is subjected to a heavy computational load during runtime (i.e., stress-testing). Another practical approach involves creating random interference in an event scheduler, which leads to randomized interleavings in execution paths.
The interference is created by injecting noise into the scheduler, forcing the scheduler to create uncommon interleavings that are rare during usual execution. However, the chance of finding a very rare bug, or a defective program routine that is only very rarely initiated, is small, since the random distribution created by injecting noise cannot be adjusted to a specific pattern.
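The noise injection described above can be illustrated by a minimal sketch that simulates a scheduler rather than instrumenting a real one. It assumes two hypothetical threads that each perform a non-atomic increment, and a hypothetical parameter `noise_probability` giving the chance that a context switch is forced after any step. None of these names come from an actual noise-injection tool.

```python
import random


def run_with_noise(noise_probability, rng):
    """Run two read-then-write increments on a simulated scheduler; return the counter."""
    shared = 0
    pending = {"A": ["read", "write"], "B": ["read", "write"]}
    local = {}
    current = "A"
    while pending["A"] or pending["B"]:
        if not pending[current]:
            # The current thread has finished, so the other thread runs.
            current = "B" if current == "A" else "A"
        step = pending[current].pop(0)
        if step == "read":
            local[current] = shared
        else:
            shared = local[current] + 1
        # Noise: with the given probability, force a context switch after this step.
        other = "B" if current == "A" else "A"
        if pending[other] and rng.random() < noise_probability:
            current = other
    return shared


def count_failures(noise_probability, trials, seed):
    """Count the runs in which an update is lost (final counter differs from 2)."""
    rng = random.Random(seed)
    return sum(run_with_noise(noise_probability, rng) != 2 for _ in range(trials))


print(count_failures(0.0, 1000, seed=0))  # no noise: each thread runs to completion
print(count_failures(0.1, 1000, seed=0))  # light noise: the lost update appears only occasionally
```

In this sketch the lost update is exposed only when a switch happens to be forced immediately after the first read, so its detection rate is bounded by the noise probability. The noise is applied uniformly to every step and is not aimed at the particular step that triggers the bug.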
The most commonly used method to maximize the exhaustiveness of testing is coverage estimation. There is extensive research in simulation-based verification on coverage metrics, which provide a measure for determining the exhaustiveness of a test. Coverage metrics are used in order to monitor progress of the verification process, to estimate whether more input sequences are needed, and to determine whether simulation should be directed towards unexplored areas of the program.
In related art systems, the metrics measure the part of the design that has been activated by the input sequences. For example, code-based coverage metrics measure the number of code lines executed during the simulation. A variety of different metrics are used in coverage estimation, and they play an important role in the design validation effort. No useful method exists in the prior art that is directed to replaying or recreating a program execution that manifested a possibly erroneous pattern in the code. Thus, rare bugs can remain undetected even after testing has reached a high degree of coverage.
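A code-based coverage metric of the kind mentioned above can be sketched, for illustration only, using the standard `sys.settrace` hook of the Python interpreter. The function under test, `classify`, and the helper `lines_executed` are hypothetical and serve only to show how the set of executed lines grows as input values are added.

```python
import sys


def classify(x):
    if x < 0:
        return "negative"
    if x == 0:
        return "zero"
    return "positive"


def lines_executed(func, *args):
    """Return the line offsets (relative to the def line) executed by func(*args)."""
    code = func.__code__
    hit = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace frames other than the function under test
        if event == "line":
            hit.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # restore whatever tracer was installed before
    return hit


# Accumulate the covered lines over a growing set of input values.
covered = set()
for value in (5, -1, 0):
    covered |= lines_executed(classify, value)
    print(value, sorted(covered))
```

A metric of this kind records only which lines were activated. It does not record the order in which concurrent processes were interleaved, which is consistent with the observation above that a high degree of coverage does not by itself reveal rare bugs.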
Accordingly, testing and verification methods and systems are needed that can overcome the aforementioned shortcomings by directing code testing toward the replay of program executions in a manner that enhances identification of bugs, especially those of the rare event type.