Sequential software, which runs on a single platform in a single thread, executes in a deterministic order. In other words, given the same input, the sequence of statement execution is fixed and unvarying in repeated executions of the software.
In parallel software applications, on the other hand, the order of execution of program statements may vary from one run to the next. In the context of the present patent application and in the claims, parallel software includes any sort of multi-threaded, concurrent, or distributed software, and may run on a single processor or on multiple processors. In parallel software, the sequence of statement execution is dependent, inter alia, on scheduler decisions, order of message arrival, synchronization mechanisms, and relative speed of hardware involved. Whereas in sequential software, the program output is uniquely determined by the inputs selected, in the case of parallel software, the outputs may depend not only on the input space of the program, but also on the order in which different tasks are performed. The set of information that describes a sequence in which a parallel program executes in a given execution run is called an interleaving. Faults in a parallel application, such as race conditions, may manifest themselves in one interleaving but not in others, making the task of debugging the application all the more difficult.
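By way of illustration only, the dependence of the output on the interleaving may be sketched as follows. The sketch assumes two logical threads, named A and B here, each of which performs a non-atomic increment (a read followed by a write) on a shared counter. The class name `InterleavingDemo` and the string encoding of an interleaving are hypothetical and are introduced only for this example. The interleaving is supplied explicitly rather than chosen by a scheduler, so that each result is reproducible.

```java
// Hypothetical illustration: each of two logical threads performs
// "read shared counter, then write back read value + 1".
// The interleaving is a string of thread ids giving the order of steps.
final class InterleavingDemo {

    /** Executes the steps in the given order and returns the final counter value. */
    static int run(String interleaving) {
        int shared = 0;             // the shared counter
        int[] local = new int[2];   // value each thread read (its private copy)
        int[] step = new int[2];    // next step per thread: 0 = read, 1 = write, 2 = done

        for (char c : interleaving.toCharArray()) {
            int t = c - 'A';
            if (t < 0 || t > 1 || step[t] > 1) {
                throw new IllegalArgumentException("invalid interleaving: " + interleaving);
            }
            if (step[t] == 0) {
                local[t] = shared;      // read step
            } else {
                shared = local[t] + 1;  // write step
            }
            step[t]++;
        }
        return shared;
    }

    public static void main(String[] args) {
        // Same input program, different interleavings.
        for (String s : new String[] {"AABB", "ABAB", "ABBA", "BBAA"}) {
            System.out.println(s + " -> " + run(s));
        }
    }
}
```

In this sketch, the interleavings "AABB" and "BBAA" yield a final value of 2, whereas "ABAB" and "ABBA" yield 1, because both threads read the counter before either writes it. The fault (a lost update) is thus manifested in some interleavings but not in others.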
In response to this difficulty, testing tools for parallel applications have been developed that are based on adding “noise” to the application. The noise changes the timing of the application, in an attempt to expose timing bugs that arise when an implementation does not consider a specific interleaving in which a fault is manifested. For example, Edelstein et al. describe a tool of this sort, known as “ConTest,” for detecting synchronization faults, in an article entitled “Multithreaded Java Program Test Generation,” IBM Systems Journal 41:1 (2002), pages 111-125. A Java™ application program under test is seeded with a sleep( ), yield( ) or priority( ) primitive at shared memory accesses and synchronization events. At run time, ConTest makes random or coverage-based decisions as to whether the seeded primitive is to be executed. The probability of finding concurrent faults is thus increased. A replay algorithm facilitates debugging by saving the order of shared memory accesses and synchronization events. A suitable replay algorithm for this purpose is described by Choi et al., in “Deterministic Replay of Java Multithreaded Applications,” Proceedings of the SIGMETRICS Symposium on Parallel and Distributed Tools (ACM, New York 1998), pages 48-59.
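The general idea of seeding noise at a shared memory access may be sketched as follows. This is a minimal sketch under stated assumptions and is not the ConTest implementation or its API: the classes `NoiseMaker`, `NoisyCounter`, and `NoiseDemo`, the fixed injection probability, and the one-millisecond sleep are all hypothetical. Only a random decision is shown; a coverage-based decision is not modeled.

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical driver: two threads increment an unsynchronized counter
// that has a noise point seeded between its read and its write.
final class NoiseDemo {
    public static void main(String[] args) throws InterruptedException {
        NoiseMaker noise = new NoiseMaker(42L, 0.2);
        NoisyCounter counter = new NoisyCounter(noise);
        Runnable work = () -> {
            for (int i = 0; i < 200; i++) {
                counter.increment();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();
        // The final value depends on the interleaving and may be below 400.
        System.out.println("final value: " + counter.get()
                + ", noise injected " + noise.injectedCount() + " times");
    }
}

// Hypothetical noise maker (not ConTest's API): decides at random
// whether to perturb the timing of the calling thread.
final class NoiseMaker {
    private final Random random;          // java.util.Random is thread-safe
    private final double probability;     // chance of injecting noise per call
    private final AtomicInteger injected = new AtomicInteger();

    NoiseMaker(long seed, double probability) {
        this.random = new Random(seed);
        this.probability = probability;
    }

    /** Called at a seeded point; returns true if noise was injected. */
    boolean maybeInject() {
        if (random.nextDouble() >= probability) {
            return false;                 // random decision: skip the primitive
        }
        injected.incrementAndGet();
        if (random.nextBoolean()) {
            Thread.yield();               // hint to the scheduler to switch threads
        } else {
            try {
                Thread.sleep(1);          // assumed delay of 1 ms
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return true;
    }

    int injectedCount() {
        return injected.get();
    }
}

// Shared counter with a deliberately non-atomic increment.
final class NoisyCounter {
    private int value;                    // shared and unsynchronized on purpose
    private final NoiseMaker noise;

    NoisyCounter(NoiseMaker noise) {
        this.noise = noise;
    }

    void increment() {
        int read = value;                 // shared memory read
        noise.maybeInject();              // seeded noise point
        value = read + 1;                 // shared memory write
    }

    int get() {
        return value;
    }
}
```

Placing the noise point between the read and the write widens the window in which another thread can intervene, so that a lost update that would rarely appear in an unperturbed run becomes more likely to be observed. The printed value is not fixed, since it depends on the interleaving that actually occurs.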