1. Field of the Invention
This invention relates to testing computer software. More particularly, this invention relates to testing software that runs concurrently as multiple processes or threads, or on distributed processors.
2. Description of the Related Art
The main problem in testing a concurrent computer program, which executes as a plurality of threads or operates on a plurality of distributed platforms, is nondeterminism: two executions of such a program may yield different results. Most of the work in the field of concurrent testing has been focused on detecting race conditions. However, race conditions have a low probability of manifesting themselves, and even when they do, they do not always indicate a fault. In any case, identifying race conditions is insufficient: a program without races may still contain concurrent bugs, e.g., bugs due to incorrect usage of message-based synchronization.
Another approach to testing software is disclosed in the documents O. Edelstein, E. Farchi, Y. Nir, G. Ratsaby, and S. Ur, Multithreaded Java Program Test Generation, IBM Systems Journal, 41(1):111-125, 2002, and S. D. Stoller, Model-checking Multi-threaded Distributed Java Programs, in Proceedings of the 7th International SPIN Workshop on Model Checking of Software, pages 224-244, New York, 2000, Springer Verlag, which are herein incorporated by reference. The problem of generating different interleavings for the purpose of revealing concurrent faults was approached by seeding the program with conditional sleep statements at shared memory access and synchronization events. At run time, random, biased random, or coverage-based decisions were made as to whether to execute the seeded primitives. However, neither race detection nor the seeding approach helps detect bugs related to multi-layer memory models if the tests are executed on one-layer memory implementations.
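By way of illustration only, the seeding approach described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation disclosed in the cited documents: the class names SeededNoiseSketch and Noise, the method names maybeSleep and runOnce, and all parameters are hypothetical and chosen for this example. A conditional sleep is placed between a shared memory read and the corresponding write, and a random decision is taken at run time as to whether to execute it.

```java
import java.util.Random;

// Hypothetical sketch of seeding a program with conditional sleep statements.
class SeededNoiseSketch {

    // Hypothetical noise maker: decides at run time whether to execute a seeded sleep.
    static final class Noise {
        private final Random rng;          // java.util.Random is safe for use by several threads
        private final double probability;  // chance that a seeded sleep is executed
        private final int maxSleepMillis;  // upper bound on the injected delay

        Noise(long seed, double probability, int maxSleepMillis) {
            this.rng = new Random(seed);
            this.probability = probability;
            this.maxSleepMillis = maxSleepMillis;
        }

        // Called at an instrumented point; returns true if a delay was injected.
        boolean maybeSleep() {
            if (rng.nextDouble() >= probability) {
                return false;
            }
            try {
                Thread.sleep(rng.nextInt(maxSleepMillis + 1));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return true;
        }
    }

    // Runs unsynchronized increments of a shared counter and returns its final value.
    static int runOnce(Noise noise, int threads, int incrementsPerThread) {
        int[] shared = {0}; // deliberately unsynchronized shared state
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int k = 0; k < incrementsPerThread; k++) {
                    int read = shared[0];   // shared memory read
                    noise.maybeSleep();     // seeded primitive between read and write
                    shared[0] = read + 1;   // shared memory write; updates may be lost
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared[0];
    }

    public static void main(String[] args) {
        // Different seeds tend to produce different interleavings and final values.
        for (long seed = 1; seed <= 3; seed++) {
            int result = runOnce(new Noise(seed, 0.5, 2), 2, 20);
            System.out.println("seed " + seed + ": final counter = " + result + " (40 if no update is lost)");
        }
    }
}
```

In this sketch the decision is purely random; a biased random or coverage-based decision would replace the probability test in maybeSleep.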
Furthermore, the space of possible temporal orders in which a runtime environment may schedule instruction executions by different threads, known as interleavings, grows exponentially with the program size. In the typical testing environment, little coverage of the space of possible interleavings is achieved. The term “coverage” concerns checking and showing that testing has been thorough. Coverage is any metric of completeness with respect to a test selection criterion for the program-under-test. The definition of coverage events is usually specific to a program-under-test, and constitutes a major part in the definition of the testing plan of the program.
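The growth of the interleaving space can be made concrete with a small calculation. The sketch below assumes a simplified setting that is not taken from the source: each thread is a straight-line sequence of atomic instructions, and no synchronization restricts the schedule. Under that assumption, the number of interleavings of threads having n1, n2, ..., nk instructions is the multinomial coefficient (n1 + n2 + ... + nk)! / (n1! n2! ... nk!). The class and method names are hypothetical.

```java
import java.math.BigInteger;

// Hypothetical sketch: counts interleavings of straight-line threads
// under the simplifying assumption that any schedule is allowed.
class InterleavingCount {

    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    // lengths[i] is the number of atomic instructions executed by thread i.
    static BigInteger interleavings(int... lengths) {
        int total = 0;
        BigInteger denominator = BigInteger.ONE;
        for (int length : lengths) {
            total += length;
            denominator = denominator.multiply(factorial(length));
        }
        return factorial(total).divide(denominator);
    }

    public static void main(String[] args) {
        System.out.println(interleavings(2, 2));    // prints 6
        System.out.println(interleavings(10, 10));  // prints 184756
        System.out.println(interleavings(2, 2, 2)); // prints 90
    }
}
```

Even two threads of ten instructions each admit 184756 schedules under this assumption, which suggests why a typical test run covers only a small fraction of the space.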
The problem of testing multi-threaded programs is further compounded by the fact that tests that reveal a concurrent fault in the field or during stress testing are usually long and run under variable environmental conditions. For example, on a given machine, tasks launched asynchronously by the operating system may alter the machine's environment sufficiently to affect the results of two different executions of the same multi-threaded program. As a result, such tests are not necessarily repeatable. When a fault is detected, much effort must be invested in recreating the conditions under which it occurred.
In particular, the semantics of different versions of the Java™ two-layer memory model are a constant source of programmer misunderstandings and concurrent bugs. The Java memory model is described in the document, The Java Language Specification, James Gosling, Bill Joy, Guy Steele, Addison Wesley, 1996, and more recently in the document, JSR-133: Java Memory Model and Thread Specification, available on the Internet.
The memory model addresses the issue of heap synchronization. For various reasons, such as promoting efficient usage of multiprocessor machines, Java defines a two-layer model of the heap. Each thread operates on its own version of the heap, which in turn communicates with a global upper heap layer. The memory model defines the rules for this communication: when a thread executes certain operations, the executing environment must write the global heap onto the local one or vice versa. Another issue addressed by the memory model is instruction reordering. Many compiler optimizations depend on the ability of the compiler to reorder or duplicate instructions, issue prefetching requests, etc. However, a seemingly innocuous permutation at the thread level may change the program behavior because of interaction with other threads. Again, it is the responsibility of the memory model to define which permutations are allowed and which are not. The standard proposed in the above-noted JSR-133 document specifies heap synchronization and instruction reordering rules that differ from those of its predecessor. Thus, programs that worked correctly under the old Java memory model may malfunction when run under JSR-133.
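The kind of rule discussed above can be illustrated with a small sketch. It assumes the JSR-133 semantics of volatile fields; the class VisibilitySketch and the fields stop and payload are hypothetical names introduced for this example only. Under JSR-133, the write to the plain field payload happens-before the volatile write to stop, so a thread that observes stop as true is guaranteed to observe the value 42. If the volatile modifier were removed, the memory model would permit the reader thread never to observe the write to stop, or to observe a stale payload.

```java
// Hypothetical sketch of heap synchronization through a volatile field (JSR-133 semantics assumed).
class VisibilitySketch {

    // The volatile write and read act as the synchronization points between the threads.
    private static volatile boolean stop = false;

    // Plain field; its value is published by the later volatile write to 'stop'.
    private static int payload = 0;

    // Starts a reader thread, publishes the payload, and returns what the reader observed.
    static int runOnce() {
        stop = false;
        payload = 0;
        int[] seen = {-1};

        Thread reader = new Thread(() -> {
            while (!stop) {
                // spin until the volatile write becomes visible
            }
            seen[0] = payload; // guaranteed to read 42 once stop is observed as true
        });
        reader.start();

        payload = 42; // ordinary write, ordered before the volatile write below
        stop = true;  // volatile write publishes the preceding write

        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 42
    }
}
```

A test that exercises only the volatile version, or that runs on an implementation that happens to keep both layers of the heap consistent, would not reveal the fault in the non-volatile variant.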
It is anticipated that the problems outlined above will become even more acute as new computer chips are equipped with two or more processors, and runtime implementations make use of the two-layer memory model.