1. Technical Field
The present invention relates to computer program analysis tools and more particularly to a system and method for more efficiently analyzing computer programs by property driven pruning to reduce program checking overhead.
2. Description of the Related Art
Concurrent programs are notoriously hard to debug because they often include a large number of possible interleavings of thread executions. Concurrency bugs often arise in rare situations that are hard to anticipate and hard to handle with standard testing techniques. One representative type of bug in concurrent programs is a data race, which happens when multiple threads access a shared data variable simultaneously and at least one of the accesses is a write.
To completely verify a multi-threaded program for a given test input, one has to inspect all possible thread interleavings under that input. For deterministic programs, the only source of non-determinism is the thread scheduler of the operating system. In conventional testing environments, the user does not have full control over the scheduling of threads, so running the same test multiple times does not necessarily translate into better coverage. Static analysis has been used for detecting data races in multi-threaded programs, both for a given test input and for all possible inputs. However, a race condition reported by static analysis may be bogus; even if it is real, there is often little information to help the user reproduce the failure.
Model checking has the advantage of exhaustive coverage, which means that all possible thread interleavings will be explored. However, model checkers require building finite-state or pushdown automata models of the software to be analyzed, and they often do not perform well in the presence of heap-allocated data structures.
Dynamic search can systematically explore the state space without explicitly storing the intermediate states. These techniques are often directly applied to handle software programs written in full-fledged programming languages such as C. For detecting data races, these methods are sound (no bogus races) due to their concrete execution of the program itself (as opposed to a model). Given a test input, an algorithm systematically executes the program under controlled thread schedules, to obtain concrete execution traces. If a data race is encountered during an execution, it is a real race condition (although it may be benign). Otherwise, the algorithm backtracks in the depth-first search order to a previous context switch and generates another execution trace. If all feasible thread interleavings have been explored without encountering any race condition, the program is race-free for the given input.
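The search loop described above can be illustrated with a minimal sketch. It assumes a simplified model that is not part of the original description: each thread is a list of `(operation, target)` steps, where the operation is one of `"acq"`, `"rel"`, `"read"`, or `"write"`. A real stateless checker would re-execute the actual program under a controlled scheduler; the function name `find_race` and the step encoding are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical step encoding: (operation, target), with operation in
# {"acq", "rel", "read", "write"}; target is a lock or variable name.
Step = Tuple[str, str]


def _conflict(a: Step, b: Step) -> bool:
    """Two steps conflict if both access the same variable and one writes."""
    accesses = ("read", "write")
    return (a[0] in accesses and b[0] in accesses
            and a[1] == b[1] and "write" in (a[0], b[0]))


def find_race(threads: List[List[Step]]) -> Optional[List[int]]:
    """Depth-first search over thread schedules without a visited-state set.

    Returns the schedule prefix (a list of thread ids) after which two
    threads are simultaneously about to perform conflicting accesses,
    or None if no such point exists in any explored interleaving.
    """
    n = len(threads)

    def dfs(pcs: Tuple[int, ...], held: Dict[str, int],
            prefix: List[int]) -> Optional[List[int]]:
        # Next pending step of every thread that has not finished.
        pending = {t: threads[t][pcs[t]]
                   for t in range(n) if pcs[t] < len(threads[t])}
        ids = sorted(pending)
        # Race check: two pending steps conflict at this point.
        for i in ids:
            for j in ids:
                if i < j and _conflict(pending[i], pending[j]):
                    return list(prefix)
        # Otherwise try each enabled thread in turn; returning from the
        # recursive call corresponds to backtracking to this context switch.
        for t in ids:
            op, target = pending[t]
            if op == "acq" and target in held:
                continue  # lock is taken, so this thread is blocked here
            new_held = dict(held)
            if op == "acq":
                new_held[target] = t
            elif op == "rel":
                new_held.pop(target, None)
            new_pcs = pcs[:t] + (pcs[t] + 1,) + pcs[t + 1:]
            found = dfs(new_pcs, new_held, prefix + [t])
            if found is not None:
                return found
        return None

    return dfs(tuple(0 for _ in threads), {}, [])
```

In this sketch, a result of `None` corresponds to the case where all feasible interleavings have been explored without encountering a race for the given input. No reduction is applied, so every interleaving is enumerated.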
Although stateless dynamic model checking is sound (in that it does not report bogus bugs), its pruning of the search space can be inefficient due to the lack of property-specific backtracking. Note that the number of thread interleavings of a concurrent program can be astronomically large. Partial order reduction can help remove redundant interleavings from the same equivalence class, provided that a representative interleaving has been explored, but it is not target-driven pruning. Without a conservative approximation or warranty typical of static analysis, stateless model checking needs to enumerate the entire set of equivalence classes. However, as far as race detection is concerned, many equivalence classes themselves may be redundant and therefore should be pruned away.
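To give a sense of how quickly the number of interleavings grows, the following sketch counts them under a simplifying assumption that is not in the original text: every thread executes a fixed number of steps and synchronization never blocks any ordering. Under that assumption the count is the multinomial coefficient (n1 + ... + nk)! / (n1! ... nk!). The function name `count_interleavings` is hypothetical.

```python
from math import factorial
from typing import Sequence


def count_interleavings(steps_per_thread: Sequence[int]) -> int:
    """Number of ways to interleave threads with the given step counts.

    Assumes every ordering is feasible (no blocking), so this is an upper
    bound for programs that use locks.
    """
    total = factorial(sum(steps_per_thread))
    for n in steps_per_thread:
        total //= factorial(n)  # exact: partial quotients stay integral
    return total


if __name__ == "__main__":
    print(count_interleavings([2, 2]))    # 6
    print(count_interleavings([10, 10]))  # 184756
```

Two threads of only ten steps each already admit 184,756 orderings under this assumption, which is why pruning matters even for small programs.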
FIG. 1 shows a motivating example. There are two concurrently running threads T1 and T2, both of which access the global variables x, y and z. All accesses to global variables are protected by the two locks f1 and f2. The data race occurs when the two threads access variable y simultaneously at lines a6 and b10. Since x=y=0 initially, the race condition may occur only when line b4 is executed before any transitions of T1. If (c=0) holds, a6 and b10 may be simultaneously reachable. In a stateless model checking run, the first execution trace, which is often arbitrarily chosen, may be a1, . . . ,a11, b1, . . . ,b11. Since a10 and b3 have a read-write conflict on variable a, according to dynamic partial order reduction (DPOR) algorithms, a backtracking point is added immediately before line a9 to make sure that the alternative trace a1, . . . ,a8,b1,b2,b3, . . . is explored in a future search. This is reasonable, and holds for any partial order reduction method, since the above two executions are not in the same equivalence class (in the sense of Mazurkiewicz's trace theory, as is known).
However, it is clear that backtracking to a9 and then searching the subspace is futile, since line a6 can never be reached simultaneously with b10 in this set of alternative execution traces. This can be revealed by a simple lockset analysis, together with the fact that all the other shared variable accesses are protected by some common locks. In such cases, property-specific pruning of the search space would ideally skip a9 and backtrack directly to a2.
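The lockset analysis mentioned above can be sketched as follows. This is only an illustration under stated assumptions: threads are modeled as lists of `(operation, target)` steps with operations `"acq"`, `"rel"`, `"read"`, and `"write"`, and the two example threads are hypothetical and do not reproduce the program of FIG. 1. The function names are likewise hypothetical.

```python
from typing import FrozenSet, List, Tuple

Step = Tuple[str, str]  # hypothetical encoding: (operation, target)
Access = Tuple[int, str, str, FrozenSet[str]]  # (index, op, variable, lockset)


def accesses_with_locksets(thread: List[Step]) -> List[Access]:
    """Record, for each read or write, the set of locks held at that point."""
    held = set()
    result = []
    for index, (op, target) in enumerate(thread):
        if op == "acq":
            held.add(target)
        elif op == "rel":
            held.discard(target)
        else:  # "read" or "write" on a shared variable
            result.append((index, op, target, frozenset(held)))
    return result


def potential_races(t1: List[Step], t2: List[Step]) -> List[Tuple[int, int, str]]:
    """Pairs of accesses to the same variable, at least one a write,
    whose locksets share no common lock."""
    races = []
    for i, op1, var1, locks1 in accesses_with_locksets(t1):
        for j, op2, var2, locks2 in accesses_with_locksets(t2):
            if var1 == var2 and "write" in (op1, op2) and not (locks1 & locks2):
                races.append((i, j, var1))
    return races


if __name__ == "__main__":
    # Hypothetical threads: x is always protected by f1, y is not protected.
    t1 = [("acq", "f1"), ("write", "x"), ("rel", "f1"), ("write", "y")]
    t2 = [("acq", "f1"), ("read", "x"), ("rel", "f1"), ("read", "y")]
    print(potential_races(t1, t2))  # [(3, 3, 'y')]
```

Because two accesses guarded by a common lock cannot occur simultaneously, an empty result for the remaining portion of the traces suggests that the corresponding search subspace cannot contain a race and is a candidate for pruning.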