1. Technical Field
The invention generally relates to the field of automated software testing and in particular to parallel symbolic execution of a target program using a cluster of commodity hardware.
2. Background Information
Symbolic execution is an automated technique for testing software programs. Instead of executing a target program with regular concrete inputs (e.g., x=5), symbolic execution executes a target program with “symbolic” inputs that can take on all values allowed by the type (e.g., x=λ, where λ∈N, and N is the set of all numbers). Whenever a conditional branch is encountered that involves a predicate π that depends (directly or indirectly) on x, state and execution are forked into two alternatives: one following the then-branch (π) and another following the else-branch (¬π). The two executions can now be pursued independently. This approach is efficient because it analyzes code for entire classes of inputs rather than specific (“concrete”) inputs. When a failure point (e.g., a “bug”) is found, a test generator can compute concrete values for target program inputs that take the program to the bug location. A “symbolic test” specifies families of inputs and environment behaviors for which to test a target program. By encompassing entire families of behaviors, symbolic tests cover substantially more test cases than “concrete” regular tests. Also, a symbolic test enables environment conditions to be reproduced which otherwise would have been very difficult or impossible to set up with regular test cases.
In order to test a target program using symbolic execution, a symbolic execution engine (SEE) executes the target program with unconstrained symbolic inputs. When an execution branch involves symbolic values, execution forks into two separate executions, each with a corresponding clone of the program state. For a branch condition such as λ<MAX, the symbolic values in one clone are constrained so that the condition evaluates to true (λ<MAX), and the symbolic values in the other clone are constrained so that the condition evaluates to false (λ≥MAX). Execution recursively splits into sub-executions at each subsequent branch, turning an otherwise linear execution into an execution tree. FIG. 13A is a listing of pseudocode illustrating an example of a target program. FIG. 13B is a symbolic execution tree corresponding to the listing of pseudocode in FIG. 13A.
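By way of illustration only, the forking behavior described above can be sketched in Python as follows. This sketch is not the pseudocode of FIG. 13A. The names State, fork, explore, and BRANCHES are hypothetical, the branch conditions are carried as plain text labels rather than solver expressions, and the sketch assumes for simplicity that every branch is reached on every path.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class State:
    """One node of the execution tree: the constraints collected so far."""
    path_constraints: Tuple[str, ...] = ()


def fork(state: State, label: str, negated_label: str) -> Tuple[State, State]:
    """Clone the state twice: one clone assumes the branch condition
    holds, the other assumes that it does not."""
    then_state = State(state.path_constraints + (label,))
    else_state = State(state.path_constraints + (negated_label,))
    return then_state, else_state


def explore(branches: List[Tuple[str, str]], state: State = State()) -> List[State]:
    """Recursively fork at every symbolic branch and return the leaf states."""
    if not branches:
        return [state]
    (label, negated_label), rest = branches[0], branches[1:]
    then_state, else_state = fork(state, label, negated_label)
    return explore(rest, then_state) + explore(rest, else_state)


# Hypothetical branch conditions, each paired with its negation.
BRANCHES = [("x < MAX", "x >= MAX"), ("x > 0", "x <= 0")]

leaves = explore(BRANCHES)
for leaf in leaves:
    print(" AND ".join(leaf.path_constraints))
```

Each printed line corresponds to one leaf of the execution tree, that is, one path through the two branches together with the constraints that the symbolic input must satisfy to follow that path.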
In this way, all execution paths in the target program are explored. To ensure that only feasible paths are explored, the SEE uses a constraint solver to check the satisfiability of each branch's predicate, and the SEE follows only satisfiable branches. If a bug is encountered (e.g., a crash or a hang) along one of the paths, then the solution to the constraints accumulated along that path yields the inputs that take the target program to the bug. These inputs constitute a test case.
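As a minimal sketch of the feasibility check and test-case generation described above, the following Python example substitutes a brute-force search over a small finite domain for a real constraint solver. The function solve, the toy program target, the bound MAX, and the range DOMAIN are all assumptions introduced for illustration; a practical SEE would instead query a dedicated constraint solver.

```python
from typing import Callable, Optional, Sequence

MAX = 10                  # hypothetical bound used by the toy target program
DOMAIN = range(-20, 21)   # small finite domain so that brute force terminates

Constraint = Callable[[int], bool]


def solve(constraints: Sequence[Constraint]) -> Optional[int]:
    """Toy stand-in for a constraint solver: return a value that satisfies
    every constraint, or None if no value in DOMAIN does."""
    for value in DOMAIN:
        if all(check(value) for check in constraints):
            return value
    return None


def target(x: int) -> str:
    """Toy target program with a bug on exactly one path."""
    if x < MAX:
        return "ok"
    if x > MAX + 5:
        raise IndexError("index out of bounds")  # the bug
    return "clamped"


# Constraints accumulated along the path that reaches the bug:
# the first branch is not taken, the second branch is taken.
bug_path = [lambda x: not (x < MAX), lambda x: x > MAX + 5]

# A path whose constraints contradict each other; it is never followed.
infeasible_path = [lambda x: x < MAX, lambda x: x > MAX + 5]

test_input = solve(bug_path)           # concrete input that reaches the bug
print("test case:", test_input)
print("infeasible:", solve(infeasible_path))
```

Calling target(test_input) raises the IndexError, which shows that the solution to the accumulated path constraints serves as a concrete test case for the bug, while the contradictory path yields no solution and is therefore pruned.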
One of the challenges faced by symbolic testing is scalability. The phenomenon of “path explosion” refers to the fact that the number of paths through a program is roughly exponential in program size. Since the size of an execution tree is exponential in the number of branches, and the complexity of constraints increases as the tree deepens, state-of-the-art SEEs can quickly bottleneck on limited computing resources (e.g., central processing unit (CPU) cycles and memory), even for target programs that have only a few thousand lines of code (i.e., a few KLOC). Path explosion severely limits the extent to which large software programs can be thoroughly tested. One must be content with either a low percentage of code coverage for large programs or using symbolic execution tools with only small programs.
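The growth described above can be made concrete with a short calculation. The following Python sketch assumes, purely for illustration, that every branch is two-way, depends on symbolic input, is feasible in both directions, and is reached on every path; under those assumptions the number of paths is 2 raised to the number of branches. Real programs may have fewer feasible paths, and loops may add more. The function name max_paths is hypothetical.

```python
def max_paths(num_branches: int) -> int:
    """Upper bound on the number of paths when each of num_branches
    two-way symbolic branches is reached on every path."""
    return 2 ** num_branches


for n in (10, 20, 30, 40):
    print(f"{n} branches -> up to {max_paths(n):,} paths")
```

Under these assumptions, 30 such branches already yield more than one billion paths, which illustrates why a single machine's CPU cycles and memory are quickly exhausted.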