Symbolic execution (SE) is a popular program verification technique, used, e.g., in the Java PathFinder tool, for bug-finding and for providing program proofs, and it forms the basis of systematic dynamic testing techniques. SE initializes all program inputs to unknown symbolic values and then propagates these values through the nodes of the program control flow graph, exploring the program to check whether an error location is reachable.
In SE, the input program variables are assigned unknown symbolic values, which are then propagated along paths of the program control flow graph, e.g., in depth-first search (DFS) order, with the help of decision procedures. Compared to other verification techniques, SE combines the power of explicit and symbolic techniques in a unique manner: it can explore large program depths (which may be a bottleneck for symbolic techniques like bounded model checking), while symbolically exploring all possible inputs simultaneously (as opposed to explicit-state techniques, which enumerate the inputs). The symbolic state is represented as a pair (C, σ), where C is the path condition denoting the conjunction of all guards occurring in the current path, and σ is a mapping from program variables to their symbolic values (terms). An SE engine relies on two main components: a term substitution computation for evaluating program expressions and updating symbolic values due to assignments, and a decision procedure, e.g., an SMT solver, to check whether the current symbolic values can be propagated into a conditional statement.
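The interplay of the symbolic state (C, σ), term substitution, and the decision procedure can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tuple encoding of terms and statements, the names `subst`, `evaluate`, `satisfiable`, and `explore`, and in particular the brute-force search over a small integer range, which merely stands in for an SMT solver and is not a sound decision procedure in general.

```python
from itertools import product

# Hypothetical term encoding: ints, strings (variables/symbols), or (op, args...).
OPS = {
    "+": lambda a, b: a + b,
    "<": lambda a, b: a < b,
    ">": lambda a, b: a > b,
    "==": lambda a, b: a == b,
    "not": lambda a: not a,
}

def subst(term, sigma):
    """Term substitution: replace program variables by their symbolic values."""
    if isinstance(term, str):
        return sigma.get(term, term)
    if isinstance(term, tuple):
        return (term[0],) + tuple(subst(t, sigma) for t in term[1:])
    return term

def evaluate(term, env):
    """Evaluate a term under a concrete assignment to the input symbols."""
    if isinstance(term, str):
        return env[term]
    if isinstance(term, tuple):
        return OPS[term[0]](*(evaluate(t, env) for t in term[1:]))
    return term

def satisfiable(path_condition, symbols, domain=range(-8, 9)):
    """Stand-in for an SMT solver: brute-force search over a small domain."""
    for values in product(domain, repeat=len(symbols)):
        env = dict(zip(symbols, values))
        if all(evaluate(g, env) for g in path_condition):
            return True
    return False

def explore(stmts, C, sigma, symbols):
    """Depth-first exploration; returns the path conditions that reach an error."""
    if not stmts:
        return []
    head, rest = stmts[0], stmts[1:]
    if head[0] == "error":
        return [C]
    if head[0] == "assign":
        _, var, expr = head
        return explore(rest, C, {**sigma, var: subst(expr, sigma)}, symbols)
    # Conditional: propagate into each branch whose extended path condition is satisfiable.
    _, cond, then_block, else_block = head
    guard = subst(cond, sigma)
    hits = []
    for g, block in ((guard, then_block), (("not", guard), else_block)):
        new_C = C + [g]
        if satisfiable(new_C, symbols):
            hits += explore(block + rest, new_C, sigma, symbols)
    return hits

# y = x + 1; if (y > 3) { if (x < 2) error; } else { if (x == 0) error; }
program = [
    ("assign", "y", ("+", "x", 1)),
    ("if", (">", "y", 3),
        [("if", ("<", "x", 2), [("error",)], [])],
        [("if", ("==", "x", 0), [("error",)], [])]),
]
hits = explore(program, [], {"x": "x0"}, ["x0"])
```

In this toy program, the first error location is pruned because its path condition (x0 + 1 > 3 and x0 < 2) has no solution, while the second is reported with the path condition (not (x0 + 1 > 3)) and x0 == 0.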
Two main bottlenecks arise when applying symbolic execution to verify large programs. First, since the algorithm enumerates all program paths iteratively, there may be an exponential number of paths to explore (known as the path explosion problem), e.g., due to a sequence of conditional statements or function calls. Second, the terms representing the symbolic values of program variables eventually blow up after several substitution operations. Moreover, symbolic execution of loops may lead to deep execution paths, which may cause further blow-up. Although modern incremental SMT solvers are able to handle such blow-up of path conditions to a certain extent by using specialized algorithms, deep program exploration reduces their performance significantly. Moreover, they do not help simplify the state representation in any way.
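Both bottlenecks can be made concrete with a small sketch. The helper names `branch_paths`, `repeated_assign`, and `term_size` are hypothetical, and the tuple encoding of terms is an assumption chosen for illustration; the example only counts paths and term nodes and does not model any particular SE engine.

```python
from itertools import product

def branch_paths(n):
    """All branch-decision sequences for n sequential, independent conditionals."""
    return product([True, False], repeat=n)

def repeated_assign(k):
    """Symbolic value of x after k assignments `x = x + x`, starting from x0."""
    x = "x0"
    for _ in range(k):
        x = ("+", x, x)  # substitution copies the previous term twice
    return x

def term_size(term):
    """Number of nodes in the term tree."""
    if not isinstance(term, tuple):
        return 1
    return 1 + sum(term_size(t) for t in term[1:])

num_paths = sum(1 for _ in branch_paths(10))   # 2**10 = 1024 paths
size = term_size(repeated_assign(10))          # 2**11 - 1 = 2047 nodes
```

Ten conditionals already yield 1024 paths, and ten substitutions yield a term with 2047 nodes when the term is stored as a tree without sharing.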
Several approaches have been proposed to perform forward symbolic execution and backward weakest precondition computation effectively. Expression renaming was proposed to avoid blow-up during weakest precondition computation by using a static single assignment (SSA) representation. Although an SSA representation assists SMT solvers to a certain extent, it does not allow semantic simplification of the symbolic values that SSA variables may assume. By preserving term structure in the symbolic state being propagated, the instant technique is able to perform term simplifications using rewriting, before flattening terms to clauses. These simplifications are hard to obtain at the clausal level inside an SMT solver, as is demonstrated by the instant experiments. Moreover, calls to the SMT solver are obviated in many cases due to rewriting. Arons et al. and Calysto reuse SAT query results by caching and structural term analysis. Simplification and caching are complementary optimizations: simplification can reduce SAT query times even when caching fails, and vice versa.
In one approach, an equational axiomatization of the theory of if-then-else (ite) terms (the set of valid ite equations) was first provided by McCarthy and later by Bloom and Tindell and others. Sethi provided a more semantic algorithm (to overcome the locality of syntactic transformations) for simplifying ite-terms with equality predicates using two basic transformations: implied tests and useless tests. Nelson and Oppen proposed a method for the simplification of formulas, involving rules for ite-simplification based on McCarthy's axiomatization. Burch and Dill used a variant of Sethi's implied test to check satisfiability of ite-terms over the logic of equality and uninterpreted functions (EUF) using a specialized case-splitting algorithm. Burch proposed an improvement in the form of a simplification algorithm for ite-terms, again based on the implied test. Similar rewrite rules have been proposed to extend BDD-like representations to the EUF logic and the logic of difference constraints, e.g., a rewrite system for normalizing decision diagrams with nodes containing equality over variables.
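The flavor of the two transformations can be sketched as follows. This is a purely syntactic illustration under stated assumptions, not a reproduction of any of the cited algorithms: conditions are treated as opaque atoms (no equality reasoning), an "implied test" is taken to be a test whose outcome is fixed by an enclosing test on the same condition, a "useless test" is one whose two branches simplify to the same term, and the function name `simplify` and the tuple encoding `("ite", c, t, e)` are hypothetical.

```python
def simplify(term, known=None):
    """Simplify ite-terms given `known`, a map from condition atoms to truth values."""
    known = known or {}
    if not isinstance(term, tuple):
        return term
    if term[0] != "ite":
        # Ordinary operator: simplify the arguments.
        return (term[0],) + tuple(simplify(t, known) for t in term[1:])
    _, c, t, e = term
    if c in known:
        # Implied test: an enclosing test already decided c.
        return simplify(t if known[c] else e, known)
    t2 = simplify(t, {**known, c: True})
    e2 = simplify(e, {**known, c: False})
    if t2 == e2:
        # Useless test: both branches agree, so the test can be dropped.
        return t2
    return ("ite", c, t2, e2)

nested = ("ite", "p", ("ite", "p", "a", "b"), ("ite", "q", "c", "c"))
result = simplify(nested)  # ("ite", "p", "a", "c")
```

Here the inner test on `p` is removed because the outer test already implies it, and the test on `q` is removed because both of its branches are `c`.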