Program slicing may determine a set of program statements (e.g., individual machine instructions) that affect the correctness of a specified statement, called the slicing criterion, within a program. A slice may consist of the statements upon which the slicing criterion is dependent. For example, a program slice with respect to machine instruction S may include a set or subset of machine instructions upon which the instruction S is dependent for correct live-in input values. The slice may or may not pertain to instructions included in a particular code region. Such a region may include, for example, a static representation of a frequently executed code segment.
Slicing has advantages. For example, in program debugging, examining a program slice may allow one to focus on highly relevant program statements rather than having to examine the entire program. As another example, in program parallelization, slicing may help identify independent dependence chains that can be executed in parallel.
Static slices and dynamic slices are two types of program slices. Dynamic slicing may consider each dynamic invocation of an instruction at runtime to be a different slicing criterion and therefore compute a dynamic slice set that is unique to that instruction at a specific point in the execution. In contrast, a static slice may include a single slice set for an instruction which represents all dependencies that the instruction could have.
Slices may be context-insensitive or context-sensitive. To perform context-insensitive slicing, a static backward slice with respect to a given instruction may be constructed as follows. A program dependence graph (PDG) is created based on static single assignment (SSA) data dependence information and the computed control dependence information. A backward graph reachability analysis may then be performed on the PDG starting at the criterion instruction. However, context-insensitive slicing may ignore function calling context (e.g., return edges, call edges, and/or fallthrough pseudo edges, all of which are described below) and may provide slice sets that are too large, often much larger than an equivalent context-sensitive slice. The large slice size may adversely affect automatic parallelization efforts.
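The construction described above may be sketched as follows. This is a minimal illustration only: the instruction names (i1 through i6), the dependence maps, and the helper names build_pdg and backward_slice are hypothetical and are not taken from any particular implementation. The sketch merges data and control dependences into a single PDG and then collects every node from which the criterion is reachable.

```python
from collections import deque

def build_pdg(data_deps, control_deps):
    """Merge data and control dependences into one map:
    node -> set of nodes that the node depends on."""
    pdg = {}
    for deps in (data_deps, control_deps):
        for node, preds in deps.items():
            pdg.setdefault(node, set()).update(preds)
            for pred in preds:
                pdg.setdefault(pred, set())
    return pdg

def backward_slice(pdg, criterion):
    """Backward reachability: every node the criterion transitively depends on,
    including the criterion itself."""
    seen = {criterion}
    work = deque([criterion])
    while work:
        node = work.popleft()
        for pred in pdg.get(node, ()):
            if pred not in seen:
                seen.add(pred)
                work.append(pred)
    return seen

# Hypothetical instruction sequence (assumed for illustration only):
#   i1: r1 = load a
#   i2: r2 = load b
#   i3: cmp r1, 0
#   i4: branch on the result of i3 (controls whether i5 executes)
#   i5: r3 = r1 + 1
#   i6: r4 = r2 * 2
data_deps = {"i3": {"i1"}, "i4": {"i3"}, "i5": {"i1"}, "i6": {"i2"}}
control_deps = {"i5": {"i4"}}

pdg = build_pdg(data_deps, control_deps)
print(sorted(backward_slice(pdg, "i5")))  # ['i1', 'i3', 'i4', 'i5']
print(sorted(backward_slice(pdg, "i6")))  # ['i2', 'i6']
```

In this sketch the two slices share no instructions, which is the kind of independent dependence chain that a parallelization effort may look for.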
Context-sensitive slicing may also be problematic. Such slicing may require computing the PDG and then annotating it to produce a system dependence graph (SDG). A modified two-phase graph reachability analysis may then be used on the SDG to compute context-sensitive slices. However, the SDG approach may be difficult to apply to binary programs, as opposed to source code, because the approach may assume that the target program is structured and that the overall program call graph and underlying parameter passing methodology are known. A structured program may be a program that has a well-defined structure, such that executable statements are located within program procedures and procedures have known call and return semantics. Binary programs and binary streams often violate this assumption of structure. For example, a binary program may have overlapping procedures that cause statements to belong to more than one procedure at a time, and may have complex control flow that is unrelated to source-level abstractions.
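A two-phase reachability analysis of this general kind may be sketched as follows. The sketch assumes a tiny, hand-built SDG in which a hypothetical procedure is called from two call sites; the node names, the edge kinds ("dep", "param_in", "param_out", "summary", "call"), and the hand-written summary edges are all assumptions made for illustration. In the first phase the traversal does not descend into callees (parameter-out edges are skipped and summary edges stand in for them); in the second phase it descends into callees but does not climb back into callers (parameter-in and call edges are skipped).

```python
def reach(sdg, starts, skip):
    """Backward reachability over typed edges, ignoring edge kinds in `skip`."""
    seen = set(starts)
    work = list(starts)
    while work:
        node = work.pop()
        for pred, kind in sdg.get(node, ()):
            if kind not in skip and pred not in seen:
                seen.add(pred)
                work.append(pred)
    return seen

def two_phase_slice(sdg, criterion):
    # Phase 1: stay in the criterion's procedure and its callers.
    phase1 = reach(sdg, {criterion}, skip={"param_out"})
    # Phase 2: descend into callees without returning to other callers.
    return reach(sdg, phase1, skip={"param_in", "call"})

# Hypothetical SDG: node -> list of (node it depends on, edge kind).
# a1/a2 define the argument, cN_in/cN_out model the two call sites,
# sq_in/sq_ret model the callee's formal parameter and return value,
# and u1/u2 use the two results.
SDG = {
    "c1_in":  [("a1", "dep")],
    "c2_in":  [("a2", "dep")],
    "sq_in":  [("c1_in", "param_in"), ("c2_in", "param_in")],
    "sq_ret": [("sq_in", "dep")],
    "c1_out": [("sq_ret", "param_out"), ("c1_in", "summary")],
    "c2_out": [("sq_ret", "param_out"), ("c2_in", "summary")],
    "u1":     [("c1_out", "dep")],
    "u2":     [("c2_out", "dep")],
}

# Context-sensitive slice for u1 excludes the second call site.
print(sorted(two_phase_slice(SDG, "u1")))
# ['a1', 'c1_in', 'c1_out', 'sq_in', 'sq_ret', 'u1']

# Plain reachability over every edge also pulls in a2 and c2_in.
print(sorted(reach(SDG, {"u1"}, skip=set())))
# ['a1', 'a2', 'c1_in', 'c1_out', 'c2_in', 'sq_in', 'sq_ret', 'u1']
```

The sketch relies on knowing which nodes are call sites, formal parameters, and return values, which is the structural information that may be unavailable for a binary program.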
FIG. 1(a) shows a sample program source code, where function “square” is called twice, in statements 2 and 5. The related PDG is shown in FIG. 1(b). With context-insensitive slicing, the slice for statement 3 includes 1, 2, 3, 4, 5, 7, 8 and the slice for statement 6 includes 1, 2, 4, 5, 6, 7, 8. However, statement 3 should not necessarily depend on 4 and 5 since 4 and 5 are executed after 3. Statement 6 should not necessarily depend on 1 and 2 since the values computed by 1 and 2 are rendered less relevant or irrelevant by 4 before statement 6 is reached. This example illustrates various shortcomings of conventional slicing techniques.
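The over-approximation may be reproduced with a short sketch. The edge set below is an assumption: it is one plausible reading of the dependences among statements 1 through 8 (with statements 7 and 8 assumed to form the body of “square”) and is not a transcription of FIG. 1(b). Plain backward reachability over these hypothetical edges yields the two slice sets stated above.

```python
# Hypothetical dependence edges: statement -> statements it depends on.
PDG = {
    1: set(),
    2: {1, 8},  # first call: argument from 1, returned value from 8
    3: {2},
    4: set(),
    5: {4, 8},  # second call: argument from 4, returned value from 8
    6: {5},
    7: {2, 5},  # callee entry is reached from both call sites
    8: {7},
}

def backward_slice(pdg, criterion):
    """Context-insensitive slice: follow every dependence edge backward."""
    seen = {criterion}
    work = [criterion]
    while work:
        node = work.pop()
        for pred in pdg[node]:
            if pred not in seen:
                seen.add(pred)
                work.append(pred)
    return seen

print(sorted(backward_slice(PDG, 3)))  # [1, 2, 3, 4, 5, 7, 8]
print(sorted(backward_slice(PDG, 6)))  # [1, 2, 4, 5, 6, 7, 8]
```

In this sketch the extra statements enter each slice through statement 7, whose incoming edges from both call sites let the traversal enter the callee from one call and leave through the other.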