Code verification and validation are the processes of determining that a software program meets all specifications and fulfills its intended purpose. Code verification addresses whether the software program achieves its goals without bugs or gaps, whereas code validation ascertains whether the software meets high-level requirements and addresses the problem to be solved. In short, code verification ensures that “you built it right”; code validation ensures that “you built the right thing”.
Static program analysis refers to analyzing computer software without actually executing the software. Static program analysis has been shown to be of great value in automating code verification tasks. Examples include functional verification tools such as Coverity™, as well as security analysis tools such as IBM Security AppScan Source Edition™ and HP Fortify 360™. One challenge faced by all tools based upon static program analysis is to achieve a proper balance between precision and scalability. Precision is achieved by building a granular albeit expensive analysis model. Scalability requires the opposite—a lightweight and less descriptive model.
A broad spectrum of parameters and models are available for determining the precision and scalability of the analysis tool. Some illustrative examples include a heap model, libraries and frameworks, virtual methods, path sensitivity, flow sensitivity, and reflection.
Heap Model: The heap model abstracts dynamic memory allocation. Because allocation can create an unbounded number of runtime objects, the analysis must apply some type of finite approximation. The question of how to model runtime objects has received extensive treatment, resulting in many different heap modeling techniques. Some techniques are computationally cheap, such as allocation-site-based abstraction, which folds all objects created at the same program point into a single abstract object. Others are computationally expensive, such as the shape analysis performed by the Three-Valued Logic Analyzer (TVLA).
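The finite approximation described above can be sketched as follows. This is a minimal, illustrative model of allocation-site abstraction, not any particular tool's implementation; the object identifiers and site labels are hypothetical.

```python
# Sketch of allocation-site abstraction: every runtime object created at the
# same program point is represented by one abstract object, so an unbounded
# number of runtime objects folds into a finite set of abstract ones.

def abstract_heap(allocations):
    """Group concrete objects by their allocation site (the abstract object)."""
    heap = {}
    for obj_id, site in allocations:
        # All objects from the same site share one abstract representative.
        heap.setdefault(site, set()).add(obj_id)
    return heap

# A loop allocating three objects at one site, plus one at another site:
allocs = [("o1", "line10"), ("o2", "line10"), ("o3", "line10"), ("o4", "line20")]
heap = abstract_heap(allocs)
# Two abstract objects now stand in for four runtime objects.
```

More expensive techniques such as TVLA's shape analysis instead track the structure reachable from each variable, at far greater cost.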
Libraries and Frameworks: Much of the code falling within a given scope of analysis is contributed by libraries and frameworks. For scalability, certain analyses model the effects of library calls conservatively via generic summaries. Other, more precise analyses dive into the implementation to derive an accurate model of the library's or framework's behavior.
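The contrast between generic summaries and precise models can be illustrated with a toy taint-tracking example. The function names and models below are hypothetical, chosen only to show the two policies side by side.

```python
# Sketch: deciding whether a library call's result is tainted, either via a
# conservative generic summary or via a precise per-function model.

# Generic summary: any tainted argument taints the result (safe but imprecise).
def generic_summary(tainted_args):
    return any(tainted_args)

# Precise models for specific (hypothetical) library functions.
PRECISE_MODELS = {
    # A predicate returning a boolean never propagates string taint.
    "str.startswith": lambda tainted_args: False,
    # Concatenation taints its result if either operand is tainted.
    "str.concat": lambda tainted_args: any(tainted_args),
}

def result_tainted(callee, tainted_args):
    # Fall back to the conservative summary for unmodeled functions.
    model = PRECISE_MODELS.get(callee, generic_summary)
    return model(tainted_args)
```

The precise model avoids a false positive on `str.startswith`, while the generic summary keeps unmodeled calls sound at the cost of precision.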
Virtual Methods: Resolving virtual method calls is, in general, an undecidable problem. The difficulty lies in determining the exact identity of the object or objects flowing into the invocation site. Doing so precisely may involve abstraction refinement, demand-driven pointer analysis, on-demand interprocedural type inference, or another nontrivial technique. Alternatively, the analysis could make a conservative decision and resolve the invocation against all possible receiver abstractions.
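The conservative approach can be sketched with class-hierarchy analysis (CHA), one standard way to resolve an invocation against all possible receivers. The tiny type hierarchy below is hypothetical.

```python
# Sketch of class-hierarchy analysis: a virtual call on a declared type may
# dispatch to that method in the declared type or any of its subtypes.

HIERARCHY = {"Shape": ["Circle", "Square"], "Circle": [], "Square": []}
METHODS = {"Shape": {"area"}, "Circle": {"area"}, "Square": {"area"}}

def subtypes(t):
    """The declared type plus all of its transitive subtypes."""
    result = [t]
    for child in HIERARCHY.get(t, []):
        result.extend(subtypes(child))
    return result

def resolve_virtual(declared_type, method):
    # Conservative: every subtype defining the method is a possible target.
    return {t for t in subtypes(declared_type) if method in METHODS.get(t, set())}
```

A precise analysis would instead try to prove which single receiver type reaches the call site, shrinking this set, often at substantial cost.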
Path Sensitivity: An important consideration is whether to explicitly model the assertions that follow from branching along conditional and looping statements. Doing so undoubtedly enhances precision, but at the same time, the state space maintained by the analysis is split at every path condition. This splitting can lead to an exponential blowup in the number of tracked states, commonly known as path explosion.
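The blowup is easy to demonstrate. The sketch below is illustrative: each branch condition splits every existing abstract state into a true variant and a false variant, so n independent branches yield 2^n path states.

```python
# Sketch of path-sensitive state splitting: the analysis keeps one abstract
# state per combination of branch outcomes along the path.

def path_sensitive_states(branch_conditions):
    states = [frozenset()]  # one initial state, no assumptions yet
    for cond in branch_conditions:
        # Each condition splits every existing state into true/false variants.
        states = [s | {(cond, taken)} for s in states for taken in (True, False)]
    return states

# Three independent branches already yield 2**3 = 8 distinct path states.
```

A path-insensitive analysis would instead merge states at join points, keeping a single state but losing the branch correlations.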
Flow Sensitivity: Another aspect is flow sensitivity, which determines whether or not the analysis accounts for the order in which memory updates are performed. Naturally, a flow-insensitive analysis is far more scalable than a flow-sensitive one.
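The difference can be shown with a pair of assignments to one variable. This is an illustrative sketch over hypothetical (variable, allocated type) statements, not a full points-to analysis.

```python
# Sketch contrasting flow-sensitive and flow-insensitive handling of two
# ordered assignments: p = new A(); p = new B();

stmts = [("p", "A"), ("p", "B")]

def flow_sensitive(statements):
    env = {}
    for var, target in statements:
        env[var] = {target}  # a later assignment kills the earlier one
    return env

def flow_insensitive(statements):
    env = {}
    for var, target in statements:
        # Order is ignored: every assignment's effect is unioned in.
        env.setdefault(var, set()).add(target)
    return env
```

The flow-sensitive result records that `p` finally points only to `B`; the flow-insensitive result conservatively keeps both `A` and `B`, which is cheaper to compute but less precise.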
Reflection: Similar to virtual calls, varying levels of precision may be applied when resolving reflective constructs, such as deciding the concrete identity of a type allocated via reflection. Precise resolution involves both backward and forward traversal of the program's control structure to derive constraints on the behavior of a reflective statement (e.g., from downstream downcasts).
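The use of a downstream downcast as a constraint can be sketched as follows. The type hierarchy and names are hypothetical; the idea is that if the result of a reflective allocation is later cast to some type, the allocated type must be compatible with that cast.

```python
# Sketch: constraining the type allocated by a reflective call using a
# downstream downcast observed along the forward control flow.

SUBTYPES = {
    "Widget": {"Widget", "Button", "Slider"},
    "Button": {"Button"},
    "Slider": {"Slider"},
}
ALL_TYPES = {"Widget", "Button", "Slider", "Logger", "Parser"}

def resolve_reflective_alloc(downcast_type=None):
    # Without constraints, any type in the program could be instantiated.
    candidates = set(ALL_TYPES)
    # A downstream downcast restricts the result to compatible subtypes.
    if downcast_type is not None:
        candidates &= SUBTYPES[downcast_type]
    return candidates
```

With no observed downcast the analysis must fall back to the conservative answer (all types), mirroring the precision/scalability tradeoff seen for virtual calls.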
The foregoing discussion indicates that the question of how to optimally balance precision and scalability is itself undecidable. It is impossible, in general, to compute a precise tradeoff between the two such that the analysis achieves the most accurate results without exhausting the computer system's time and memory. Thus, there exists a need to overcome at least one of the preceding deficiencies and limitations of the related art.