The present disclosure relates to software program verification.
In computing, formal verification and testing are often used for system verification, such as in model-based development of automotive software, to determine that a given design is defect-free. FIG. 1A is a block diagram of an example automotive feedback control system that uses lookup tables (LUTs), which may be populated using experimental data. In this example, the controller 102 can use LUTs as control code to allow for simplified changes to control behavior in different environments. The monitor 104 can use LUTs to model physical components when those components are difficult to model accurately from physical principles. This can help to attain a desired effect when checked by the monitor in block 106.
Theoretically, formal verification can address an infinite number of scenarios because there may be infinitely many possible inputs. As an example, FIG. 1B is a block diagram illustrating an example system 110 receiving inputs x1-xn and producing outputs y1-ym. In this example, the verification goal is to prove that when the inputs satisfy a certain assumption, the outputs satisfy a certain guarantee, as illustrated by the following formula: assume(x1, . . . , xn) ⇒ guarantee(y1, . . . , ym).
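The assume/guarantee formula can be illustrated for a single input vector with a minimal Python sketch. The names `check_contract`, `system`, `assume`, and `guarantee`, as well as the doubling behavior and the numeric ranges, are hypothetical and are not taken from FIG. 1B. Note that evaluating the implication at one input is a test, not a proof; a formal proof would have to establish it for every input.

```python
def check_contract(system, assume, guarantee, inputs):
    """Evaluate assume(inputs) => guarantee(system(inputs)) for ONE input vector."""
    if not assume(inputs):
        return True  # implication is vacuously true when the assumption fails
    return guarantee(system(inputs))


# Hypothetical system (an assumption for illustration): doubles each input.
def system(xs):
    return [2.0 * x for x in xs]


def assume(xs):
    # Assumption: every input lies in [0, 1].
    return all(0.0 <= x <= 1.0 for x in xs)


def guarantee(ys):
    # Guarantee: every output lies in [0, 2].
    return all(0.0 <= y <= 2.0 for y in ys)


in_range = check_contract(system, assume, guarantee, [0.25, 1.0])
out_of_range = check_contract(system, assume, guarantee, [5.0])
print(in_range, out_of_range)
```

Both calls return True: the first because the guarantee holds, the second only because the assumption is not satisfied.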
However, computing the outcomes of an infinite number, or even a huge number, of possible inputs x1-xn is not practical, which makes formal verification difficult to scale to the requirements of complex, industrially deployed systems, such as automotive software. In automotive software, a source of complexity is LUTs, of which such software often makes heavy use, and which typically have numerous entries that are used for making control decisions and/or as models of physical processes. LUTs must be proven case by case. Cascaded LUTs can increase the number of proof cases exponentially.
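The exponential growth can be sketched with a simple count. The sketch below assumes, purely for illustration, that each LUT contributes a fixed number of independent cases and that cascading LUTs multiplies those counts; the function name `count_proof_cases` and the table sizes are hypothetical.

```python
from math import prod


def count_proof_cases(cases_per_lut):
    """Number of combined proof cases, assuming the per-LUT case counts multiply."""
    return prod(cases_per_lut)


# Hypothetical sizes: each LUT splits its input range into 100 cases.
single = count_proof_cases([100])
cascade = count_proof_cases([100, 100, 100])
print(single, cascade)
```

Under this assumption, k cascaded LUTs with m cases each yield m to the power k combined cases, so three such tables already give one million cases.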
For instance, in one case study based on a real automotive software component, there were 10^50 proof cases. If each proof case could be resolved in 0.01 seconds on a cluster with one million cores, the total proof would take on the order of 10^34 years to complete, which is clearly impractical in real world scenarios. On the other hand, automotive software is often complex and safety critical, so ensuring the software is bug free and that the different components of the software work safely and correctly may be essential. The scalability challenge associated with LUTs in formal verification of automotive software is often regarded as one of the most difficult unsolved problems.
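The estimate can be reproduced with a short calculation, reading the case count as ten to the power 50 and the result as roughly ten to the power 34 years. The sketch assumes perfect parallel scaling across cores and a 365-day year; the function name `proof_time_years` is hypothetical.

```python
from fractions import Fraction

SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 seconds, ignoring leap years


def proof_time_years(num_cases, seconds_per_case, num_cores):
    """Wall-clock years to check all cases, assuming perfect parallel scaling."""
    total_seconds = Fraction(num_cases) * seconds_per_case / num_cores
    return total_seconds / SECONDS_PER_YEAR


# 10^50 cases, 0.01 s per case, one million cores.
years = proof_time_years(10**50, Fraction(1, 100), 10**6)
print(f"{float(years):.2e} years")
```

The total is 10^42 seconds, which is about 3.2 x 10^34 years.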
As a further example, FIG. 1C depicts an example simple system model 114. In this example, ideally, when the input is in the interval [0, 4], the output should never be not-a-number (NaN). Formal verification can theoretically attempt to prove mathematically that the denominator (x−3.26598) is never zero over [0, 4], or else identify that, at a certain x, y is undefined. However, since there are an infinite number of possible inputs, determining this result with perfect resolution, or even limited resolution, is impractical, particularly in real world applications, as it would require too much time.
Thus, while formal verification provides a theoretical mathematical proof for establishing whether the system behaves correctly under all possible inputs, it is computationally expensive and impractical to scale for most scenarios that involve numerous inputs as described above.
In contrast, testing shows that the system behaves correctly in a single scenario, and may be iterated for a number of scenarios to show correct behavior. However, it is resolution constrained because it is limited to a certain set of inputs, and thus unable to verify a design is 100% error free. Thus, many tests can show correct behavior in many scenarios, but only finitely many. As a result, in practical terms, there are design bugs that are very difficult to catch with testing.
On the other hand, with testing, determining which inputs break the model is also challenging because the algorithm would have to randomly determine those unknown input values. Most tests would indicate the system works correctly based on a predetermined set of inputs and possibly promote a false sense of security. So while testing may be computationally more achievable and scalable than formal verification, it is often unable to determine that a design is error free (e.g., 100% error free, within a certain number of standard deviations, etc.).
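The false sense of security can be illustrated with the denominator (x−3.26598) mentioned for FIG. 1C. The sketch below assumes a hypothetical stand-in model y = 1/(x−3.26598), since the full model of FIG. 1C is not reproduced here, and tests it on an evenly spaced grid over [0, 4]. The names `model` and `grid_test` are illustrative only.

```python
import math

SINGULARITY = 3.26598  # zero of the denominator (x - 3.26598)


def model(x):
    """Hypothetical stand-in model: y = 1 / (x - 3.26598)."""
    denom = x - SINGULARITY
    if denom == 0.0:
        # Python raises on float division by zero; return NaN to mimic
        # an undefined model output instead.
        return math.nan
    return 1.0 / denom


def grid_test(lo, hi, steps):
    """Return the grid inputs at which the model output is not finite."""
    failures = []
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        if not math.isfinite(model(x)):
            failures.append(x)
    return failures


coarse = grid_test(0.0, 4.0, 400)    # grid step 0.01
finer = grid_test(0.0, 4.0, 4000)    # grid step 0.001
print(coarse, finer)
print(model(SINGULARITY))
```

Both grids report no failures because neither contains the one input at which the output is undefined, even though that input lies inside the tested range.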
As a result, in general, efficiently and automatically verifying software that uses LUTs is beyond the capabilities of many existing techniques. For instance, one existing approach that proposes a solution for analyzing a model (embodying an air-flight controller) that contains LUTs is described by the publication “Formal Verification of ACAS X, an Industrial Airborne Collision Avoidance System,” by J. B. Jeannin, K. Ghorbal, Y. Kouskoulas, R. Gardner, A. Schmidt, E. Zawadzki, A. Platzer (“Jeannin”), in EMSOFT, 2015. Jeannin's model can produce around a trillion proof cases. In light of this, Jeannin describes using an interactive theorem prover to manually simplify the model and infer sufficient conditions for elements of the lookup table to be safe, after which the elements of the lookup table are checked on a supercomputer. However, the approach proposed by Jeannin is not adequate because it is not automatic and requires heavy intervention by a human user to decompose the proof before appropriate conditions on the LUTs can be derived. Additionally, Jeannin's approach is highly computationally expensive.
Some further solutions attempt to approximate LUTs, such as over-approximation approaches that abstract and refine an LUT using counterexample-guided abstraction refinement (CEGAR) loops. Under this approach, as shown in FIG. 2, an approximation of the points is computed that yields the smallest error possible; upper and lower bounds 206 and 208 are offset from the approximation to a certain degree; and the approach linearly interpolates points 204 and verifies whether any of the points crosses the upper or lower bounds. If so, the approach selects a set of sample points, adds them to the LUTs, and repeats the procedure until the problematic areas are fit.
However, since approximations, such as 204, often have conflicting requirements (e.g., low error rate and low complexity), the CEGAR loop-based solutions are generally unable to produce accurate functions that reliably approximate an LUT, particularly since industrial LUTs frequently have sharp corners 210 that suggest a hard transition between different regimes of operation, as shown in FIG. 2. Thus, current simple approximating functions, which are generally smooth, generally fail to adequately fit a sharp corner, and fitting one generally requires complex functions having high arithmetic complexity, such as polynomials of high degree, which are difficult to analyze efficiently.
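The difficulty with sharp corners can be seen in a minimal sketch of the bound check alone. It is not the CEGAR procedure of FIG. 2: it assumes a least-squares straight line as the simple smooth approximation, a fixed tolerance as the offset of the upper and lower bounds, and a hypothetical LUT that is flat up to x = 5 and then rises with slope 2. All names and values are illustrative.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def points_outside_band(xs, ys, slope, intercept, tol):
    """LUT points farther than tol from the fitted line (outside the bounds)."""
    return [(x, y) for x, y in zip(xs, ys)
            if abs(y - (slope * x + intercept)) > tol]


# Hypothetical LUT with a sharp corner at x = 5.
xs = [float(i) for i in range(11)]
ys = [0.0 if x <= 5.0 else 2.0 * (x - 5.0) for x in xs]

slope, intercept = fit_line(xs, ys)
violations = points_outside_band(xs, ys, slope, intercept, tol=1.0)
worst = max(zip(xs, ys), key=lambda p: abs(p[1] - (slope * p[0] + intercept)))
print(violations)
print(worst)
```

In this sketch the largest deviation from the fitted line occurs at the corner itself, so tightening the bounds forces either more sample points or a more complex approximating function.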
Further, existing solutions that rely on abstractions, such as CEGAR loop-based ones, typically search the entire LUT to determine whether a given point in an abstraction is acceptable, which is also not practicable at scale for the same reasons as those mentioned above.