The benefits derived from using sophisticated software come with the risk of utilizing vulnerable code in critical systems. Vulnerable code allows outside actors to interfere with the normal operation of current systems. To help defend against these outside actors, it is desirable to identify vulnerabilities in existing systems, including systems where the original source code is not available for study.
Tools have been developed to help programmers identify vulnerabilities in existing systems. One category of these tools performs static analysis to verify properties of software in safety-critical computer systems. For example, the medical, nuclear, and aviation industries have adopted static code analysis to address the quality of the increasingly sophisticated software employed in these fields. Static analysis does not require that the program actually be executed; it instead analyzes the source code or the object code. Dynamic analysis, on the other hand, executes the software on a real or virtual processor.
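The distinction between static and dynamic analysis can be illustrated with a minimal sketch. The sample program, the deny-list of call names, and the helper names `static_scan` and `dynamic_probe` are all hypothetical and chosen only for illustration; they are not taken from any particular analysis tool. The static check inspects the syntax tree without running the sample, while the dynamic check executes the sample and observes its behavior.

```python
import ast

# Hypothetical sample program, held as text so it can be analyzed
# without being executed.
SOURCE = """
def read_item(items, index):
    return items[index]

def run_command(text):
    return eval(text)
"""

# Assumed deny-list for this sketch; a real tool would use richer rules.
RISKY_CALLS = {"eval", "exec"}


def static_scan(source):
    """Static analysis: walk the syntax tree and flag risky calls.

    The sample code is parsed but never run.
    """
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in RISKY_CALLS:
                findings.append((node.func.id, node.lineno))
    return findings


def dynamic_probe(source, func_name, args):
    """Dynamic analysis: execute the sample and report any exception name."""
    namespace = {}
    exec(compile(source, "<sample>", "exec"), namespace)
    try:
        namespace[func_name](*args)
    except Exception as exc:
        return type(exc).__name__
    return None


if __name__ == "__main__":
    print(static_scan(SOURCE))
    print(dynamic_probe(SOURCE, "read_item", ([1, 2, 3], 5)))
```

In this sketch the static scan reports the `eval` call regardless of which inputs are supplied, whereas the dynamic probe reports the out-of-range access only when an input that triggers it is actually executed.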
The open source low level virtual machine (LLVM) compiler infrastructure project can generate an intermediate representation (LLVM IR) of software code. The LLVM IR is a well-defined low-level programming language, similar to assembly, with a reduced instruction set computing (RISC)-like instruction set designed to abstract away the details of the target machine architecture. The LLVM project includes tools that convert the LLVM IR into machine-dependent assembly code for the target platform. LLVM IR is also useful because it can be used with a variety of existing analysis tools.
One example of such a tool is KLEE, which is a symbolic virtual machine built on top of the LLVM compiler infrastructure. KLEE includes components that can execute LLVM bitcode modules with support for symbolic values, and a POSIX/Linux emulation layer that provides additional support for making parts of the operating system symbolic. Another example is the low-level bounded model checker (LLBMC), which is a static software analysis tool for finding bugs in low-level system code. Leveraging these existing tools can reduce the amount of duplicate work and facilitate the discovery of aspects of the code that may deviate from the desired operation of the program.
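The general idea behind exploring a program with symbolic values can be sketched as follows. This is a toy illustration only and does not use or reproduce the interfaces of KLEE or LLBMC. The branch conditions, the input domain, the buffer size, and the helper names are hypothetical, and a brute-force search over a small bounded domain stands in for the constraint solver that a real tool would use.

```python
from itertools import product

# Hypothetical program under test, modeled by its branch conditions:
#
#   if x > 10:
#       if x % 7 == 0:
#           buffer[x]   # buffer holds BUFFER_SIZE entries
#
BUFFER_SIZE = 16          # assumed size for this sketch
DOMAIN = range(0, 64)     # assumed bound on the input value

BRANCHES = [
    lambda x: x > 10,       # first branch condition
    lambda x: x % 7 == 0,   # second branch condition
]


def solve(path):
    """Return the smallest input whose branch outcomes match `path`.

    Brute force stands in for a constraint solver. Returns None when
    no input in DOMAIN satisfies the path condition.
    """
    for x in DOMAIN:
        if all(cond(x) == taken for cond, taken in zip(BRANCHES, path)):
            return x
    return None


def explore():
    """Map every combination of branch outcomes to a witness input."""
    outcomes = product([True, False], repeat=len(BRANCHES))
    return {path: solve(path) for path in outcomes}


def find_overflow():
    """Find an input that reaches the buffer access out of bounds."""
    for x in DOMAIN:
        if all(cond(x) for cond in BRANCHES) and x >= BUFFER_SIZE:
            return x
    return None


if __name__ == "__main__":
    for path, witness in explore().items():
        print(path, witness)
    print("out-of-bounds input:", find_overflow())
```

Each path through the program is described by a set of conditions on the input, and a concrete input satisfying those conditions serves as a test case for that path. Adding the violated property (here, an index at or beyond the buffer size) to the path condition yields an input that demonstrates the defect.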
Applicants have determined that it would be desirable to convert the binaries from a variety of existing systems into an intermediate representation that can then be used for automatic discovery of software vulnerabilities, and to automate portions of the vulnerability discovery process so that the converted binaries from the variety of existing systems can be easily and quickly analyzed.