Assembly language and machine code are low-level and difficult to understand, which burdens cyber security analysts who reverse engineer binary applications to find hidden malicious code. The core difficulty in reading low-level languages is recovering the semantics of the code. As a result, analyzing malware is a time-consuming process, whereas malicious software can be generated automatically. This imbalance between generating and detecting malicious software puts the cyber security industry at a persistent disadvantage.
Software development is a complex process, and software errors or ‘bugs’ are unavoidable when developing applications. Testing therefore becomes increasingly important for detecting existing and potential bugs. Designing comprehensive test suites, however, is infeasible for any sizable application. Detecting bugs through human inspection of source code is hard. Detecting bugs without source code is even harder, because of the difficulty of reasoning about what low-level assembly code and machine code are doing. Current automated bug detection techniques are limited to mechanical detection of potentially problematic syntax (not semantics), limited to a single type of bug, and/or limited to heuristic algorithms that produce significant numbers of false positives and false negatives.
What is needed is a method and system to automate reverse engineering so as to capture and perform static analysis on all runtime code for an executable; obtain instruction traces without instrumenting and running the target system; automate reasoning and pattern recognition against the semantics of executable binaries; determine all variables and code branches that are affected by program input; and find feasible application inputs that reach a desired Point-of-Interest (POI) in the executable. Such a method and system is additionally needed to find software bugs or vulnerabilities, harden software, test software, and understand and fix software where the original source code is missing.
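To make the requirement "determine all variables and code branches that are affected by program input" concrete, the following is a minimal sketch of forward taint propagation over a straight-line instruction trace. It is an illustration only and is not taken from the disclosed method: the `Instr` record, the opcode names, and the `propagate_taint` function are hypothetical, and the trace is a simplified three-address form rather than real machine code.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical, simplified trace entry (not a real ISA encoding)."""
    addr: int                 # address of the instruction in the trace
    op: str                   # "mov", "add", or "branch" in this sketch
    dst: Optional[str]        # destination variable/register, None for branches
    srcs: Tuple[str, ...]     # source variables/registers; empty for constants


def propagate_taint(trace: Iterable[Instr],
                    input_vars: Set[str]) -> Tuple[Set[str], List[int]]:
    """Return (tainted variables at end of trace, addresses of tainted branches)."""
    tainted = set(input_vars)
    tainted_branches: List[int] = []
    for ins in trace:
        src_tainted = any(s in tainted for s in ins.srcs)
        if ins.op == "branch":
            # A branch whose condition depends on input is input-affected.
            if src_tainted:
                tainted_branches.append(ins.addr)
        elif ins.dst is not None:
            if src_tainted:
                tainted.add(ins.dst)
            else:
                # Destination overwritten with input-independent data.
                tainted.discard(ins.dst)
    return tainted, tainted_branches


trace = [
    Instr(0x00, "mov", "eax", ("input",)),       # eax <- program input
    Instr(0x04, "mov", "ebx", ()),               # ebx <- constant
    Instr(0x08, "add", "ecx", ("eax", "ebx")),   # ecx depends on input
    Instr(0x0C, "branch", None, ("ecx",)),       # input-affected branch
    Instr(0x10, "mov", "eax", ()),               # eax overwritten by constant
    Instr(0x14, "branch", None, ("ebx",)),       # input-independent branch
]
tainted, branches = propagate_taint(trace, {"input"})
```

In this toy trace only the branch at address 0x0C depends on the input, and `eax` stops being tainted once it is overwritten with a constant. A practical analysis would also have to handle memory, flags, and control-flow merges, which this sketch deliberately omits.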