Software vulnerabilities are a primary source of many types of software attacks. Most software vulnerabilities cause program execution to deviate from the software developer's original intent. It is not possible to eliminate all bugs or design flaws using current software technologies. For software applications that accept external data, these bugs or flaws can be exploited to allow malicious users to gain access to sensitive data. These types of attacks are called control flow attacks. Common examples of control flow attacks are stack buffer overflow attacks, format string attacks, and injected code attacks.
Control flows have been well studied as a means of detecting control flow attacks. A typical approach is to use static analysis techniques to construct a control flow model for binary modules and then check the control flows against the model at run time. This approach is feasible for checking the control flows in a single module, but impractical for programs with multiple modules, such as those running on Microsoft® Windows.
Various approaches have been researched for keeping program execution consistent with the developer's intention. One such approach is to extract a model (such as system call sequences) from the source code and check the execution against the model. A fine-grained model built upon the branches of program execution can also be extracted to act as a checking model. Another approach is to extract a model of system calls and then check the trace of system calls while the program is running. However, in real-world applications the source code or symbolic information is not always available. In this case, the model can be extracted statically from binary executables using disassembly. The challenge then is to guarantee the accuracy of the disassembly and to recover the application semantics (in order to reduce false alarms). Because it is not always possible to determine the control transfer paths, especially in the presence of indirect branch instructions, the model might be imprecise or the checking rules might have to be loosened. Therefore, it is crucial to construct a model or a set of rules that not only conforms to all legitimate control transfers, but also identifies possible attacks.
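As a minimal sketch of checking a system call trace against an extracted model, assume the model is simply the set of permitted pairs of adjacent system calls. The call names, the START marker, and the ALLOWED_PAIRS set are hypothetical and are not taken from any particular system.

```python
from typing import Iterable, Optional, Set, Tuple

# Hypothetical model: permitted (previous call, next call) pairs.
# A real model would be extracted from source code or a disassembled binary.
ALLOWED_PAIRS: Set[Tuple[str, str]] = {
    ("START", "open"),
    ("open", "read"),
    ("read", "read"),
    ("read", "close"),
    ("close", "exit"),
}


def first_violation(
    trace: Iterable[str],
    allowed: Set[Tuple[str, str]] = ALLOWED_PAIRS,
) -> Optional[Tuple[str, str]]:
    """Return the first transition that is not in the model, or None."""
    previous = "START"
    for call in trace:
        if (previous, call) not in allowed:
            return (previous, call)
        previous = call
    return None
```

A conforming trace such as open, read, read, close, exit yields None, while a trace that moves from read to an unexpected call is reported at that transition. Adding pairs to cover control transfers that cannot be resolved statically makes the model accept more traces, which is the loosening described above.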
Modern software frequently consists of many modules generated by different software providers. Interface technology loosely connects these modules so that even new modules can be integrated into the existing software. As a result, many existing control flow enforcement approaches cannot be applied to such software. Another obstacle is that commercial software or freeware often has its binaries obfuscated, so that the checking model cannot be accurately extracted with static analysis techniques. The large number of false positives associated with these approaches makes them impractical.
There are many techniques that attempt to deal with buffer overflow attacks. Some of these techniques require the program to be recompiled or its source code to be modified. At least one technique uses a shadow call stack to detect stack smashing. For real-world applications, however, a strict shadow call stack is insufficient because non-standard control transfers such as longjmp and exception handling (and some obfuscated functions in commercial software) can break the call/return pairing and lead to false positives.
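The false-positive problem can be illustrated with a small sketch of a shadow call stack. The class and method names below are hypothetical, return addresses are modeled as plain integers, and the relaxed mode is an assumption added only for contrast; it is not a technique attributed to any of the approaches described here.

```python
from typing import List


class ShadowStackViolation(Exception):
    """Raised when a return address does not match the shadow copy."""


class ShadowCallStack:
    def __init__(self, strict: bool = True) -> None:
        self.strict = strict
        self._stack: List[int] = []

    @property
    def depth(self) -> int:
        return len(self._stack)

    def on_call(self, return_address: int) -> None:
        # Record the return address at every call instruction.
        self._stack.append(return_address)

    def on_return(self, return_address: int) -> None:
        # Normal case: the return matches the most recent call.
        if self._stack and self._stack[-1] == return_address:
            self._stack.pop()
            return
        # Relaxed mode (assumption): accept a return to an earlier frame,
        # as happens after longjmp or exception unwinding.
        if not self.strict and return_address in self._stack:
            while self._stack[-1] != return_address:
                self._stack.pop()
            self._stack.pop()
            return
        raise ShadowStackViolation(hex(return_address))
```

With three nested calls recorded, a longjmp-style return straight to the outermost frame raises ShadowStackViolation in strict mode even though the transfer is legitimate. The relaxed mode accepts it, at the cost of also accepting a corrupted return address that happens to match any earlier frame.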
There are also many different approaches to monitoring the control flows of a program. One approach uses a program's static control flow graph at the system-call level to implement a host-based intrusion detection system. Another approach uses an interpreter to dynamically load the binary code, check it, and execute it. This approach ensures that the destination of a control transfer is a basic block that was loaded from the disk and has not been modified. It can therefore prevent code injection attacks. It also ensures that a control transfer to a library can only go to an exported entry point, and thus prevents some attacks that reuse existing code. Yet another approach proves that control-flow models are fundamentally more precise than system-call sequence models for intrusion detection systems. This approach implements an external monitor that uses a binary rewriting technique to check execution against a static control flow model.
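The two destination checks enforced by the interpreter-based approach can be sketched as follows. This is a simplified illustration under stated assumptions: the Module type, its fields, and the addresses are hypothetical, and each module is reduced to a set of known basic-block start addresses plus a set of exported entry points.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional, Set


@dataclass
class Module:
    name: str
    # Start addresses of basic blocks loaded from disk and not modified.
    block_starts: Set[int]
    # Exported entry points, a subset of block_starts.
    exports: Set[int] = field(default_factory=set)


def owning_module(modules: Iterable[Module], address: int) -> Optional[Module]:
    """Return the module whose known blocks include this address, if any."""
    for module in modules:
        if address in module.block_starts:
            return module
    return None


def transfer_allowed(modules: Iterable[Module], source_module: str, target: int) -> bool:
    destination = owning_module(modules, target)
    if destination is None:
        # Target is not a known, unmodified block (for example injected code).
        return False
    if destination.name == source_module:
        # Transfers within a module may go to any known block.
        return True
    # Cross-module transfers must land on an exported entry point.
    return target in destination.exports
```

For example, with an application module and a library module that exports only one of its two blocks, a transfer from the application to the exported block is allowed, while a transfer to the library's internal block or to an address outside every module is rejected.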
Other approaches use a binary rewriting technique to monitor control flows. Binary rewriting is a complex technique that first disassembles the code and then modifies the branch instructions to redirect control to supplied checking functions. In addition to the binaries themselves, some approaches require additional symbolic information. However, this information is not always available for monitored programs. Another approach uses a combination of static and dynamic analysis to rewrite the branch instructions so that the checking logic can be enforced. Yet another approach takes a different route and detects when external data is being executed. This approach keeps track of the propagation of untrusted data so that if the data is used in a dangerous way (for example, if the data is executed as code), the detector can stop it and raise an alert. The trace information can be further used to generate a certified signature. The advantage of this approach is that there are no false positives. One problem with this approach, however, is that it incurs a significant performance overhead because it requires keeping track of data propagation.
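The data-tracking idea in the last approach can be sketched as byte-level taint bookkeeping. This is a minimal illustration, not the method of any particular system: the TaintTracker class, its event methods, and the addresses are hypothetical, and memory is modeled only as a set of tainted byte addresses.

```python
from typing import Set


class TaintedExecution(Exception):
    """Raised when bytes derived from external input are about to execute."""


class TaintTracker:
    def __init__(self) -> None:
        self._tainted: Set[int] = set()

    def on_external_input(self, address: int, length: int) -> None:
        # Bytes received from outside the program are untrusted.
        self._tainted.update(range(address, address + length))

    def on_copy(self, src: int, dst: int, length: int) -> None:
        # Snapshot the source taint first so overlapping copies stay correct.
        flags = [(src + i) in self._tainted for i in range(length)]
        for i, is_tainted in enumerate(flags):
            if is_tainted:
                self._tainted.add(dst + i)
            else:
                self._tainted.discard(dst + i)

    def on_constant_write(self, address: int, length: int) -> None:
        # Overwriting with program-generated constants clears the taint.
        self._tainted.difference_update(range(address, address + length))

    def on_execute(self, address: int, length: int) -> None:
        # Executing any tainted byte is treated as an attack.
        if any((address + i) in self._tainted for i in range(length)):
            raise TaintedExecution(hex(address))
```

Every input and copy operation has to update the tainted set, which gives a sense of where the performance overhead noted above comes from.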