A buffer overflow occurs when more data is written into a buffer than the buffer can hold, causing the extra data to be written into memory adjacent to the buffer. If that adjacent memory held information that is critical for the operating system (OS) to execute programs correctly (such as the pointer to the previous frame and the return address), overwriting it may cause unpredictable behavior. In a buffer overflow attack, the attacker carefully crafts the input data to the vulnerable software so that the resulting behavior is that the OS executes the attacker's malicious code, embedded in the overflow data, with the privileges of the vulnerable software.
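The condition described above can be made concrete with a minimal C sketch. The function name checked_copy and the capacity BUF_CAPACITY are hypothetical, chosen only for illustration. The function refuses any input that does not fit in the destination buffer, which is exactly the check that an unchecked strcpy omits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_CAPACITY 8 /* hypothetical capacity, chosen for illustration */

/* Copies src into dst only if it fits, including the terminating NUL.
 * Returns 0 on success, or -1 if the input would overflow the buffer. */
int checked_copy(char *dst, size_t dst_cap, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t len = strlen(src);
    if (len >= dst_cap) {
        /* An unchecked strcpy(dst, src) would write past the end of dst
         * here and overwrite whatever is stored in adjacent memory. */
        return -1;
    }
    memcpy(dst, src, len + 1); /* len + 1 also copies the NUL terminator */
    return 0;
}
```

With an 8-byte buffer, a 7-character string is accepted and an 8-character string is rejected, because the terminating NUL would need a ninth byte.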
Although more than forty years have passed since the buffer overflow technique was first documented by Anderson in 1972, and almost thirty years since it was first exploited by the infamous Morris worm in 1988, buffer overflow remains the most common type of software vulnerability, as shown in recent studies of software vulnerability databases, and it is likely to remain so for many years to come. First, much existing software has buffer overflow vulnerabilities that are unknown to its vendors and users but will be exploited by attackers sooner or later. Second, much future software will still be written by programmers who are not well trained in software security. The inherently unsafe languages C and C++ will remain popular for performance and backward compatibility reasons. Although it has been known for many years how to write programs that avoid buffer overflow problems, such knowledge alone is far from enough to thwart the rampant buffer overflow issue.
There are two general approaches to identifying buffer overflow vulnerabilities: static program analysis and dynamic execution analysis. The static program analysis approach scans software source code to discover code segments that are possibly vulnerable to buffer overflow attacks. Each resulting warning must then be manually inspected to determine whether it is a true vulnerability. The key advantage of such schemes is that buffer overflow vulnerabilities can be discovered and fixed before software deployment. The key limitation of existing static schemes is that their reports contain too many false positives, fundamentally because they lack software execution information (such as which code segments are reachable in execution and which execution path will be followed), and each false positive wastes a huge amount of human effort on manual source code inspection. The dynamic execution analysis approach inserts special code into software so that buffer overflow occurrences can be detected at run time and handled properly, for example by terminating software execution. The key advantage of such schemes is that they rarely produce false positives, because they have software execution information. The key limitation of such schemes is that they incur excessive performance overhead (as much as 1,662 times slower), because the inserted code must be executed for each buffer operation and function call.
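The source of the run-time overhead in the dynamic approach can be illustrated with a minimal C sketch. It does not reproduce any particular tool: checked_store stands in for the check that an instrumenting tool might insert before each buffer write, and checks_executed and instrumented_fill are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instrumentation state: counts how many inserted checks ran. */
unsigned long checks_executed = 0;

/* Stand-in for an inserted check. Performs the write to buf[index] only if
 * it is in bounds and reports whether the write was allowed. */
bool checked_store(char *buf, size_t cap, size_t index, char value)
{
    checks_executed++;
    if (index >= cap) {
        return false; /* overflow detected; a real tool might terminate here */
    }
    buf[index] = value;
    return true;
}

/* Attempts to write n copies of value into buf, stopping at the first
 * detected overflow. Returns the number of bytes actually written. */
size_t instrumented_fill(char *buf, size_t cap, size_t n, char value)
{
    assert(buf != NULL);

    size_t written = 0;
    for (size_t i = 0; i < n; i++) {
        if (!checked_store(buf, cap, i, value)) {
            break;
        }
        written++;
    }
    return written;
}
```

Asking this sketch to write 10 bytes into a 4-byte buffer stores 4 bytes and runs 5 checks: one per successful write plus the one that detects the overflow. The detection is precise because it happens during actual execution, but every buffer write pays for a check.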
To address the limitations discussed above, BovInspector was developed as the first framework for automatically inspecting the buffer overflow vulnerability warnings output by existing static program analysis tools.