The present invention relates to the field of computer system integrity and security. More particularly, the present invention relates to methods and apparatus for detecting a call to a library function that could overflow a buffer in the heap section of memory.
Buffer overflow attacks are a major cause of security breaches in modern computer systems. A buffer is a block of memory locations that is typically allocated to a particular program or function for use as a storage area. A buffer overflow (or "overrun") occurs when a software function (also known as a routine, module, etc.) writes data beyond the boundaries of a buffer that is allocated to the function, thereby overwriting the content of the memory locations immediately before or after the buffer. Generally, a buffer overflow is undesirable because it corrupts memory and potentially generates a memory fault.
Heap smashing attacks and stack smashing attacks are two types of attacks by which a malicious program may violate memory integrity. Heap smashing attacks involve the heap section of the memory, and stack smashing attacks involve the stack section of the memory. The heap is an area of a memory, such as a Random Access Memory, that is dynamically allocated by programs at run-time to store data. By contrast, the stack is an area of memory used to store objects related to a function call. For example, the program counter, return address of the function, and local variables of the function are stored on the stack. One type of heap smashing attack overruns a heap buffer. A heap overflow attack may be used to overwrite function pointers stored on the heap to redirect the program's control flow. As can be appreciated, a heap smashing attack such as this may be dangerous. For example, a heap smashing attack may be used to gain root access to a remote computer by writing malicious shell script code contained in an attack packet on the heap of the server, thus allowing the attacker to gain root access when the script is executed by the server. In this way, a buffer overflow may be exploited maliciously to alter a program's control flow and thus break the security of the computer system. A heap may be smashed due to an attack by a malicious attack program or due to an inadvertent error by the programmer.
Many systems are written in unsafe programming languages (such as C or C++) which are optimized for high performance but provide only limited error checking. Such languages may allow executing programs to call a "library function" to perform an operation such as writing data to memory. A "library" is a collection of precompiled functions that a program can use. For example, the C library function "strcpy" copies a source string pointed to by its second function parameter (also known as an argument) to a destination location pointed to by its first function parameter. Typically, library functions are stored in object format, and the programmer does not need to explicitly link them to every program that uses them (because the linker automatically looks in libraries for routines that it does not find elsewhere). In MS-Windows environments, for example, libraries generally have a ".DLL" extension. In unsafe programming languages, a library function may not have any built-in buffer overflow checks. In many cases, it is up to the programmer to check whether a destination buffer has sufficient memory space to accommodate a source string. Unfortunately, many existing programs omit such buffer overflow checks. As discussed above, the absence of boundary checks may be exploited by attackers to gain unauthorized access to the computer system.
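By way of illustration only, the kind of boundary check that is left to the programmer may be sketched in C as follows. The helper name copy_if_fits and its return convention are assumptions made for this example and are not part of the C library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a C library function): copies the string src
 * into dst only when dst_size bytes are enough to hold the string and its
 * terminating NUL. Returns 0 on success and -1 when the copy is refused.
 */
int copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t needed = strlen(src) + 1;   /* characters plus terminating NUL */
    if (needed > dst_size)
        return -1;                     /* would overflow dst: do not write */

    memcpy(dst, src, needed);
    return 0;
}
```

A caller would pass the true capacity of the destination, for example copy_if_fits(buf, sizeof buf, input) for an array buf. A bare call to strcpy performs no such comparison.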
The fundamental solution to buffer overflow attacks relies on a safe coding style: a programmer could avoid unsafe library functions (like strcpy) and could perform careful boundary checks within any program that makes such calls. However, given the huge volume of existing programs, it is not possible to inspect and rewrite all of them to eliminate potential buffer overflow problems. In view of the effort needed to make sure that a string copy does not fail, for example, it is not surprising that many programs do not have these checks and are susceptible to buffer overflows. Furthermore, while there exist preliminary tools that detect a large class of buffer overruns statically, users might not want to wait until programs are fixed by their developers. Nor do users generally have access to the source code of commercial software. A solution to the heap smashing problem that does not need source code access is therefore highly desirable.
Embodiments of the present invention provide a fault-containment wrapper that effectively protects existing programs from heap smashing attacks caused by library function calls. This fault-containment wrapper provides the same functionality as the library function call but in addition does careful boundary checking. In an embodiment, shared library functions are wrapped such that the wrapped version first checks that a buffer contains sufficient space before calling the original unwrapped function. The buffer space may be tracked by wrapping allocation functions (such as malloc, calloc, and free) and keeping meta-data for each allocated buffer.
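A minimal sketch of such allocation tracking, in C, is given below under stated assumptions. The names tracked_malloc, tracked_free, tracked_size and tracked_count, and the fixed-size table, are hypothetical choices for this example; the sketch does not show how calls would be intercepted, only how per-buffer meta-data might be recorded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_TRACKED 64   /* assumption: small fixed-size table for illustration */

struct buf_info { void *start; size_t size; };
static struct buf_info tracked[MAX_TRACKED];   /* start == NULL marks a free slot */

/* Allocates a buffer and records its start address and size. */
void *tracked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (tracked[i].start == NULL) {
            tracked[i].start = p;
            tracked[i].size = size;
            return p;
        }
    }
    free(p);        /* table full: refuse rather than return an untracked buffer */
    return NULL;
}

/* Releases a tracked buffer and forgets its meta-data; unknown pointers are ignored. */
void tracked_free(void *p)
{
    if (p == NULL)
        return;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (tracked[i].start == p) {
            tracked[i].start = NULL;
            tracked[i].size = 0;
            free(p);
            return;
        }
    }
}

/* Returns the recorded size of the buffer starting at start, or 0 if none is recorded. */
size_t tracked_size(const void *start)
{
    assert(start != NULL);
    for (size_t i = 0; i < MAX_TRACKED; i++)
        if (tracked[i].start == start)
            return tracked[i].size;
    return 0;
}

/* Returns the number of buffers currently recorded. */
size_t tracked_count(void)
{
    size_t n = 0;
    for (size_t i = 0; i < MAX_TRACKED; i++)
        if (tracked[i].start != NULL)
            n++;
    return n;
}
```

A linear table keeps the example short. A wrapper intended for real workloads would likely need a faster lookup structure and a policy for pointers it has never seen.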
According to an embodiment of the present invention, every function call from a program to a library which could be exploited for heap buffer overflows (e.g., a call that requests writing of a data block to the heap) is intercepted and redirected to the fault-containment wrapper. The wrapper may then make sure that the library function call does not cause an access to any heap memory that is outside an allocated buffer. The present invention provides an approach to the buffer overflow problem that is transparent to existing programs and does not require access to the source code of such programs.
In an embodiment, instructions within the fault-containment wrapper determine whether performing a write request would smash the heap. In an embodiment, the fault-containment wrapper concludes that performing the write request would smash the heap if it is determined that writing the data block as requested would overflow a buffer in the heap. If performing the write request would not smash the heap, the fault-containment wrapper may cause the data block to be written as requested. On the other hand, if performing the write request would smash the heap, the fault-containment wrapper may execute an error handling procedure instead of writing the data block.
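The decision described above may be sketched in C as follows. This is an illustrative sketch only: the names note_buffer, wrapped_memcpy and smash_reports are hypothetical, the table of known buffers stands in for whatever buffer bookkeeping an embodiment uses, and the error handling procedure is reduced to incrementing a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_BUFS 16   /* assumption: small fixed-size table for illustration */

struct buf { unsigned char *start; size_t size; };
static struct buf bufs[MAX_BUFS];
static size_t nbufs;
static unsigned smash_count;   /* incremented by the error handling procedure */

/* Hypothetical hook: records a buffer the wrapper should treat as allocated. */
int note_buffer(void *start, size_t size)
{
    assert(start != NULL);
    if (nbufs == MAX_BUFS)
        return -1;
    bufs[nbufs].start = start;
    bufs[nbufs].size = size;
    nbufs++;
    return 0;
}

/* Bytes from dst to the end of its buffer, or 0 if dst is in no known buffer. */
static size_t room_at(const void *dst)
{
    uintptr_t d = (uintptr_t)dst;
    for (size_t i = 0; i < nbufs; i++) {
        uintptr_t s = (uintptr_t)bufs[i].start;
        if (d >= s && d - s < bufs[i].size)
            return bufs[i].size - (size_t)(d - s);
    }
    return 0;
}

/* Stand-in for the error handling procedure: only counts the violation. */
static void handle_smash(void) { smash_count++; }

unsigned smash_reports(void) { return smash_count; }

/* Writes the data block only if it fits; otherwise runs the error handler. */
void *wrapped_memcpy(void *dst, const void *src, size_t len)
{
    if (len == 0)
        return dst;                 /* nothing to write */
    if (len > room_at(dst)) {
        handle_smash();             /* the write would smash the heap */
        return NULL;
    }
    return memcpy(dst, src, len);   /* safe: forward to the original function */
}
```

In this sketch a refused write leaves the destination untouched and returns NULL. An embodiment could equally terminate the program or log the event; that choice is not fixed by the description above.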
In further embodiments, the fault-containment wrapper concludes that performing the write request would smash the heap if either (1) the start address of the memory section where the data block is to be written is not part of a currently allocated buffer in the heap or (2) this start address is within a buffer but the data block's size is greater than the size of the memory section extending from this start address to the end of the buffer. Further embodiments provide for methods of determining whether the destination start address is within a currently allocated buffer. For example, in one embodiment, a search of the heap is made for a meta-data field beginning at the destination start address and proceeding in one direction. The fault-containment wrapper of this embodiment may conclude that the memory address is not within any currently allocated buffer if the search reaches a heap boundary without finding a valid meta-data field. A potential meta-data field may be identified by finding a predefined marker in a memory location being examined during the search, and such a potential meta-data field may be confirmed as a meta-data field if the memory section contains a pointer to an entry in the buffer management table.
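The two conditions and the meta-data search may be sketched in C as follows, under several assumptions made only for this example: a static byte array stands in for the heap, the meta-data field is placed immediately before each buffer, the search proceeds toward lower addresses one byte at a time, and the marker value, structure layouts and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MARKER 0xB0F0CAFEu   /* hypothetical predefined marker */
#define TABLE_SIZE 8
#define ARENA_SIZE 1024

struct table_entry { unsigned char *start; size_t size; int in_use; };
struct meta { uint32_t marker; uintptr_t entry_addr; };  /* entry_addr: address of a table entry */

static struct table_entry table[TABLE_SIZE];   /* buffer management table */
static unsigned char arena[ARENA_SIZE];        /* stand-in for the heap */
static size_t arena_used;

/* Bump allocator: writes a meta-data field, then hands out the bytes after it. */
unsigned char *arena_alloc(size_t size)
{
    if (size > ARENA_SIZE || arena_used + sizeof(struct meta) + size > ARENA_SIZE)
        return NULL;
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        if (table[i].in_use)
            continue;
        struct meta m;
        memset(&m, 0, sizeof m);
        m.marker = MARKER;
        m.entry_addr = (uintptr_t)&table[i];
        memcpy(arena + arena_used, &m, sizeof m);
        table[i].start = arena + arena_used + sizeof m;
        table[i].size = size;
        table[i].in_use = 1;
        arena_used += sizeof m + size;
        return table[i].start;
    }
    return NULL;
}

/* True if addr is exactly the address of an entry in the table. */
static int is_table_entry(uintptr_t addr)
{
    uintptr_t lo = (uintptr_t)table;
    return addr >= lo && addr < lo + sizeof table && (addr - lo) % sizeof table[0] == 0;
}

/* Searches from addr toward lower addresses for a confirmed meta-data field. */
const struct table_entry *find_buffer(const unsigned char *addr)
{
    uintptr_t a = (uintptr_t)addr, lo = (uintptr_t)arena;
    if (a < lo + sizeof(struct meta) || a >= lo + ARENA_SIZE)
        return NULL;                              /* not inside the heap at all */
    for (size_t off = (size_t)(a - lo) - sizeof(struct meta); ; off--) {
        struct meta m;
        memcpy(&m, arena + off, sizeof m);        /* memcpy avoids unaligned reads */
        if (m.marker == MARKER && is_table_entry(m.entry_addr)) {
            const struct table_entry *e = (const struct table_entry *)m.entry_addr;
            if (e->in_use && e->start == arena + off + sizeof m)
                return (a - (uintptr_t)e->start < e->size) ? e : NULL;
        }
        if (off == 0)
            return NULL;                          /* reached the heap boundary */
    }
}

/* Returns 1 if writing len bytes at dst would smash the heap, else 0. */
int would_smash(const unsigned char *dst, size_t len)
{
    const struct table_entry *e = find_buffer(dst);
    if (e == NULL)
        return 1;                                 /* condition (1) */
    assert(e->start <= dst);
    return len > e->size - (size_t)(dst - e->start);   /* condition (2) */
}
```

Requiring both the marker and a pointer that resolves to a live table entry for that exact location makes it unlikely that ordinary buffer contents are mistaken for a meta-data field.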
These and other embodiments are explained further below in the following detailed description.