Computer software may be written in any number of programming languages, including C, C++, Java, Cobol, ML and others. Increasingly, many of these languages, such as Java and ML, are robust, meaning that they operate in an environment so far removed from machine-level resources that an executed program cannot typically corrupt the host machine memory or crash the host machine. While robust languages provide protection against host machine corruption and crashing, they tend not to be particularly efficient.
By contrast, the C family of languages, including C and C++ (hereinafter referred to as “C languages”), can be much more efficient, although they are not robust. To achieve this efficiency, the C languages allow extensive access to, and control and manipulation of, host machine resources such as memory. As a consequence, C language programs can be prone to buffer overruns and other memory misallocation errors that crash or hang the program.
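The buffer overrun risk described above can be illustrated with a minimal sketch. The helper name `copy_checked` is hypothetical and is used only for illustration; it shows one way a programmer might guard a copy that the standard `strcpy` would perform without any bounds check.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * For contrast, an unchecked copy such as
 *
 *     char buf[8];
 *     strcpy(buf, input);   // writes past buf if input needs more than 8 bytes
 *
 * compiles without complaint, and its behavior is undefined when input is
 * too long.
 */

/* Hypothetical helper: copies src into dst only if it fits, including the
 * terminating NUL. Returns 0 on success, -1 if the copy would overrun dst. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);  /* NULL here is a programmer error */

    size_t needed = strlen(src) + 1;     /* characters plus terminating NUL */
    if (needed > dst_size)
        return -1;                       /* refuse rather than overrun */

    memcpy(dst, src, needed);
    return 0;
}
```

In this sketch the caller must supply the destination size, because a C pointer by itself carries no length information.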
Although C language programs lack the robustness of programs written in Java and other such languages, the need for efficiency in programming has resulted in the continued extensive use of the C languages. To capitalize on the potential efficiencies, C language programmers attempt to optimize memory usage based on knowledge of the application being written. Extensive control over memory also supports memory-mapped I/O, which is important for system-level programming. Such extensive control, however, increases the potential for memory misuse errors, and can even create security issues.
While extreme care may be used to ensure that a particular C language application does not misuse or improperly overwrite memory, a problem arises from the fact that nearly all C language programs incorporate standard (or non-standard) library functions which are not written by the application developer. In particular, as is known in the art, C language development kits make available large numbers of common functions in the form of libraries. For example, one library of functions may include input/output oriented functions, while another library includes string handling functions. Multiple libraries of such functions are often “included” in application software.
The problem with standard library functions is that they may or may not have had extensive testing to determine whether they are sufficiently robust so as to avoid crashing and security breaches. Accordingly, incorporation of standard C language library functions may make an otherwise robust software application prone to memory allocation errors.
While it is possible to develop software without using standard library functions, thereby allowing the developer absolute control over the robustness of the application, such a scenario is impractical. The high labor cost associated with software development does not reasonably allow for each new application to recreate the same standard functions. Thus, the use of commercial off-the-shelf C language library functions is a necessity.
Solutions have been proposed to overcome the problems posed by standard library functions. Such solutions envision implementation of a software wrapper around certain C language functions. The wrapper intercepts calls to the function, and then determines whether any of the inputs to the function is invalid. Descriptions of such wrappers may be found in Fetzer, Christof and Xiao, Zhen, “Detecting Heap Smashing Attacks Through Fault Containment Wrappers”, Proceedings of the 20th IEEE Symposium on Reliable Distributed Systems (IEEE, October, 2001), which is incorporated herein by reference.
By intercepting function calls and determining if variables passed to the function are invalid, the wrapper can prevent the function from executing when execution with the passed variables could cause a security breach or system failure. However, the development of such wrappers for each function involves extensive modeling and analysis. As a consequence, attempting to develop custom wrappers for the multitude of functions in various libraries can be cost-prohibitive.
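The wrapper approach described above might be sketched as follows for a single function, `strcpy`. This is only an illustrative sketch and is not taken from the cited reference: the names `tracked_alloc`, `tracked_size` and `wrapped_strcpy`, and the fixed-size registry of buffer sizes, are all hypothetical assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TRACKED 64

/* Hypothetical registry of heap buffers and their allocated sizes. */
static struct { void *ptr; size_t size; } tracked[MAX_TRACKED];
static size_t tracked_count;

/* Allocates a buffer and records its size so a wrapper can check it later. */
void *tracked_alloc(size_t size)
{
    if (tracked_count == MAX_TRACKED)
        return NULL;                      /* registry full */
    void *p = malloc(size);
    if (p != NULL) {
        tracked[tracked_count].ptr = p;
        tracked[tracked_count].size = size;
        tracked_count++;
    }
    return p;
}

/* Returns the recorded size of p, or 0 if p is not a tracked buffer. */
static size_t tracked_size(const void *p)
{
    assert(tracked_count <= MAX_TRACKED); /* internal invariant */
    for (size_t i = 0; i < tracked_count; i++)
        if (tracked[i].ptr == p)
            return tracked[i].size;
    return 0;
}

/* Wrapper around strcpy: validates the arguments, then either forwards the
 * call or blocks it. Returns dst on success, NULL if the call was blocked. */
char *wrapped_strcpy(char *dst, const char *src)
{
    if (dst == NULL || src == NULL)
        return NULL;
    size_t capacity = tracked_size(dst);
    if (strlen(src) + 1 > capacity)       /* also rejects untracked dst */
        return NULL;
    return strcpy(dst, src);
}
```

Even this small check encodes knowledge specific to `strcpy`, namely that the destination must hold the source string plus its terminator, which suggests why writing such wrappers by hand for every library function requires per-function analysis.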
Accordingly, there is a need for a method to ensure robust operation of potentially non-robust software libraries that does not suffer from the drawback of requiring extensive modeling and analysis.