(1) Field of Invention
The present invention relates to a security system and, more particularly, to a system for detecting source code security flaws through analysis of code history.
(2) Description of Related Art
Secure information flow is an important foundation for most security and computer systems. As static analysis for security becomes standard in the systems development process, it becomes paramount that tools be tailored to the security demands of any product line that interfaces with third-party software.
In information flow security, prior art falls into three general categories: dynamic taint analysis (see, for example, the List of Cited Literature References, Literature Reference Nos. 2, 7, 11, 13, and 18), secure information flow compilers (see Literature Reference Nos. 6, 12, 14, 15, and 17), and information flow security libraries for general-purpose programming languages (see Literature Reference Nos. 3, 8, and 16). While the aforementioned prior art does provide some level of security, none of those approaches incorporates any kind of code history analysis. As a result, such approaches require software developers to manually write the security policy, which significantly increases the code development cost.
The prior art also includes software repository data mining techniques (see Literature Reference No. 19) focused on analyzing textual comments, bug database information for security-related changes, and coarse-grained criteria such as the number of lines of code changed, not the content of the source code itself. Such techniques are disadvantageous because they are based upon ancillary information and may not necessarily reflect the structure and content of the program itself.
As another example, Chugh et al. (see Literature Reference No. 1) developed a system for staged analysis of information flow for JavaScript. This system checks the information flow of incomplete programs: when JavaScript code is dynamically loaded, it is checked. Because software projects usually consist of multiple interdependent files, the staged analysis approach enables information flow analysis at the per-file level. Since the source in a code repository is not guaranteed to compile or to be complete, code history analysis must likewise account for incomplete source, such as a revision with a missing file or an incomplete file. However, a code history-based approach (according to the principles of the present invention) goes beyond the staged analysis approach in that it can handle changes to the information flow policy, as opposed to attempting to maintain a single monolithic policy.
Most techniques for exploiting software repository information are statistical, using code churn (the number of added and deleted lines of code) or metrics (keywords in comments) to predict bugs (see Literature Reference No. 4). The main shortcoming of these techniques is that they do not account for the actual characteristics of the bugs in question. In contrast, code history-based analysis (as described in further detail below) can pinpoint bugs and the information flow implications of such bugs.
Thus, a continuing need exists for a code history-based approach to detect source code security flaws.