1. Field of the Invention
The present invention relates to a technique for determining the validity of a string in a computer program code.
2. Description of the Related Art
As Internet traffic has increased, security risks have also increased. A typical risk is a cross-site scripting (XSS) attack, in which an attacker injects a malicious script across sites into web pages dynamically generated by an application program. Another risk is an SQL injection attack, in which an attacker injects an SQL statement into an existing SQL statement, causing the execution of an SQL statement that is not expected by the application and thereby manipulating a database system in an unauthorized manner. To detect the XSS risk in a CGI program, for example, it is necessary to check whether the character ‘<’ is passed to a print function.
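The check described above can be illustrated with a minimal Python sketch. The function names `reaches_print_with_lt` and `sanitize` are hypothetical and do not come from any cited system; the sketch inspects a concrete run-time value, whereas the techniques discussed below reason about such values without executing the program.

```python
import html


def reaches_print_with_lt(value: str) -> bool:
    """Return True if a raw '<' would be passed on to the print function."""
    return "<" in value


def sanitize(value: str) -> str:
    """Escape HTML metacharacters; html.escape turns '<' into '&lt;'."""
    return html.escape(value)


# Hypothetical external input supplied by an attacker.
attack = "<script>alert(1)</script>"

# The raw input is flagged; the escaped input is not.
raw_is_risky = reaches_print_with_lt(attack)
escaped_is_risky = reaches_print_with_lt(sanitize(attack))
```

Under these assumptions, `raw_is_risky` is true and `escaped_is_risky` is false, because the escaped string no longer contains the character ‘<’.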
Japanese Patent Application Publication No. 2007-52625 discloses a system in which parsing means analyzes source code to be tested and constructs a parse tree. In this published application, vulnerability detection means creates a dynamic inter-parameter transition database by following the parse tree. In addition, the system traces the transition of an external input in the source code to be tested by using the inter-parameter transition database and gives a warning of a vulnerable portion that matches content registered in a vulnerability database containing functions that are vulnerable when an external input is used as a parameter. While the technique disclosed in the published application relates to testing the vulnerability of source code, it pertains to the transition of an external input and does not pertain to a value of a string generated by a program.
Non-patent documents Christensen, et al., “Precise Analysis of String Expressions,” In SAS'03 Proceedings of Int'l Static Analysis Symposium, Vol. 2695 of LNCS, pp. 1-18, Springer-Verlag 2003 (hereinafter Christensen); Minamide, “Static approximation of dynamically generated Web pages,” Proceedings of the 14th int'l conference on World Wide Web, pp. 432-441, 2005 (hereinafter Minamide); and Wassermann, et al., “Sound and precise analysis of web applications for injection vulnerabilities,” In PLDI'07 Proceedings of Programming Language Design and Implementation, 2007 (hereinafter Wassermann) disclose static program analysis techniques for inferring a value of a string generated at run time without executing a program. Typically, such a static program analysis technique is used to detect security vulnerabilities by abstracting a string value using a grammar (a regular grammar or a context-free grammar) and comparing the inferred string value with a safe or unsafe pattern prepared in advance.
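The general idea of abstracting a string value and comparing it with an unsafe pattern can be sketched as follows. This is not the method of Christensen, Minamide, or Wassermann: instead of a regular or context-free grammar, the sketch uses a much coarser abstraction (the set of characters a string may contain), and all names in it are hypothetical.

```python
from typing import FrozenSet

# Abstract value: an over-approximation of the characters a string may contain.
CharSet = FrozenSet[str]


def lit(s: str) -> CharSet:
    """Abstract a string literal that appears in the program."""
    return frozenset(s)


def concat(a: CharSet, b: CharSet) -> CharSet:
    """Abstract string concatenation: either operand's characters may occur."""
    return a | b


def escaped(a: CharSet) -> CharSet:
    """Abstract effect of an HTML-escaping function (assumed model)."""
    special = {"<": "&lt;", ">": "&gt;", "&": "&amp;"}
    out = set()
    for ch in a:
        # Add the characters of the replacement text, or the character itself.
        out.update(special.get(ch, ch))
    return frozenset(out)


def may_match_unsafe(a: CharSet, unsafe_char: str = "<") -> bool:
    """Compare the inferred value with the unsafe pattern 'contains <'."""
    return unsafe_char in a


# Unknown external input: any printable ASCII character may occur.
ANY_INPUT: CharSet = frozenset(chr(c) for c in range(32, 127))

unchecked = concat(lit("Hello, "), ANY_INPUT)
checked = concat(lit("Hello, "), escaped(ANY_INPUT))
```

Here `may_match_unsafe(unchecked)` holds and `may_match_unsafe(checked)` does not, and the result is obtained without running the analyzed program. A grammar-based abstraction plays the same role while tracking far more structure than a character set.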
A grammar-based approach, however, is limited in that it is difficult to modularize. Furthermore, the grammar-based approach is difficult to use for a retrospective analysis or to handle a relationship between a string index and a string value. For example, the string analysis in Minamide depends on transformations of the context-free grammar. Therefore, to modularize the string analysis in Minamide, it is necessary to calculate the composition of the transformations, and to use the string analysis for the retrospective analysis, it is necessary to reverse the transformations. In this manner, the modularized analysis and the retrospective analysis require an additional algorithm in an inference phase of the string analysis.
On the other hand, the characteristics of a string can be handled by using the monadic second-order logic (M2L) approach. According to the M2L approach, it is possible to perform the composition by using a simple logic operation (for example, ) without using any particular algorithm. A BDD-based algorithm for solving an M2L formula contributes to solving the problem of combinatorial explosion. MONA is an example of a program for solving M2L. The program is available at http://www.brics.dk/mona. An encoding method for a regular expression is described in Chapter 6.6 at http://www.brics.dk/mona/publications.html.
The technique described in Engelfriet, et al., “MSO definable string transductions and two-way finite-state transducers,” ACM Transactions on Computational Logic, Vol. 2, Issue 2 (April 2001) pp. 216-254 solves the problem of string concatenation or the reverse if the string operation in the program can be defined or approximated by a string conversion definable by the monadic second-order logic (MSO). From the viewpoint of the static program analysis, however, this paper does not include any description of an algorithm for abstracting a program by using M2L.
There are other ways of verifying a program using M2L such as the technique described in Moller, et al., “The pointer assertion logic engine,” ACM SIGPLAN Vol. 36, Issue 5 (May 2001) pp. 221-231.
The above-described techniques, however, do not deal with the verification of a value of a string generated by a program.