Web applications continue to offer more features, handle more sensitive data, and generate content dynamically based on more sources as users increasingly rely on them for daily activities. The increased role of web applications in important domains, coupled with their interactions not only with other web applications but also with users' local systems, exacerbates the effects of bugs and raises the need for correctness.
Testing is a widely used approach for identifying bugs and for providing concrete inputs and traces that developers use for fixing bugs. However, manual testing requires extensive human effort, which comes at significant cost. Additionally, quality assurance (QA) testing usually attempts to ensure that the software can do everything it ought to do, but it does not check whether the software can do things it ought not to do; such functionality usually constitutes security holes.
Traditional work on testing has generated random values as inputs. Randomly generated input values will often be redundant and will often miss certain program behaviors entirely. Test input generation that leverages runtime values, or concolic testing, has been pursued by multiple groups. These approaches gather both symbolic constraints and concrete values from program executions, and use the concrete values to help resolve the constraints to generate the next input. Previous work on concolic testing handles primarily constraints on numbers, pointer-based data structures, and thread interleaving. This is appropriate for the style of programming that languages like C and Java encourage, but scripting languages, especially when used in the context of web applications, encourage a style in which strings and associative arrays play a more central role.
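The loop described above (run on a concrete input, record the branch constraints, negate one, solve for the next input) can be illustrated with a minimal Python sketch. It is not the implementation of any cited system. The names `Traced`, `solve`, `explore`, and `program` are hypothetical, the program under test is a toy, and a brute-force search over a small integer range stands in for a real constraint solver.

```python
from typing import Callable, Dict, List, Optional, Tuple

# One recorded branch: (predicate over the input, outcome observed on this run).
Branch = Tuple[Callable[[int], bool], bool]


class Traced:
    """Wraps a concrete int and logs every comparison made against it."""

    def __init__(self, value: int, log: List[Branch]) -> None:
        self.value = value
        self.log = log

    def _record(self, pred: Callable[[int], bool]) -> bool:
        outcome = pred(self.value)          # the concrete value decides the branch
        self.log.append((pred, outcome))    # the predicate is kept as the constraint
        return outcome

    def __gt__(self, other: int) -> bool:
        return self._record(lambda x: x > other)

    def __eq__(self, other: object) -> bool:
        return self._record(lambda x: x == other)


def program(x) -> str:
    """Toy program under test (hypothetical)."""
    if x > 10:
        if x == 42:
            return "bug"
        return "big"
    return "small"


def solve(constraints: List[Branch], domain=range(-100, 101)) -> Optional[int]:
    """Stand-in solver: first value in a small domain satisfying every constraint."""
    for candidate in domain:
        if all(pred(candidate) == want for pred, want in constraints):
            return candidate
    return None


def explore(prog: Callable[[Traced], str], seed: int) -> Dict[int, str]:
    """Runs prog on inputs derived by negating recorded branch outcomes."""
    worklist, seen_paths, results = [seed], set(), {}
    while worklist:
        value = worklist.pop()
        log: List[Branch] = []
        result = prog(Traced(value, log))
        path = tuple(outcome for _, outcome in log)
        if path in seen_paths:
            continue
        seen_paths.add(path)
        results[value] = result
        # Keep the first i branches as observed and flip branch i.
        for i in range(len(log)):
            flipped = log[:i] + [(log[i][0], not log[i][1])]
            next_input = solve(flipped)
            if next_input is not None:
                worklist.append(next_input)
    return results


if __name__ == "__main__":
    print(explore(program, 0))  # {0: 'small', 11: 'big', 42: 'bug'}
```

Starting from the seed 0, the sketch reaches the `x == 42` branch after three runs, whereas a random draw from the same 201-value domain would hit it with probability 1/201 per attempt.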
Others have augmented concolic testing to analyze database-backed Java programs, including support for string equality and for membership in regular languages specified by SQL LIKE predicates. This work supports a form of multi-lingual programming in which Java programs generate SQL queries. However, the approach does not support string operations beyond these predicates, and it checks for the same properties as standard concolic testing.
Thus, previous work on concolic testing has helped to automate test input generation for desktop applications written in C or Java, but web applications written in scripting languages such as PHP pose different challenges.
First, PHP is a scripting language and not a compiled language. Such languages, especially in the context of web applications, encourage a style of programming that is more string- and array-centric as opposed to languages like Java where numeric values and data structures play a more central role. In the limit, scripting languages allow for arbitrary metaprogramming, although most PHP programs only make moderate use of dynamic features. Additionally, PHP web applications receive all user input in the form of strings, and many string manipulation and transformation functions may be applied to these values.
Second, for automatic test input generation to be useful, test oracles are needed that identify when common classes of errors have occurred. In C programs, memory errors account for several common classes of errors. Java has eliminated most memory errors, although Java programs may still have null-pointer dereference errors. PHP programs are entirely free of memory corruption errors (barring bugs in the interpreter). Hence, other kinds of test oracles are needed.
Some previous work on web application testing has focused on static webpages and the loosely structured control flow between them (defined by links), and other work has focused on the server-side code, often carrying over techniques from traditional testing. Early work on web applications focused primarily on static pages and the coverage metric was page-coverage.
Other testing techniques attempt to test the effects of input values on web applications, but they require interface specifications and cannot guarantee code coverage without extensive user interaction. In some cases automated techniques derive the interface specifications, and in others developers must provide them, but either way the testing system essentially performs fuzz testing that may be constrained by user-provided value specifications. Other testing mechanisms provide more reliable code coverage, but they repeatedly prompt the user for new inputs, so they sacrifice automation.
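As a rough illustration of fuzz testing constrained by a value specification, the following Python sketch draws random parameter values from a specified range and records which outcomes of a toy handler are reached. The specification format, the `handler` function, and the parameter name `quantity` are all hypothetical and are not taken from any of the systems discussed.

```python
import random
from typing import Dict, List, Set, Tuple

# Hypothetical interface specification: parameter name -> inclusive integer range.
Spec = Dict[str, Tuple[int, int]]


def fuzz_inputs(spec: Spec, count: int, seed: int = 0) -> List[Dict[str, int]]:
    """Generates `count` random parameter sets that respect the specification."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    return [
        {name: rng.randint(low, high) for name, (low, high) in spec.items()}
        for _ in range(count)
    ]


def handler(params: Dict[str, int]) -> str:
    """Toy request handler with one branch that is hard to reach at random."""
    if params["quantity"] > 10:
        if params["quantity"] == 42:
            return "rare"
        return "bulk"
    return "normal"


def covered_outcomes(spec: Spec, count: int, seed: int = 0) -> Set[str]:
    """Returns the set of handler outcomes reached by the generated inputs."""
    return {handler(params) for params in fuzz_inputs(spec, count, seed)}
```

With a specification such as `{"quantity": (0, 1000)}`, each draw reaches the `"rare"` branch with a probability of about one in a thousand, which is why specification-constrained fuzzing alone gives no coverage guarantee.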
Static analyses of web applications have also been proposed. However, they do not consider dynamically constructed string values, and thus they can only check whether raw user inputs flow into sensitive sinks.
All of the techniques known to the inventors have limited effectiveness, because PHP supports dynamic features, in which the runtime system interprets data values as code, and dynamic features inhibit static analysis. The standard dynamic features PHP provides allow string values to specify: the name of a file to include, the name of a variable to read/write, the name of a method to invoke, the name of a class to instantiate, and the string representation of code to execute. All of the static analyses for PHP described above either fail on dynamic features, treat them optimistically (i.e., ignore them), ask the user to provide a value for each one, or do some combination of the three. Many PHP applications use dynamic features extensively, for example, to implement dynamic dispatch for dynamically loaded modules or for database handling code. On such code, static analysis fails to produce useful results.
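To make the difficulty concrete, the sketch below shows Python analogues of several of these dynamic features. It does not reproduce PHP semantics, and the helper names `call_by_name`, `read_variable`, and `run_code` are hypothetical. In each case the entity that is loaded, read, called, or executed is known only once the string arrives at run time, which is what a static analysis cannot see.

```python
import importlib
from typing import Any, Dict


def call_by_name(module_name: str, function_name: str, *args: Any) -> Any:
    """Loads a module and calls (or instantiates) a member, both chosen by strings."""
    module = importlib.import_module(module_name)  # analogue of including a file by name
    member = getattr(module, function_name)        # analogue of a method or class name in a string
    return member(*args)


def read_variable(namespace: Dict[str, Any], variable_name: str) -> Any:
    """Reads a variable whose name is held in a string."""
    return namespace[variable_name]


def run_code(source: str, namespace: Dict[str, Any]) -> Any:
    """Evaluates the string representation of an expression.

    Builtins are removed only to keep the sketch small; this is not a sandbox.
    """
    return eval(source, {"__builtins__": {}}, namespace)


if __name__ == "__main__":
    print(call_by_name("math", "sqrt", 16.0))      # 4.0
    print(read_variable({"limit": 7}, "limit"))    # 7
    print(run_code("a + b", {"a": 2, "b": 3}))     # 5
```

An analysis that inspects only the source of `call_by_name` cannot tell which function will run; it must either know the possible string values or observe them during an execution.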
In most real-world PHP programs, however, the values of interpreted strings come only from trusted values such as constant strings within the PHP code, for example in a factory pattern; column names from a known database schema; or field names from a protected configuration file. In such cases, the values of interpreted strings depend only indirectly on user input, and for any given run, the predicates on user inputs are not dynamically constructed.