1. Technical Field
The present invention relates generally to techniques for preventing input injection attacks on web applications, and more specifically to techniques that use dynamic tainting to prevent injection attacks in web applications.
2. Prior Art
Web applications are becoming an essential part of our everyday lives. As web applications become more complex, the number of programming errors and security holes in them increases, putting users at increasing risk. The scale of web applications has reached the point where security flaws resulting from simple input validation errors have become the most critical threat to web application security. Injection vulnerabilities such as cross site scripting and SQL injection rank as the top two most critical web application security flaws in the OWASP (Open Web Application Security Project) top ten list [25].
Researchers have proposed many techniques against web injection attacks. Dynamic tainting techniques [9, 11, 23, 24, 26, 27, 38] have the most similarity to our technique. Dynamic tainting techniques are runtime analysis techniques that generally mark every string within a program with taint variables and propagate those marks across execution. Attacks are detected when a tainted string is used as a sensitive value. As discussed herein, the difference between our technique and traditional dynamic tainting techniques is that complementary character coding provides character level taint propagation across component boundaries of web applications without the need for code instrumentation and its overhead. Another difference is that while previous dynamic tainting techniques implement taint sinks using code instrumentation to detect attacks, our technique delegates enforcement of the security policy to the parser of each component.
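By way of illustration only, the following minimal sketch shows the general idea of conventional dynamic tainting with a taint sink. It is not taken from any of the cited works. The names TaintedStr and sql_sink, and the simplified policy that tainted characters may appear only inside single-quoted literals, are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TaintedStr:
    """Hypothetical string wrapper carrying one taint bit per character."""
    text: str
    taint: Tuple[bool, ...]

    @staticmethod
    def trusted(s: str) -> "TaintedStr":
        # Developer-written text: no character is tainted.
        return TaintedStr(s, (False,) * len(s))

    @staticmethod
    def untrusted(s: str) -> "TaintedStr":
        # User-supplied text: every character is tainted.
        return TaintedStr(s, (True,) * len(s))

    def __add__(self, other: "TaintedStr") -> "TaintedStr":
        # Taint propagation: concatenation keeps each character's mark.
        return TaintedStr(self.text + other.text, self.taint + other.taint)


def sql_sink(query: TaintedStr) -> str:
    """Hypothetical taint sink with an assumed, simplified policy:
    tainted characters may only be data inside a single-quoted literal."""
    in_literal = False
    for ch, tainted in zip(query.text, query.taint):
        if ch == "'":
            if tainted:
                # A tainted quote would act as a delimiter, i.e. as syntax.
                raise ValueError("tainted quote used as SQL syntax")
            in_literal = not in_literal
        elif tainted and not in_literal:
            raise ValueError("tainted character used as SQL syntax")
    return query.text
```

Under this assumed policy, a query built from a trusted template and the untrusted input alice passes the sink, while the untrusted input x' OR '1'='1 is rejected at its first quote. The policy is deliberately strict and would also reject legitimate inputs containing an apostrophe or unquoted numeric inputs. In a conventional implementation, the wrapping and the sink check are inserted by code instrumentation.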
Sekar proposed a technique of black-box taint inference to address some of the limitations of dynamic tainting [28], where the input/output relations of components are observed and maintained to prevent attacks. Su and Wassermann provided a formal definition of input injection attacks and developed a technique to prevent them that involves comparing parse trees [30]. Bandhakavi, Bisht, Madhusudan, and Venkatakrishnan developed CANDID [3], a dynamic approach to detect SQL injection attacks where candidate clones of a SQL query, one with user inputs and one with benign values, are executed and their parse trees are compared. Louw and Venkatakrishnan proposed a technique to prevent cross site scripting [20] where the application sends two copies of output HTML to a web browser for comparison, one with user inputs and one with benign values. Bisht and Venkatakrishnan proposed a technique called XSS-GUARD [4], in which shadow pages and their parse trees are compared at the server. Buehrer, Weide, and Sivilotti developed a technique that involves comparing parse trees [6] to prevent SQL injection attacks.
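By way of illustration only, the comparison of a query built from user input against a query built from a benign value can be sketched as follows. This is a simplified stand-in rather than the method of any cited work: it compares flat token sequences instead of full parse trees, and the names structure, build_query, and is_injection, as well as the users table, are assumptions made for this example.

```python
import re

# Very small SQL tokenizer, sufficient only for this illustration.
TOKEN_RE = re.compile(
    r"""
    (?P<str>'(?:[^']|'')*')              # single-quoted literal
  | (?P<num>\d+)                         # integer literal
  | (?P<word>[A-Za-z_][A-Za-z_0-9]*)     # keyword or identifier
  | (?P<op>[^\sA-Za-z_0-9])              # any other single symbol
    """,
    re.VERBOSE,
)


def structure(query: str) -> list:
    """Return the query's shape: literals are reduced to their kind,
    while words and symbols are kept (upper-cased)."""
    shape = []
    for match in TOKEN_RE.finditer(query):
        kind = match.lastgroup
        if kind in ("word", "op"):
            shape.append(match.group().upper())
        else:
            shape.append(kind)
    return shape


def build_query(name: str) -> str:
    # Hypothetical vulnerable query construction by string concatenation.
    return "SELECT * FROM users WHERE name = '" + name + "'"


def is_injection(user_input: str) -> bool:
    """Compare the real query against a shadow query with a benign value."""
    benign = "a" * len(user_input)
    return structure(build_query(user_input)) != structure(build_query(benign))
```

With these assumptions, is_injection("alice") is false because both queries have the same shape, while is_injection("x' OR '1'='1") is true because the input introduces an additional OR condition that the benign query lacks.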
Static techniques [2, 10, 13, 16, 19, 31, 34, 35] apply various static code analysis methods to locate sources of injection vulnerabilities in code. The results are either reported as output or used to instrument the code with monitors for runtime protection. Because of the inherently imprecise nature of static code analysis, these techniques are limited by false positives. They also suffer from scaling problems when run on real world applications. Techniques that involve machine learning [12, 33] are likewise inherently limited by false positives, and their effectiveness depends on their training sets. Martin, Livshits, and Lam developed PQL [21], a program query language that developers can use to find answers about injection flaws in their applications, and suggested that static and dynamic techniques can be developed to solve these queries.
Boyd and Keromytis developed a technique called SQLrand [5] to prevent SQL injection attacks based on instruction set randomization. SQL keywords are randomized at the database level so that attacks from user input become syntactically incorrect SQL statements. A proxy is set up between the web server and the database to perform de-randomization of these keywords using a key. Van Gundy and Chen proposed a technique against cross site scripting called Noncespaces [8], which is also based on instruction set randomization. Nadji, Saxena, and Song developed a technique against cross site scripting called Document Structure Integrity [22] by incorporating dynamic tainting at the application and instruction set randomization at the web browser. Kirda, Kruegel, Vigna, and Jovanovic developed Noxes [18], a client-side, firewall-based approach that uses special rules to detect possible cross site scripting attacks. Jim, Swamy, and Hicks proposed a cross site scripting prevention technique called browser enforced embedded policies [15], where a web browser receives instructions from the server about which scripts it should or should not run.
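By way of illustration only, the idea of keyword randomization with a de-randomizing proxy can be sketched as follows. This is a simplified approximation and not the SQLrand implementation: the keyword set, the scheme of appending a numeric key to each keyword, and the function names randomize and derandomize are assumptions made for this example, and the sketch scans words with a regular expression rather than parsing SQL.

```python
import re

# Assumed, deliberately small keyword set for the illustration.
KEYWORDS = {"SELECT", "FROM", "WHERE", "OR", "AND", "UNION"}
WORD_RE = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")


def randomize(template: str, key: str) -> str:
    """Append the secret key to every keyword in a developer-written template."""
    def mark(match):
        word = match.group()
        return word + key if word.upper() in KEYWORDS else word
    return WORD_RE.sub(mark, template)


def derandomize(query: str, key: str) -> str:
    """Proxy step: restore keyed keywords and reject any bare keyword,
    since a bare keyword can only have come from outside the template."""
    def restore(match):
        word = match.group()
        if word.upper() in KEYWORDS:
            raise ValueError("unrandomized keyword: " + word)
        if key and word.endswith(key) and word[: -len(key)].upper() in KEYWORDS:
            return word[: -len(key)]
        return word
    return WORD_RE.sub(restore, query)
```

For example, with the key 123 the template SELECT * FROM users WHERE name = '{}' becomes SELECT123 * FROM123 users WHERE123 name = '{}'. After a benign name is substituted, derandomize restores standard SQL. An input such as x' OR '1'='1 carries a bare OR and is rejected. Because this sketch does not distinguish string literals from syntax, it would also reject a benign value that happens to be a keyword.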
Currently, web applications are vulnerable to injection attacks, such as SQL injection and cross site scripting, in which malicious users enter inputs that are interpreted as executable code by some web component. Such attacks can lead to corruption of databases or theft of sensitive information. These vulnerabilities rank among the top security problems.
Current practice requires application developers to check and sanitize inputs to guard against injection attacks. This practice is very error prone. Several research efforts have attacked the problem with techniques such as static analysis and dynamic tainting. However, these techniques have various limitations, as described above.
Thus it can be seen that improved and new methods for preventing the effect of injection attacks on web applications are desirable.