The present invention relates generally to software applications and, more particularly, to mechanisms for securing Web-based (Internet) applications. Note that the terms “Java,” “JavaCC,” and “JavaScript” used herein are trademarks of Sun Microsystems.
As more and more services are provided via the World Wide Web, both academia and industry are striving to create technologies and standards that meet the sophisticated requirements of today's Web applications and users. In many situations, security remains a major roadblock to universal acceptance of the Web for all kinds of transactions. According to one report, during 2002 there was an 81.5% increase in documented vulnerabilities overall, a large portion of which were vulnerabilities associated with Web applications. The report's authors pointed out that the driving force behind this trend is the rapid development and deployment of remotely exploitable Web applications.
Current technologies, such as anti-virus software programs and network firewalls, offer comparatively secure protection at the host and network levels, but not at the application level. However, when network and host-level entry points become relatively secure, the public interfaces of Web applications are likely to become the focus of security concerns.
Cross-site scripting (XSS) is perhaps the most common Web application vulnerability. FIG. 1(a) shows an example of an XSS vulnerability in an application written in PHP (PHP: Hypertext Preprocessor, one of the most widely used programming languages for Web application development). Values for the variables $month, $day, and $year in the application code of FIG. 1(a) come from HTTP requests and are used to construct HTML output sent to the user. Attackers seek to make victims open attacking URLs, i.e., URLs whose request parameters carry script rather than legitimate values. One strategy is to send an e-mail containing JavaScript that secretly launches a hidden browser window to open such a URL. Another is to embed the same JavaScript inside a Web page; when victims open the page, the script executes and secretly opens the URL. Once the PHP code shown in FIG. 1(a) receives an HTTP request for the URL, it generates the compromised HTML output shown in FIG. 1(b).
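The pattern described for FIG. 1(a) can be illustrated with a minimal sketch. It is written in Python rather than PHP and does not reproduce the figure; the function names, the HTML template, and the payload are assumptions chosen for illustration. The escaped variant shows one commonly used mitigation (output encoding) and is not presented as the approach of the present invention.

```python
from html import escape

def render_date_unsafe(month, day, year):
    # Request values are concatenated directly into the HTML output,
    # mirroring the vulnerable pattern described for FIG. 1(a).
    return "<p>Date: " + month + "/" + day + "/" + year + "</p>"

def render_date_escaped(month, day, year):
    # Hypothetical mitigation: HTML-encode every untrusted value before use.
    return "<p>Date: " + "/".join(escape(v) for v in (month, day, year)) + "</p>"

# Hypothetical value an attacking URL might carry in the "month" parameter.
payload = "<script>alert(document.cookie)</script>"

unsafe_html = render_date_unsafe(payload, "1", "2003")
escaped_html = render_date_escaped(payload, "1", "2003")

print(unsafe_html)   # the script tag reaches the browser intact
print(escaped_html)  # the script tag is rendered as inert text
```

In the unsafe variant the browser would execute the injected script as if the Web server had sent it; in the escaped variant the same input is displayed as text.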
The compromised output contains malicious script prepared by an attacker and delivered on behalf of a Web server. HTML output integrity is hence broken and the JavaScript Same Origin Policy is violated. Since the malicious script is delivered on behalf of the Web server, it is granted the same trust level as the Web server, which at minimum allows the script to read user cookies set by that server. This often reveals passwords or allows for session hijacking; if the Web server is registered in the Trusted Domain of the victim's browser, other rights (e.g., local file system access) may be granted as well.
Considered more severe than XSS attacks, SQL injection vulnerabilities occur when untrusted values are used to construct SQL commands, resulting in the execution of arbitrary SQL commands given by an attacker. An example of an SQL injection vulnerability is illustrated in FIG. 2. Therein, $HTTP_REFERER is used to construct an SQL command. The referer field of an HTTP request is an untrusted value given by the HTTP client; an attacker can set the field to:
        ');DROP TABLE ('users
This will cause the code in FIG. 2 to construct the $sql variable as:
        INSERT INTO tracking_temp VALUES('');
        DROP TABLE ('users');
Table “users” will be dropped when this SQL command is executed. This technique, which allows for the arbitrary manipulation of a backend database, is responsible for the majority of successful Web application attacks.
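The string construction described above can be sketched as follows, in Python rather than PHP. The exact statement in FIG. 2 is not reproduced here, so the query template is an assumption inferred from the constructed command; the function names and the in-memory SQLite database are likewise hypothetical. The second half shows a parameterized query, a commonly used mitigation that is not presented as the approach of the present invention.

```python
import sqlite3

def build_sql_unsafe(referer):
    # Assumed template: the untrusted referer is spliced into the SQL text.
    return "INSERT INTO tracking_temp VALUES('" + referer + "');"

# The attacker-controlled referer value described in the text.
malicious_referer = "');DROP TABLE ('users"
constructed_sql = build_sql_unsafe(malicious_referer)
print(constructed_sql)

def insert_referer_parameterized(conn, referer):
    # Hypothetical mitigation: the value is bound as data, never parsed as SQL.
    conn.execute("INSERT INTO tracking_temp VALUES (?)", (referer,))

# Illustrative in-memory database with the two tables named in the text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracking_temp (referer TEXT)")
conn.execute("CREATE TABLE users (name TEXT)")

insert_referer_parameterized(conn, malicious_referer)
stored = conn.execute("SELECT referer FROM tracking_temp").fetchone()[0]
users_table = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' AND name = 'users'"
).fetchone()
```

With the parameterized form, the malicious referer is stored verbatim as a string and the "users" table is left untouched.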
Yet another type of Web application vulnerability is general script injection. General script injections occur when untrusted data is used to call functions that manipulate system resources (e.g., in PHP: fopen( ), rename( ), copy( ), unlink( ), etc.) or processes (e.g., exec( )). FIG. 3 presents a simplified version of a general script injection vulnerability. Therein, the HTTP request variable “csvfile” is used as an argument to call fopen( ), which allows arbitrary files to be opened. A subsequent code section delivers the opened file to the HTTP client, allowing attackers to download arbitrary files.
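The file-opening pattern described for FIG. 3 can be sketched in Python as follows. The figure itself is not reproduced; the function names, directory layout, and file names are assumptions made for illustration. The confined variant shows one commonly used mitigation (restricting resolved paths to an allowed directory) and is not presented as the approach of the present invention.

```python
import os
import tempfile

def read_csv_unsafe(csvfile):
    # The request value is passed straight to open(), as with fopen() in the text.
    with open(csvfile) as f:
        return f.read()

def read_csv_confined(base_dir, csvfile):
    # Hypothetical mitigation: resolve the path and reject anything outside base_dir.
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, csvfile))
    if os.path.commonpath([base, target]) != base:
        raise ValueError("path escapes the allowed directory")
    with open(target) as f:
        return f.read()

# Illustrative layout: a public "reports" directory next to a private file.
root = tempfile.mkdtemp()
reports = os.path.join(root, "reports")
os.mkdir(reports)
with open(os.path.join(reports, "a.csv"), "w") as f:
    f.write("x,y")
with open(os.path.join(root, "secret.txt"), "w") as f:
    f.write("secret")

# An attacker-supplied "csvfile" value that climbs out of the reports directory.
traversal = os.path.join("..", "secret.txt")
leaked = read_csv_unsafe(os.path.join(reports, traversal))
```

Here the unsafe reader returns the contents of the private file, whereas the confined reader raises an error for the same input and still serves files inside the reports directory.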
The significance of these types of attacks is reflected in the recent burst of efforts that aim to improve Web application security via a number of different approaches. In their article “Abstracting Application-Level Web Security”, Proc. 11th Intl Conf World Wide Web (WWW2002), Honolulu, Hawaii, Scott and Sharp proposed the use of a gateway that filters invalid and malicious inputs at the application level. Additionally, most of the leading firewall vendors are using deep packet inspection technologies in their attempts to filter application-level traffic.
Although application-level firewalls offer immediate assurance of Web application security, they have at least three drawbacks. First, application-level firewalls offer protection at the cost of expensive runtime overhead. Second, careful configuration by very experienced security experts is required for application-level firewalls to function correctly and offer proper protection. Third, application-level firewalls do not identify vulnerabilities, and therefore do not help improve the actual security (or quality) of the Web application. Other techniques provide Web application security assessment frameworks that offer black-box testing (penetration testing) to identify Web application vulnerabilities. However, such testing processes may not identify all vulnerabilities, and they do not provide immediate security for Web applications.
Another possible mechanism for Web application security is software verification (static analysis), which identifies vulnerabilities of an application at compile time by analyzing source code. Software verification techniques avoid most of the limitations of application-level firewalls and black-box testing, but typically have their own drawbacks. Specifically, software verification techniques typically (1) cannot offer immediate protection (while, e.g., application-level firewalls can), (2) have a high false positive rate, (3) are not scalable and cannot handle large software programs, and (4) cannot offer counterexample traces, which are crucial in helping developers understand and fix the identified vulnerabilities.
Accordingly, it would be desirable to provide methods and systems which enable vulnerabilities in Web applications to be identified while at the same time providing immediate security for those Web applications, and overcoming the other limitations identified above.