1. Field of Invention
The present invention relates generally to data type inferencing of scalar objects and, more particularly, to an application of data type inferencing to message components in order to generate rules that block illegitimate messages from passing through application proxies and gateways.
2. Background of the Invention
Corporations are rapidly deploying web-based applications to automate business processes and to facilitate real-time interaction with customers, business partners and employees. Because they are highly vulnerable to malicious hackers, web applications provide an entry point through which sensitive data can be accessed and stolen. Given this vulnerability, establishing web application protection is critical for any enterprise that exposes sensitive data or transaction systems over the Internet.
Firewalls are an essential component in a corporate entity's network security plan. They represent a security enforcement point that separates a trusted network from an untrusted network. Firewalls determine which traffic should be allowed and which traffic should be disallowed based on a predetermined security policy.
Firewall systems designed to protect web applications are known. They are commonly implemented as application proxies or application gateways. An application proxy is an application program that runs on a firewall system between two networks and acts as an intermediary between a web client and a web server. When client messages are received at the firewall, the final server destination address is determined by the application proxy software. The application proxy translates the address, performs additional access control checking, and connects to the server on behalf of the client. An application proxy authenticates users and determines whether user messages are legitimate.
Application proxies and gateways, however, are not designed to determine whether messages and their components are of a proper data type. This functionality is expected to be performed by web applications. For example, when an XML message is sent across a network, a web application extracts components of the message, typically in the form of field name-value pairs. The web application then determines whether the values are of an expected data type using type-checking rules as is well known in the art. If the web application determines that a value is not of a valid data type, as specified by the application, the web server does not process the message.
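The application-side check described above can be sketched as follows. This is a minimal illustration only, not a method taken from any particular web application: the field names, the `EXPECTED_TYPES` table, and the helper functions are hypothetical, and Python's built-in `int`, `float` and `str` constructors stand in for whatever type-checking rules an application actually uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-field expectations; the field names are assumptions
# made for illustration and do not come from the source text.
EXPECTED_TYPES = {"user_id": int, "amount": float, "comment": str}


def extract_fields(xml_text):
    """Return the child elements of the message as field name-value pairs."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "") for child in root}


def is_valid(value, expected):
    """Report whether the text value can be converted to the expected type."""
    try:
        expected(value)
        return True
    except ValueError:
        return False


def validate_message(xml_text):
    """Return the names of fields whose values fail their type check."""
    fields = extract_fields(xml_text)
    return [
        name
        for name, value in fields.items()
        if name in EXPECTED_TYPES and not is_valid(value, EXPECTED_TYPES[name])
    ]


good = "<order><user_id>42</user_id><amount>19.99</amount></order>"
bad = "<order><user_id>42 OR 1=1</user_id><amount>19.99</amount></order>"
print(validate_message(good))
print(validate_message(bad))
```

In this sketch a message is rejected when `validate_message` returns a non-empty list. The check runs inside the application, after the message has already passed any proxy or gateway in front of it.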
Existing methods that rely on verifying data types of message components, however, have a number of shortcomings. First, since application proxies and gateways are not configured to verify the data types of message components, messages containing components of an invalid data type are still allowed to pass through application proxies and/or security gateways. Since such messages are more likely to represent malicious attacks than legitimate requests, the failure of application proxies and gateways to verify data types allows illegitimate traffic to reach web servers.
Second, the schemes used by a web server to verify data types of message components are manually provided by developers. As a result, security policies are not always consistently applied to all incoming traffic received by web applications. Failure to check for valid data types of incoming traffic can result in buffer overflows or other application security breaches. Further, manually provided schemes may not reflect the dynamic behavior of the application traffic. In addition, the security policies are often overly broad. For example, if a policy specifies that “String” is the data type of a password field, then any alphanumeric value entered into the password field will satisfy this data type. However, if a policy specifies that “INT” is the data type of the password field, then only integers can be entered into the password field. Thus, data type “String” is a broader (or less restrictive) data type than “INT”. Requiring the data type of the password field to be “INT” instead of “String” reduces the number of options available to an intruder attempting to guess a value of the proper data type for the password field. Requiring the data type of the password field to be “String”, in contrast, provides ample opportunities for an intruder to supply a value that satisfies the data type of the password field, thereby increasing the number of illegitimate messages entering a web application.
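The difference in restrictiveness between “String” and “INT” can be made concrete with a short sketch. It is illustrative only and is not presented as the technique of the invention: the regular expressions, the two-entry ordering in `TYPE_PATTERNS`, and the function names are assumptions chosen for the example, with “String” modeled as accepting any text.

```python
import re

# Ordered from narrowest to broadest. The patterns are illustrative
# assumptions: "INT" is an optionally signed run of digits, and
# "String" is modeled as accepting any text at all.
TYPE_PATTERNS = [
    ("INT", re.compile(r"[+-]?\d+")),
    ("String", re.compile(r".*", re.DOTALL)),
]


def accepts(type_name, value):
    """Report whether the named data type accepts the given value."""
    pattern = dict(TYPE_PATTERNS)[type_name]
    return pattern.fullmatch(value) is not None


def narrowest_type(values):
    """Return the first (narrowest) type that accepts every observed value."""
    for name, pattern in TYPE_PATTERNS:
        if all(pattern.fullmatch(v) for v in values):
            return name
    return None


# Every INT value is also a String, but not the other way around.
print(accepts("INT", "1234"), accepts("String", "1234"))
print(accepts("INT", "hunter2"), accepts("String", "hunter2"))
print(narrowest_type(["1234", "007"]))
print(narrowest_type(["1234", "hunter2"]))
```

Under these assumptions, a field whose observed values are all digits is assigned “INT”, so a value such as `hunter2` would be rejected. The same field typed as “String” would accept it.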
Accordingly, what is needed is a technique that automatically learns characteristics of the application behavior to generate rules that would prevent malicious traffic from passing through application proxies and gateways. What is also needed is a technique that defines a data type of a message component as narrowly as possible, thereby making it more difficult for an intruder to perpetrate an attack while conforming to the proper data type of a message component.