Web applications must be accessible to users yet impervious to attack, whether from malicious hackers or from unwitting users whose desktop computers have been compromised by worms and viruses.
Consider a web application, such as online banking, that is accessible to a large number of users. The application infrastructure is installed at a data center and shielded from the Internet by a network firewall. The network firewall blocks traffic on all TCP/IP (Transmission Control Protocol/Internet Protocol) ports except those that carry HTTP (HyperText Transfer Protocol) and HTTPS (HTTP carried over a Secure Sockets Layer connection) traffic, typically ports 80 and 443. Malicious attackers may nonetheless mount attacks via HTTP or HTTPS, and the network firewall cannot protect against those because it must let that traffic through. In addition, users with compromised desktops can inadvertently attack the application when they visit it. In either case, the operator of the application must take steps to protect the application from attack.
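The limitation described above can be illustrated with a minimal sketch. It assumes a firewall that decides purely on destination port; the function name `firewall_allows`, the host `bank.example`, and the sample requests are hypothetical and are not taken from any real product.

```python
# Hypothetical port-only filter: ports 80 (HTTP) and 443 (HTTPS) are open,
# everything else is blocked, as in the scenario described in the text.
ALLOWED_PORTS = {80, 443}


def firewall_allows(dest_port: int, payload: bytes) -> bool:
    """Decide on the destination port alone; the payload is never inspected."""
    return dest_port in ALLOWED_PORTS


# An ordinary request and an oversized, suspicious one (both illustrative).
benign = b"GET /accounts HTTP/1.1\r\nHost: bank.example\r\n\r\n"
suspicious = (
    b"GET /accounts?id=" + b"A" * 5000 + b" HTTP/1.1\r\nHost: bank.example\r\n\r\n"
)

for name, request in (("benign", benign), ("suspicious", suspicious)):
    # Both requests pass because both arrive on an open port.
    print(name, firewall_allows(80, request))
```

Because the decision never looks at the request itself, anything sent to port 80 or 443 reaches the application, which is why a further layer of protection is needed.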
Some of these attacks may be infrastructure attacks; that is, they target vulnerabilities in the infrastructure of the application. For example, the web server running the application may have vulnerabilities that attackers can exploit. This was the case with recent worms such as CodeRed and Nimda. In other cases, the application itself could have vulnerabilities. For example, requesting a malformed URL (Uniform Resource Locator) of an application could cause the application to become unstable or could expose confidential information to unauthorized access. Protecting the application and/or its infrastructure from these sorts of attacks is the subject of many commercial products, such as the ones from Teros, Sanctum and F5. These and other products form a broad class called application firewalls.
An application firewall must be able to distinguish between authorized access and unauthorized access, and between a valid URL and an invalid URL. Current technology aims to distinguish between valid and invalid URLs by employing a training phase and a protection mode. During the training phase, the system learns a valid range of values of URLs, including parameters and cookie values associated therewith. Subsequently, during the protection mode, any URL that falls outside the learned range is denied. This approach is prone to false positives: because it is impossible to capture the full range of valid URLs in any reasonable training period, legitimate users requesting legitimate URLs may be denied access.
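The training-then-protection approach, and the false positives it produces, can be sketched as follows. This is an illustration under stated assumptions, not any vendor's implementation: the class `LearnedUrlModel` is hypothetical, the "learned range" is modeled simply as the set of parameter values seen per path, and cookie values are omitted for brevity.

```python
from urllib.parse import parse_qsl, urlsplit


class LearnedUrlModel:
    """Hypothetical model: remembers the parameter values seen for each path."""

    def __init__(self) -> None:
        # path -> {parameter name -> set of values observed during training}
        self.allowed: dict[str, dict[str, set[str]]] = {}

    def train(self, url: str) -> None:
        """Training phase: record the path and every parameter value."""
        parts = urlsplit(url)
        params = self.allowed.setdefault(parts.path, {})
        for name, value in parse_qsl(parts.query, keep_blank_values=True):
            params.setdefault(name, set()).add(value)

    def is_allowed(self, url: str) -> bool:
        """Protection mode: deny anything outside what was learned."""
        parts = urlsplit(url)
        params = self.allowed.get(parts.path)
        if params is None:
            return False  # path never seen during training
        for name, value in parse_qsl(parts.query, keep_blank_values=True):
            if value not in params.get(name, set()):
                return False  # unseen parameter or unseen value
        return True


model = LearnedUrlModel()
for seen in ("/transfer?amount=100", "/transfer?amount=250"):
    model.train(seen)

print(model.is_allowed("/transfer?amount=100"))  # seen in training
print(model.is_allowed("/transfer?amount=175"))  # legitimate but never seen
```

In this sketch the second request is perfectly legitimate, yet it is denied because the value 175 never appeared during training. A real system would generalize more cleverly than an exact-match set, but the same gap between what was observed and what is valid remains.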
HyperText Transfer Protocol (HTTP) is the underlying protocol used by the World Wide Web. HTTP defines how messages are formatted and transmitted, and what actions Web servers and browsers should take in response to various commands. For example, when a user enters a URL in a browser, the browser sends an HTTP command to a Web server directing it to fetch and transmit the requested Web page. HTTP is called a stateless protocol because each command is executed independently, without any knowledge of the commands that came before it.
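Statelessness can be made concrete with a small sketch. The helper names `build_request` and `handle`, the host `www.example.com`, and the response body are hypothetical; the handler is a toy that only parses the request line, not a real Web server.

```python
def build_request(path: str, host: str) -> str:
    """Format the HTTP GET command a browser would send for a URL."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"


def handle(request: str) -> str:
    """Toy server: the response depends only on the request text itself."""
    request_line = request.split("\r\n", 1)[0]
    method, path, version = request_line.split(" ")
    body = f"You asked for {path}"
    return f"HTTP/1.1 200 OK\r\nContent-Length: {len(body)}\r\n\r\n{body}"


request = build_request("/index.html", "www.example.com")
first = handle(request)
second = handle(request)

# No memory of earlier commands exists, so identical requests
# produce identical responses.
print(first == second)
```

Since `handle` keeps no record between calls, any notion of a session has to be carried in the request itself, for example in the URL parameters or cookie values mentioned earlier.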