Cross-site scripting (XSS) is a pervasive class of vulnerability. In one form, user input from a client is reflected, or echoed back as-is, in the hypertext markup language (HTML) output returned from a server. An attacker can exploit this type of vulnerability by crafting a specific payload that, when echoed back in the output, includes a script that the browser executes. Because the script is returned as part of the output, the script has access to all cookies and data related to the domain (for example, a web site) that returned the output. Sending the crafted payload to a remote victim causes the vulnerable site to return the malicious script, which can then interact fully with the vulnerable site, free of the restrictions that the same origin policy of the browser would otherwise apply.
The simplest solution for cross-site scripting is input validation that disallows any character or pattern that may result in a script action in the returned output. Input validation is performed using a positive security model, which defines what is allowed, a negative security model, which defines what is not allowed, or a combination of the two models. In every case, a request that does not meet the defined conditions is denied or modified.
In the negative security model, known attack patterns are typically defined for blocking. For example, input containing the value “<script>”, which is a known attack payload, can be blocked. The main challenge in this model is remaining current with all possible exploit patterns that exist, especially when using generic patterns. Negative security input validation will typically not achieve complete protection, and thus, if possible, developers are advised to use a positive security model.
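The limitation of the negative security model can be illustrated with a minimal sketch. The pattern list and the function name `is_blocked` below are hypothetical and chosen only for illustration; an actual deployment would carry a far larger rule set.

```python
import re

# Hypothetical blocklist of known attack patterns (illustrative only).
BLOCKED_PATTERNS = [
    re.compile(r"<script\b", re.IGNORECASE),
    re.compile(r"javascript:", re.IGNORECASE),
]


def is_blocked(value: str) -> bool:
    """Return True if the value matches any known attack pattern."""
    return any(pattern.search(value) for pattern in BLOCKED_PATTERNS)


print(is_blocked("John"))                                # False
print(is_blocked("John<script>hijack()</script>"))       # True
# A payload the blocklist does not anticipate is not detected:
print(is_blocked('John<img src=x onerror="hijack()">'))  # False
```

The third call shows why such a list is difficult to keep current: the payload carries script through an event handler attribute rather than a script tag, so neither pattern matches.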
In the positive security input validation model, a definition may specify, for example, that only alphanumeric characters are permitted, because a cross-site scripting issue is difficult to exploit using only alphanumeric characters. The main challenge with the positive security input validation model is the possibility of breaking the application. For example, the logic of the application may require the use of possibly dangerous characters, making it increasingly difficult to cover all possible cross-site scripting attack vectors without breaking the function of the application.
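A minimal sketch of the positive security model follows, assuming an alphanumeric-only rule. The name `is_allowed` and the sample inputs are hypothetical.

```python
import re

# Positive model: only ASCII letters and digits are permitted (assumed rule).
ALNUM_ONLY = re.compile(r"[A-Za-z0-9]+")


def is_allowed(value: str) -> bool:
    """Return True only if every character of the value is alphanumeric."""
    return ALNUM_ONLY.fullmatch(value) is not None


print(is_allowed("John"))                           # True
print(is_allowed("John<script>hijack()</script>"))  # False
# A legitimate value is rejected too, which may break the application:
print(is_allowed("O'Brien"))                        # False
```

The last call illustrates the trade-off described above: the rule blocks the attack payload, but it also rejects a legitimate name containing an apostrophe.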
An alternate solution for cross-site scripting is to sanitize the output by encoding. Encoding the input before it is added to the output ensures that the browser does not process the echoed input as a script. Because the echoed input can be added to the output in many different contexts, including as part of a returned attribute, as a value of a tag, within a comment, or within a script tag, the encoding of the input needs to match the specific context.
For example, consider a request with the following URL:
http://server/welcome.jsp?name=John
The example returns the name parameter, as-is, in two different contexts, in an output such as:
<html><body><H1>Hello, John</H1><script><![CDATA[var user = "John";//...]]></script></body></html>
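The server-side behavior that produces such an output can be sketched as follows. The function `render_welcome` is a hypothetical stand-in for welcome.jsp, not the actual page, and the CDATA wrapper is omitted for brevity.

```python
from urllib.parse import parse_qs, urlsplit


def render_welcome(url: str) -> str:
    """Hypothetical stand-in for welcome.jsp: echoes `name` as-is, twice."""
    query = parse_qs(urlsplit(url).query)
    name = query.get("name", [""])[0]
    # The same unencoded value lands in an HTML context and a script context.
    return (
        "<html><body><H1>Hello, " + name + "</H1>"
        '<script>var user = "' + name + '";//...</script></body></html>'
    )


print(render_welcome("http://server/welcome.jsp?name=John"))
```

Because the value is concatenated into both contexts without encoding, any markup or quote characters in the parameter become part of the page.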
To exploit the cross-site scripting in the “Hello” statement, the attacker would send a request such as:
http://server/welcome.jsp?name=John<script>hijack( )</script>
The payload is then echoed back in the response as:
<html><body><H1>Hello, John<script>hijack( )</script></H1><script><![CDATA[var user = "John<script>hijack( )</script>";//...]]></script></body></html>
The attack succeeds because of the first instance of echoed back content, in which the browser parses the injected script tag as markup; the value in the second instance remains a simple string value. The solution to this problem would be the use of the HTML encode technique, and specifically replacing the symbol “<” with the entity “&lt;”.
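A minimal sketch of the HTML encode technique for this first context is given below, using `html.escape` from the Python standard library. The function name `render_greeting` is hypothetical.

```python
import html


def render_greeting(name: str) -> str:
    """Echo the name inside an H1 element after HTML-encoding it."""
    # html.escape replaces &, < and > (and, by default, quote characters)
    # with the corresponding HTML entities.
    return "<H1>Hello, " + html.escape(name) + "</H1>"


print(render_greeting("John<script>hijack()</script>"))
# <H1>Hello, John&lt;script&gt;hijack()&lt;/script&gt;</H1>
```

The browser renders the encoded payload as visible text rather than parsing it as a script element.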
To exploit the cross-site scripting in the second case, an attacker would send a different request, for example:
http://server/welcome.jsp?name=John";hijack( )//
The payload is then echoed back in the response as:
<html><body><H1>Hello, John";hijack( )//</H1><script>var user = "John";hijack( )//";//...</script></body></html>
The attack in this example succeeds because of the second instance of echoed back content, in which the injected double-quote closes the string and the newly added JavaScript™ code follows it (Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates). The welcome text remains plain text. The solution to the problem of this example would be to use the STRING encode technique, which replaces each double-quote with \" (a backslash followed by a double-quote).
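The string encode technique for the script context can be sketched as follows. The names `js_string_encode` and `render_script` are hypothetical, and the escaping of backslashes, line breaks, and “<” goes beyond the double-quote replacement described above; those additions are assumptions made so that the sketch is not trivially bypassed.

```python
def js_string_encode(value: str) -> str:
    """Escape a value for use inside a double-quoted JavaScript string."""
    replacements = [
        ("\\", "\\\\"),  # backslash first, so later escapes are not doubled
        ('"', '\\"'),    # the replacement described in the text
        ("\n", "\\n"),   # assumption: line breaks would end the literal
        ("\r", "\\r"),
        ("<", "\\x3c"),  # assumption: keeps "</script>" from closing the element
    ]
    for old, new in replacements:
        value = value.replace(old, new)
    return value


def render_script(name: str) -> str:
    """Echo the name inside a script string after string-encoding it."""
    return '<script>var user = "' + js_string_encode(name) + '";//...</script>'


print(render_script('John";hijack()//'))
# <script>var user = "John\";hijack()//";//...</script>
```

With the double-quote escaped, the payload stays inside the string literal and the call to hijack() is never executed as code.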
The solutions just described are valid and well defined, but they require modification of the application code, which is a costly and sometimes difficult process, especially when dealing with legacy applications, third party modules, and outsourced applications. Code modification also takes time, leaving an already deployed system exposed until the code is fixed.
Automated tools, such as web application firewalls, address this problem by providing an input sanitization solution. Input sanitization typically uses negative security patterns, including elaborate regular expressions that attempt to rule out all possible payloads. This approach is useful but only a partial solution, because the possible variations of cross-site scripting attacks are practically infinite. The tools often offer general negative security rules stating which patterns are not allowed, and allow the user to specify more concrete security rules, but these rules are hard to maintain. Therefore, there is a need for an automated solution to injection attacks such as cross-site scripting attacks.
US 2008/0263650 discloses efficient cross-site attack prevention, in which web pages are stored on a site. The web pages are organized into entry pages that do not accept input and protected pages that are not entry pages.
US 2007/0113282 discloses a device for receiving and processing data content having at least one original function call, the device including a hook script generator and a script processing engine. The hook script generator is configured to generate a hook script having at least one hook function. Each hook function is configured to supersede a corresponding original function. The hook function provides runtime detection and control of the data content processing.
US 2004/0260754 discloses a method for mitigating cross-site scripting attacks. When an HTTP request is received from a user computer, the HTTP request is evaluated to determine whether it includes a script construct. In particular, data derived from an outside source that is included in the HTTP request is examined for the presence of script constructs. The presence of a script construct indicates that a cross-site scripting attack is being executed, and the server computer is able to prevent the attack from being carried out.