As is generally known in the field of computer security, null-byte injection is a type of computer network attack in which a text string supplied by a client contains an embedded character interpretable as having a numeric value of zero. Such a text string can be provided, for example, in the URI (Uniform Resource Identifier) of an HTTP (HyperText Transfer Protocol) GET request, in an HTTP header, in the message body of an HTTP POST request, or in a JSON (JavaScript Object Notation) or XML (eXtensible Markup Language) file.
The injected null-byte can either be a literal, un-encoded null-byte (a “naked” null-byte), or it can be obfuscated by encoding it in any of numerous ways, using one or more standard character encoding methods, such as C backslash-escape sequence encoding, URL (Uniform Resource Locator) percent encoding, and/or XML numeric character reference encoding. C backslash-escape sequences are used in C and C++, and also in many C-like languages, such as Java, C#, Perl, and Ruby, as well as in Adobe PostScript and Microsoft Rich Text Format. URL encoding is used in HTTP requests, both in the URI and in the message body (provided that the message's Content-Type header includes the string “application/x-www-form-urlencoded”). XML numeric character references are used in XML as well as in HTML (HyperText Markup Language), MathML (Mathematical Markup Language), and SGML (Standard Generalized Markup Language).
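The three encoding families named above can be illustrated with a minimal Python sketch. The helper names (decode_c_escape, decode_percent, decode_xml_ncr) and the sample strings are hypothetical and chosen only for illustration. The C-escape and character-reference decoders handle just enough syntax to expose a null-byte and are not complete parsers.

```python
import re
from urllib.parse import unquote


def decode_c_escape(text):
    """Decode only \\xHH and octal escapes; a real C lexer handles many more."""
    def repl(match):
        hex_digits, octal_digits = match.group(1), match.group(2)
        if hex_digits is not None:
            return chr(int(hex_digits, 16))
        return chr(int(octal_digits, 8))
    return re.sub(r"\\x([0-9A-Fa-f]{1,2})|\\([0-7]{1,3})", repl, text)


def decode_percent(text):
    """Decode URL percent encoding using the standard library."""
    return unquote(text)


def decode_xml_ncr(text):
    """Decode decimal (&#N;) and hexadecimal (&#xN;) character references."""
    def repl(match):
        hex_digits, dec_digits = match.group(1), match.group(2)
        if hex_digits is not None:
            return chr(int(hex_digits, 16))
        return chr(int(dec_digits))
    return re.sub(r"&#[xX]([0-9A-Fa-f]+);|&#([0-9]+);", repl, text)


# Hypothetical samples: each one hides a single null-byte between "a" and "z".
ENCODED_FORMS = [
    (decode_c_escape, "a\\0z"),
    (decode_c_escape, "a\\x00z"),
    (decode_percent, "a%00z"),
    (decode_xml_ncr, "a&#0;z"),
    (decode_xml_ncr, "a&#x0;z"),
]

for decoder, encoded in ENCODED_FORMS:
    decoded = decoder(encoded)
    print(decoder.__name__, repr(encoded), "->", repr(decoded))
```

Each sample decodes to the same three-character string containing a literal null-byte, although none of the encoded forms contains one.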
Problems arise when a null-byte is interpreted as an end-of-string sentinel in some phases of processing but not in others. Null-byte injection is often used by attackers to bypass sanity checks, which can result in service outages or lead to various business-logic exploits. For example, if the portion of a string before the first null-byte is a valid value for the variable being specified, an input validator interpreting the string as null-terminated will let it pass, ignoring any potentially malicious payload following that null-byte. If the space allocated for the string is determined by the apparent sentinel position of the null-byte, storing the entire string will cause a buffer overflow. If the null-byte is ignored when the string is evaluated, the text may yield an out-of-range value. And if the payload is interpreted or executed in a later processing phase, it can be used for a code-injection attack.
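The mismatch between a null-terminated reading and a length-aware reading of the same input can be sketched as follows. The allow-list, the function names, and the submitted value are all hypothetical. Python itself does not overflow buffers, so the final lines only compare the size a sentinel-based allocation would reserve with the number of bytes that would actually be stored.

```python
ALLOWED_NAMES = {b"report.pdf", b"invoice.pdf"}  # hypothetical allow-list


def null_terminated_view(data):
    """Return what a reader that stops at the first null-byte would see."""
    return data.split(b"\x00", 1)[0]


def naive_validator(data):
    """Validate only the part before the first null-byte."""
    return null_terminated_view(data) in ALLOWED_NAMES


def length_aware_validator(data):
    """Reject any embedded null-byte, then validate the whole value."""
    return b"\x00" not in data and data in ALLOWED_NAMES


# Hypothetical input: a valid name, a null-byte, then a trailing payload.
submitted = b"report.pdf" + b"\x00" + b"<payload>"

print(naive_validator(submitted))         # the payload is never inspected
print(length_aware_validator(submitted))  # the embedded null-byte is rejected

# Size a sentinel-based allocation would reserve versus bytes actually stored.
buffer_size = len(null_terminated_view(submitted)) + 1
bytes_to_store = len(submitted)
print(buffer_size, bytes_to_store)
```

The naive validator accepts the input because it never looks past the null-byte, while a later phase that handles the full byte sequence receives the payload as well.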
Improperly validated out-of-range values can cause service outages or yield business-logic exploits in a variety of specific ways. For example, an out-of-range value may cause a program to crash or otherwise behave unexpectedly. A negative monetary value may result in money being credited to an account instead of debited from it. A quantity of zero may cause a program to crash on an attempted division by zero or logarithm of zero, or may propagate an uncaught infinity or NaN (Not a Number) through later computations.
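As a minimal sketch of how an embedded null-byte can turn a validated value into an out-of-range one, assume a hypothetical quantity field limited to 100. The validator below reads the field as null-terminated, while a later phase ignores the null-byte. The limit, the function names, and the input are illustrative assumptions only.

```python
MAX_QUANTITY = 100  # hypothetical business limit


def validate_quantity(raw):
    """Range-check only the digits before the first null-byte."""
    visible = raw.split("\x00", 1)[0]
    return visible.isdigit() and 1 <= int(visible) <= MAX_QUANTITY


def evaluate_quantity(raw):
    """A later phase that drops null-bytes before parsing the number."""
    return int(raw.replace("\x00", ""))


# Hypothetical input: "1", a null-byte, then four more zeros.
raw = "1" + "\x00" + "0000"

print(validate_quantity(raw))  # the validator sees only "1"
print(evaluate_quantity(raw))  # the later phase parses "10000"
```

The validator approves the value because it sees a quantity of 1, but the phase that ignores the null-byte computes with 10000, which lies well outside the permitted range.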
Buffer overflows can also cause a variety of specific service outages or business-logic exploits. In the simplest case, a buffer overflow may cause a server application to crash. If the buffer is on the program stack, the overflow may corrupt the stack, causing wildly unexpected behavior.
At worst, in the case of a code-injection attack on a von Neumann-architecture machine, the data beyond the null-byte may be interpreted as code, giving the attacker control over the application.
By employing different permutations (including repetitions) of a set of character encoding methods to disguise embedded null-bytes, an attacker can potentially smuggle a malicious text through various processing phases to attack targets deep inside a network. In particular, in an advanced-persistent-threat scenario in which an attacker has acquired sufficient intelligence to simulate the target network's operation in detail, a strategically encoded null-byte in a text string, introduced through a well-chosen vulnerable point of entry, could potentially guide an appropriately designed payload to any desired point in the network in order to hijack or otherwise affect any desired service at any desired time.
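The effect of repeated encoding can be sketched with URL percent encoding alone. In the hypothetical example below, a filter that decodes once and then looks for a null-byte misses a doubly encoded one, which only appears after a second decoding pass in some later phase. The function names and sample strings are assumptions made for illustration, and a real attack could mix several encoding methods rather than repeat one.

```python
from urllib.parse import unquote


def single_pass_check(text):
    """Decode once, then report whether a null-byte is visible."""
    return "\x00" in unquote(text)


def layers_until_nul(text, max_rounds=8):
    """Return how many decoding passes expose a null-byte, or None if none do."""
    current = text
    for rounds in range(max_rounds + 1):
        if "\x00" in current:
            return rounds
        decoded = unquote(current)
        if decoded == current:  # nothing left to decode
            return None
        current = decoded
    return None


# Hypothetical samples: "%25" is the percent sign itself, percent-encoded.
for sample in ["a%00z", "a%2500z", "a%252500z", "plain"]:
    print(repr(sample), single_pass_check(sample), layers_until_nul(sample))
```

Only the singly encoded sample is caught by the single-pass check. The doubly and triply encoded samples pass it unchanged in appearance and reveal their null-byte after two and three passes, respectively.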