Communications networks, such as the Internet, are frequently the targets of sophisticated attacks by unauthorized intruders seeking to cause harm to the computers of unsuspecting users. For example, worms and viruses are well-known causes of security breaches in computer systems. These constitute malicious data sent to a service or an application to exploit a vulnerability (such as a buffer overflow that provides root access to the worm's executable program), causing the service or application to be disabled, to crash, or to provide unauthorized privileges to an attacker.
Other attacks or computer security vulnerabilities include web layer code injections, such as cross-site scripting (XSS) attacks, PHP local/remote file inclusion (L/RFI) attacks, and Structured Query Language (SQL) injection attacks. These web layer code injection attacks target web applications and take advantage of programming flaws to manipulate the program's behavior, thereby allowing the attacker to manipulate code and data on the target. While the server is the victim of the code injection, the targets often include the viewers or users that access that server as well. Operators of compromised websites often discover embedded malicious code that redirects their viewers to malicious destinations, where such viewers are exposed to further exploits. For example, it has been estimated that over sixty percent of websites have a critical security flaw or vulnerability, that about sixty-three percent of websites have an XSS vulnerability, and that about seventeen percent of websites are likely to be vulnerable to a SQL injection attack. In addition, it has been estimated that there is an average of seven unfixed vulnerabilities in a given website.
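By way of illustration only, the following minimal Python sketch shows how a programming flaw of the kind described above can allow injected input to manipulate a query. The table layout, the function names (build_db, lookup_vulnerable, lookup_parameterized), and the payload are hypothetical and are not drawn from any particular web application; the sketch uses only Python's standard sqlite3 module.

```python
import sqlite3


def build_db():
    # Hypothetical in-memory table standing in for a web application's data store.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_vulnerable(conn, name):
    # Programming flaw: user input is concatenated directly into the SQL text,
    # so quote characters in the input become part of the query itself.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def lookup_parameterized(conn, name):
    # The same lookup with the input bound as a parameter; the input is
    # treated purely as data and cannot alter the query structure.
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


conn = build_db()
payload = "nobody' OR '1'='1"  # hypothetical injected input

leaked = lookup_vulnerable(conn, payload)    # condition is always true
safe = lookup_parameterized(conn, payload)   # no user has this literal name
print(sorted(leaked), safe)
```

In this sketch the concatenated query returns every stored row for a name that matches no user, whereas the parameterized form returns nothing. Note that the injected input is a short string of interpreted code and involves no memory corruption on the server.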
Existing intrusion detection approaches typically fall into two categories: detecting known malicious code and detecting legitimate input. In general, detection approaches that rely on signatures, such as Snort, are effective at filtering out known exploits, but cannot defend against previously unseen attacks. Moreover, in a web environment, where hundreds of thousands of unique attacks are generated each day and polymorphism is common, the usefulness of signature-based detection approaches is limited. On the other hand, anomaly detection approaches suffer because they are limited to network layer, protocol-agnostic modeling, which is constrained in scope and vulnerable to packet fragmentation and blending attacks. Unlike shellcode and worm traffic, web layer code injections use higher level interpreted code and do not require corruption of the server's control flow at the memory layer. Web layer exploits are smaller, more dynamic, and far less complex than shellcode, thereby making them easier both to create and to disguise. Anomaly-based classifiers can recognize new behavior, but are often unable to distinguish between previously unseen good behavior and previously unseen bad behavior. This results in a high false positive rate, even with extensively trained classifiers.
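The trade-off described above can be sketched, under simplifying assumptions, with two toy detectors in Python. The signature list, the character-set anomaly model, and all function names and request strings below are hypothetical illustrations; they are not Snort's rule language and do not represent any existing classifier.

```python
import re

# Hypothetical signatures for two known exploit patterns.
SIGNATURES = [
    re.compile(r"<script>", re.IGNORECASE),
    re.compile(r"union\s+select", re.IGNORECASE),
]


def signature_alert(request):
    # Alerts only when the request matches a known-bad pattern.
    return any(sig.search(request) for sig in SIGNATURES)


def train_charset(benign_requests):
    # Toy anomaly model: the set of characters observed in benign traffic.
    seen = set()
    for request in benign_requests:
        seen.update(request)
    return seen


def anomaly_alert(request, seen_chars):
    # Alerts on any character never observed during training.
    return any(ch not in seen_chars for ch in request)


model = train_charset(["q=shoes", "q=red+shirt", "q=blue+jeans"])

known_attack = "q=<script>alert(1)</script>"
unseen_attack = "q=<img src=x onerror=alert(1)>"
unseen_benign = "q=O'Brien+%26+Sons"

for label, req in [
    ("known attack", known_attack),
    ("unseen attack", unseen_attack),
    ("unseen benign", unseen_benign),
]:
    print(label, signature_alert(req), anomaly_alert(req, model))
```

In this sketch the signature detector alerts on the known attack but not on the unseen variant, while the anomaly detector alerts on the unseen attack and on the unseen benign request alike, the latter being a false positive.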
Accordingly, it is desirable to provide systems, methods, and media for detecting network anomalies that overcome these and other deficiencies of the prior art.