The Open Systems Interconnection (OSI) Reference Model defines seven network protocol layers (L1-L7) used to communicate over a transmission medium. The upper layers (L4-L7) represent end-to-end communications and the lower layers (L1-L3) represent local communications.
Networking application-aware systems need to process, filter and switch a range of L3 to L7 network protocol layers, for example, L7 network protocol layers such as HyperText Transfer Protocol (HTTP) and Simple Mail Transfer Protocol (SMTP), and L4 network protocol layers such as Transmission Control Protocol (TCP). In addition to processing the network protocol layers, the networking application-aware systems need to simultaneously secure these protocols with access-based and content-based security through L4-L7 network protocol layers, including Firewall, Virtual Private Network (VPN), Secure Sockets Layer (SSL), Intrusion Detection System (IDS), Internet Protocol Security (IPSec), Anti-Virus (AV) and Anti-Spam functionality, at wire-speed.
Network processors are available for high-throughput L2 and L3 network protocol processing, that is, performing packet processing to forward packets at wire-speed. Typically, a general purpose processor is used to process L4-L7 network protocols that require more intelligent processing. Although a general purpose processor can perform these compute-intensive tasks, it does not provide sufficient performance to process the data so that it can be forwarded at wire-speed.
Content-aware networking requires inspection of the contents of packets at wire-speed. The content may be analyzed to determine whether there has been a security breach or an intrusion. A large number of patterns and rules in the form of regular expressions are applied to ensure that all security breaches or intrusions are detected. A regular expression is a compact method for describing a pattern in a string of characters. The simplest pattern matched by a regular expression is a single character or string of characters, for example, ‘c’ or ‘cat’. A regular expression may also include operators and meta-characters that have a special meaning.
Through the use of meta-characters, the regular expression can be used for more complicated searches such as ‘abc.*xyz’. That is, find the string ‘abc’, followed by the string ‘xyz’, with an unlimited number of characters in between ‘abc’ and ‘xyz’. Another example is the regular expression ‘abc..abc.*xyz’; that is, find the string ‘abc’, followed two characters later by the string ‘abc’ and an unlimited number of characters later by the string ‘xyz’.
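The two example expressions can be tried with any regular expression engine. The following is a minimal sketch using Python's standard `re` module, which is an assumption made for illustration and is not named in the text above. The second pattern is written with two `.` meta-characters, following the description "two characters later", and the helper name `matches` is hypothetical. The `re.DOTALL` flag is used so that `.` also matches newline characters, as would be expected when scanning arbitrary packet content.

```python
import re

# 'abc', then any number of characters (including none), then 'xyz'.
UNBOUNDED_GAP = re.compile(r"abc.*xyz", re.DOTALL)

# 'abc', exactly two arbitrary characters, 'abc' again,
# then any number of characters, then 'xyz'.
FIXED_THEN_UNBOUNDED = re.compile(r"abc..abc.*xyz", re.DOTALL)


def matches(pattern, text):
    """Return True if the pattern occurs anywhere in text (hypothetical helper)."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    print(matches(UNBOUNDED_GAP, "abc123xyz"))
    print(matches(FIXED_THEN_UNBOUNDED, "abc12abc---xyz"))
    print(matches(FIXED_THEN_UNBOUNDED, "abc1abcxyz"))
```

In this sketch the `.*` operator accepts a gap of any length, including zero, whereas each bare `.` consumes exactly one character.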
An Intrusion Detection System (IDS) application inspects the contents of all individual packets flowing through a network, and identifies suspicious patterns that may indicate an attempt to break into or compromise a system. One example of a suspicious pattern may be a particular text string in a packet followed 100 characters later by another particular text string.
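A fixed-distance pattern of this kind can be written as a regular expression with a bounded repetition. The sketch below is illustrative only: the byte strings `USER` and `PASS`, the function name `build_signature` and the use of Python's `re` module are assumptions and do not come from any real IDS rule set. It also reads "100 characters later" as exactly 100 arbitrary characters between the two strings.

```python
import re


def build_signature(first, second, gap=100):
    """Compile a pattern: `first`, exactly `gap` arbitrary bytes, then `second`.

    `first` and `second` are hypothetical literal byte strings; re.escape
    ensures they are matched literally rather than as meta-characters.
    """
    middle = b".{%d}" % gap
    return re.compile(re.escape(first) + middle + re.escape(second), re.DOTALL)


def is_suspicious(signature, payload):
    """Return True if the signature occurs anywhere in the packet payload."""
    return signature.search(payload) is not None


if __name__ == "__main__":
    sig = build_signature(b"USER", b"PASS")
    payload = b"USER" + b"x" * 100 + b"PASS"
    print(is_suspicious(sig, payload))
```

Matching is done on bytes rather than text because packet payloads are not guaranteed to be valid character data.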
Some IDS applications generate many false positives, that is, the applications report an attack when there is none. Others miss attacks because simple pattern matching of signatures is often insufficient and the application cannot handle the amount of data to be analyzed.
Content searching is typically performed using a search algorithm based on a Deterministic Finite Automaton (DFA) to process the regular expression. The DFA processes an input stream of characters sequentially using a DFA graph and makes a state transition based on the current character and the current state. The greater the number of wildcard characters in the regular expression, the larger and more unmanageable the DFA graph becomes.
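The sequential, one-transition-per-character behavior of a DFA can be illustrated with a small table-driven matcher. The following is a minimal sketch for a literal string only (no wildcards), with hypothetical function names `build_dfa` and `dfa_contains`; it is not the graph construction used by any particular product. State k means that the most recent k input characters match the first k characters of the pattern, and the final state is accepting.

```python
def build_dfa(pattern):
    """Build a transition table for finding `pattern` in a character stream.

    table[state][ch] gives the next state. Characters that do not occur in
    the pattern are absent from the table and reset the DFA to state 0.
    """
    alphabet = set(pattern)
    table = []
    for state in range(len(pattern) + 1):
        row = {}
        for ch in alphabet:
            if state == len(pattern):
                row[ch] = state  # accepting state is absorbing
                continue
            seen = pattern[:state] + ch
            # Next state: length of the longest prefix of the pattern
            # that is also a suffix of the characters seen so far.
            k = min(len(seen), len(pattern))
            while k > 0 and not seen.endswith(pattern[:k]):
                k -= 1
            row[ch] = k
        table.append(row)
    return table


def dfa_contains(pattern, text):
    """Scan `text` once, making one state transition per character."""
    table = build_dfa(pattern)
    accept = len(pattern)
    state = 0
    for ch in text:
        state = table[state].get(ch, 0)
        if state == accept:
            return True
    return False


if __name__ == "__main__":
    print(dfa_contains("cat", "concatenate"))
    print(dfa_contains("cat", "cta"))
```

For a literal pattern the table has one state per pattern character plus one. Expressions with wildcards generally need additional states to track every partial match that may still be in progress, which is the growth in graph size noted above.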