Deep content inspection of network packets is driven, in large part, by the need for high performance quality-of-service (QoS) and signature-based security systems. Typically QoS systems are configured to implement intelligent management and deliver content-based services which, in turn, involve high-speed inspection of packet payloads. Likewise, signature-based security services, such as intrusion detection, virus scanning, content identification, network surveillance, spam filtering, etc., involve high-speed pattern matching on network data.
The signature databases used by these services are updated on a regular basis, such as when new viruses are found, or when operating system vulnerabilities are detected. This means that the device performing the pattern matching must be programmable.
As network speeds increase, QoS and signature-based security services find it increasingly challenging to keep up with the demands of matching packet content. When they cannot keep up, these services are forced to miss packets, sacrificing content delivery or network security. Currently, fast programmable pattern matching machines are implemented using finite state machines (FSMs).
FIGS. 1A and 1B respectively show state transition diagram 100 and state transition table 110 of a finite state machine (FSM) adapted to perform the following regular expression:
.*[1–9][0–9]*@[1–9][0–9]*(.|-)COM.*  (1)
For purposes of simplicity, it is assumed that only the sixteen symbols used in expression (1) are defined. It is understood that expression (1) matches a string containing any of the digits 1–9, followed by zero or more of the digits 0–9, followed by the “@” symbol, followed by any of the digits 1–9, followed by zero or more of the digits 0–9, followed by either a single period (.) or hyphen (-), followed by the letters “COM”. Examples of strings that match the expression are shown below:
12345@6789-COM
COM10@89.COM
Examples of strings that do not match the expression are shown below:
1234567890@0.COM
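The matching behavior described above can be checked with a short script. This is a minimal sketch, not part of the described hardware: it assumes Python's `re` module, writes the character ranges with ASCII hyphens, and escapes the period because the text describes it as a literal period. The name `matches` is hypothetical, and the sample strings follow the examples given above.

```python
import re

# Expression (1), written for Python's re module.
# Assumption: the "." inside the group is a literal period, so it is escaped.
EXPRESSION_1 = re.compile(r".*[1-9][0-9]*@[1-9][0-9]*(\.|-)COM.*")


def matches(text: str) -> bool:
    """Return True if the whole string matches expression (1)."""
    return EXPRESSION_1.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["12345@6789-COM", "COM10@89.COM", "1234567890@0.COM"]:
        print(sample, matches(sample))
```

The last sample fails because the only “@” symbol is followed by the digit 0, while the expression requires one of the digits 1–9 at that position.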
Many of the state transitions, particularly those that transition back to the start state, are omitted from state transition diagram 100 for simplicity. State transition diagram 100 represents a deterministic finite state automaton (DFA). Table 110 lists the current states along the rows and the current input symbols along the columns. Each entry in table 110 defines the state to which a transition is made given the combination of current state and current input symbol.
There are two types of FSMs. In a Moore FSM, shown in FIG. 2A, the input symbol and current state are received by a logic block 200 which is configured to generate the next state; this next state is saved in a register 210. Register 210 is clocked every time a new input symbol arrives. The output symbol is generated by an output logic block. The following pseudo-code shows that the output of a Moore FSM is determined by the current state of the FSM:
MOORE_OUTPUT = OUTPUT_TABLE[CURRENT_STATE]
In a Mealy FSM, shown in FIG. 2B, the input symbol and current state are received by logic block 250 which is configured to generate the next state. The current state together with the received input symbol defines the output symbol. The following pseudo-code shows that the output of a Mealy FSM is determined by the current state of the FSM together with the received input symbol:
MEALY_OUTPUT = OUTPUT_TABLE[CURRENT_STATE][INPUT_SYMBOL]
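The difference between the two output rules can be seen by running both on the same input. The following is a minimal sketch using a hypothetical two-state machine over the symbols 0 and 1 whose state simply remembers the previous symbol; the tables and the function name `run` are illustrative assumptions and are not taken from FIGS. 2A and 2B.

```python
# Hypothetical two-state machine: the state remembers the previous symbol.
TRANSITION_TABLE = [[0, 1], [0, 1]]    # indexed [current_state][input_symbol]
MOORE_OUTPUT_TABLE = [0, 1]            # indexed [current_state]
MEALY_OUTPUT_TABLE = [[0, 1], [1, 0]]  # indexed [current_state][input_symbol]


def run(symbols):
    """Return the Moore and Mealy output sequences for the same input."""
    state = 0
    moore, mealy = [], []
    for symbol in symbols:
        # Moore: the output depends on the current state only.
        moore.append(MOORE_OUTPUT_TABLE[state])
        # Mealy: the output depends on the current state and the input symbol.
        mealy.append(MEALY_OUTPUT_TABLE[state][symbol])
        state = TRANSITION_TABLE[state][symbol]
    return moore, mealy


if __name__ == "__main__":
    print(run([0, 1, 1, 0]))
```

In this sketch the Moore output repeats the previous symbol, while the Mealy output is 1 whenever the input differs from the previous symbol, so the Mealy machine reacts to a change in the same step in which it arrives.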
FIG. 3 is a simplified high-level block diagram of a conventional programmable Moore FSM 350. The transition table for FSM 350 is stored in a transition table memory 300 and is indexed by the current state and input symbol. This memory is clocked after the receipt of each new input symbol. The output is read from an output look-up table memory 310 indexed by the current state. FSM implementation 350 is flexible in that it is programmable and can implement state transitions at relatively high throughput. However, as the number of states, input symbols, and transitions becomes large, the amount of memory needed to store this data grows exponentially. For an n-bit state vector and k-bit symbol, FSM 350 requires 2^(n+k) memory locations for transition table 300 and 2^n memory locations for output look-up table 310.
As is known, the process of mapping a regular expression, such as expression (1) shown above, or a signature database, to an FSM involves compiling the expression into a non-deterministic finite-state automaton (NFA), and then converting the NFA to a deterministic finite-state automaton (DFA).
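The NFA-to-DFA step is commonly done with the subset construction, in which each DFA state stands for a set of NFA states. The sketch below assumes an NFA without epsilon moves, encoded as a dictionary from (state, symbol) to a set of next states; this encoding, the function name `nfa_to_dfa`, and the small example NFA for ".*ab" are illustrative assumptions only.

```python
from collections import deque


def nfa_to_dfa(nfa, start, accepting, alphabet):
    """Subset construction for an NFA without epsilon moves.

    nfa maps (state, symbol) -> set of next NFA states.
    Each DFA state is a frozenset of NFA states.
    Returns (dfa_transitions, dfa_start, dfa_accepting).
    """
    dfa_start = frozenset([start])
    transitions = {}
    seen = {dfa_start}
    queue = deque([dfa_start])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            # Union of all NFA moves from any member of the current set.
            target = frozenset(
                nxt for state in current for nxt in nfa.get((state, symbol), ())
            )
            transitions[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A DFA state accepts if it contains at least one accepting NFA state.
    dfa_accepting = {s for s in seen if s & accepting}
    return transitions, dfa_start, dfa_accepting


# Example NFA for ".*ab" over the alphabet {a, b}; state 2 is accepting.
EXAMPLE_NFA = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}

if __name__ == "__main__":
    table, dfa_start, dfa_accepting = nfa_to_dfa(EXAMPLE_NFA, 0, {2}, "ab")
    print(len({state for state, _ in table}), "DFA states")
```

For this small NFA the construction reaches three DFA states, but in general the number of reachable subsets can grow exponentially with the number of NFA states.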
In addition to pattern matching through regular expressions, FSMs also have applications in protocol specification, implementation, and validation (for example, for TCP/IP); in expert systems and machine learning, where knowledge is expressed as decision trees or stored in directed graphs; in formal modeling; and in image processing.
An FSM typically starts in a given initial state, usually state zero. On receipt of each input symbol, the FSM advances to a new state determined by the current state together with the input symbol. This operation is referred to as calculating the “next state” or “transition function” of the finite state machine. The calculation of the next state is often performed through a table lookup. The table (see FIG. 1B), known as the “transition table”, is arranged so that the row number is determined by the current state and the column number by the current input symbol. Each entry in the transition table contains the value of the next state for the current state, as defined by the row, and the input symbol, as defined by the column. The transition table is commonly stored in a RAM lookup table, as shown in FIG. 3. Data symbols received from a digital network are usually encoded as 8-bit bytes, and the number of states is determined by the complexity of the given application. The following pseudo-code illustrates the FSM operation:
CURRENT_STATE = 0
for each INPUT_SYMBOL,
  NEXT_STATE = TRANSITION_TABLE[CURRENT_STATE][INPUT_SYMBOL]
  CURRENT_STATE = NEXT_STATE
next INPUT_SYMBOL
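The table-lookup loop above can be sketched as runnable code. The example below is a simplified illustration rather than the FSM of FIGS. 1A and 1B: it assumes 8-bit input symbols and a hypothetical four-state table that detects the letters “COM” anywhere in a byte stream. The names `build_com_table` and `run_fsm` are illustrative.

```python
NUM_SYMBOLS = 256  # 8-bit input symbols, one column per byte value


def build_com_table():
    """Build a 4-state transition table that detects b"COM" in a stream."""
    # Rows are states 0..3; by default every symbol returns to start state 0.
    table = [[0] * NUM_SYMBOLS for _ in range(4)]
    for state in range(3):
        table[state][ord("C")] = 1   # a 'C' always starts a new candidate
    table[1][ord("O")] = 2
    table[2][ord("M")] = 3
    table[3] = [3] * NUM_SYMBOLS     # state 3 is absorbing: match already seen
    return table


def run_fsm(table, data: bytes) -> int:
    """One table lookup per input symbol, as in the pseudo-code above."""
    current_state = 0
    for input_symbol in data:
        current_state = table[current_state][input_symbol]
    return current_state


if __name__ == "__main__":
    table = build_com_table()
    print(run_fsm(table, b"10@89.COM") == 3)
```

Even this four-state machine stores 4 × 256 table entries, most of which hold the same default value, which hints at why full transition tables become costly as the number of states grows.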
Programmable FSMs are often expensive because of the size of the memory required to store the transition table. This problem is even more pronounced for fast FSMs, which are required to compute the next state within a small, fixed number of clock cycles. For example, the state machine implementation shown in FIG. 3, having an n-bit state vector and k-bit symbols, requires 2^(n+k) entries of n-bit words, or 2^(n+k)×n bits, for storing the full transition table. Additional memory is required for the output look-up table. As an example of the timing constraint, for an application servicing 1 Gbps network traffic with 8-bit input symbols, the FSM is required to compute the next state every 8 ns.
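The memory and timing figures above follow from simple arithmetic, which the sketch below reproduces. It reads the entry count as 2 raised to the power n+k, and the choice of n = 10 in the usage example is purely an illustrative assumption; the function names are hypothetical.

```python
def transition_table_bits(n: int, k: int) -> int:
    """Bits needed for a full table: 2^(n+k) entries of n bits each."""
    return (2 ** (n + k)) * n


def output_table_entries(n: int) -> int:
    """Entries in a Moore output look-up table indexed by an n-bit state."""
    return 2 ** n


def symbol_period_ns(line_rate_bps: float, k: int) -> float:
    """Time available per k-bit symbol at the given line rate, in ns."""
    return k * 1e9 / line_rate_bps


if __name__ == "__main__":
    # Illustrative sizes: 10-bit state vector, 8-bit symbols.
    bits = transition_table_bits(10, 8)
    print(bits, "bits =", bits // 8 // 1024, "KiB")
    print(symbol_period_ns(1e9, 8), "ns per symbol at 1 Gbps")
```

Each additional state bit doubles the number of entries and also widens every entry, so the table more than doubles in size.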
U.S. Pat. No. 6,167,047 describes a technique in which memory optimization is achieved through usage of stack memory allowing the state machine to repeat common sub-expressions while calculating the next state within a single clock cycle. This technique uses a large memory, and therefore limits the complexity of the FSM. This technique also suffers from the problem that the stack memory is limited.
U.S. Patent application No. 2003/0051043 describes a technique for performing regular expression pattern matching using parallel execution of real-time deterministic finite state automata (RDFA). The technique involves processing data at higher speeds by combining a number of bytes into an aggregate symbol. This technique involves creation of n-closures which increase the size of the FSM as the number of potential transitions per state increases exponentially.
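The general effect of combining several symbols into one aggregate symbol can be illustrated on an ordinary transition table. The sketch below is not the RDFA technique of the cited application; it is only an assumed, simplified illustration in which two symbols are consumed per lookup, and the function name `aggregate_pairs` is hypothetical.

```python
def aggregate_pairs(table, num_symbols):
    """Build a table that consumes two symbols per lookup.

    The pair (a, b) is encoded as the aggregate symbol a * num_symbols + b,
    so each row grows from num_symbols to num_symbols ** 2 columns.
    """
    return [
        [
            table[table[state][a]][b]      # apply symbol a, then symbol b
            for a in range(num_symbols)
            for b in range(num_symbols)
        ]
        for state in range(len(table))
    ]


if __name__ == "__main__":
    # Hypothetical 2-state, 2-symbol table: the state remembers the last symbol.
    single = [[0, 1], [0, 1]]
    double = aggregate_pairs(single, 2)
    print(len(single[0]), "columns ->", len(double[0]), "columns")
```

With 8-bit symbols the same step would widen each row from 256 to 65,536 columns. The intermediate state reached after the first symbol of each pair is also no longer visible, so a match ending in the middle of an aggregate symbol needs separate handling.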