Intrusion detection systems (IDS) monitor events within a network or computer system with the objective of detecting unwanted manipulations, or intrusions, of computer systems. Intrusions are defined by the National Institute of Standards and Technology, in its Special Publication on IDSs, as “attempts to compromise the confidentiality, integrity, availability, or to bypass the security mechanisms of a computer or network.” An intrusion detected by an IDS may manifest itself as, for example, a denial of service, an unauthorized login, a user performing tasks that he or she is not authorized to perform (e.g., accessing secure files or creating new accounts), or the execution of malware such as viruses and worms.
Intrusion detection is the process of monitoring the events occurring in a computing system or network and analyzing them for signs of possible incidents, which are violations or imminent threats of violation of computer security policies, acceptable use policies, or standard security practices. Although many incidents are malicious in nature, many others are not; for example, a person might mistype the address of a computer and accidentally attempt to connect to a different system without authorization.
An IDS typically takes the form of software or hardware products that automate the intrusion detection process. An IDS accomplishes its objective by analyzing data gathered from the network, host computer, or application that is being monitored. The analysis usually takes one of two forms—misuse (or signature) detection and anomaly detection. In misuse detection, the IDS maintains a database of signatures (patterns of events) that correspond to known attacks and searches the gathered data for these signatures. In anomaly detection the IDS maintains statistics that describe normal usage and checks for deviations from these statistics in the monitored data. While misuse detection usually has a low rate of false positives, it is able to detect only known attacks. Anomaly detection usually has a higher rate of false positives (because users keep changing their usage pattern thereby invalidating the stored statistics) but is able to detect new attacks never seen before.
Several types of IDSs are available commercially, for example, network, host, application, protocol, and hybrid IDSs. Network intrusion detection systems (NIDS) examine network traffic (both inbound and outbound packets) looking for traffic patterns that indicate attempts to break into a target computer, port scans, denial of service attacks, and other malicious behavior. Host intrusion detection systems (HIDS) monitor the activity within a computing system looking for activity that violates the computing system's internal security policy (e.g., a program attempting to access an unauthorized resource). Application intrusion detection systems (AIDS) monitor the activity of a specific application, while protocol intrusion detection systems (PIDS) ensure that specific protocols such as HTTP behave as they should. Each type of IDS has its own capabilities and limitations, and attempts have been made to put together hybrid IDSs that combine the capabilities of the described base IDSs.
The development of high-speed intrusion detection systems and components has been the focus of significant recent research. Although there are many components in a NIDS that should be optimized to achieve line-rate processing, the string matching component, which is one of the most time consuming components, has been the focus of much of the prior work on NIDS optimization. String matching requires the examination of the network traffic to determine all matches with the strings in the string database. Although pre-filtering reduces the effective workload on the NIDS, there remains a need for powerful and compact data structures for string matching.
Bro, led by Vern Paxson, and Snort, led by Martin Roesch, are two of the more popular public-domain NIDSs that incorporate pre-filtering. Both are software solutions to intrusion detection. In addition, both maintain a database of signatures (or rules) that include a string as a component. These intrusion detection systems examine the payload of each packet that is matched by a rule and report all occurrences of the string associated with that rule. It is estimated that about 70% of the time it takes Snort, for example, to process packets is spent in its string matching code and that this code accounts for about 80% of the instructions executed (see Antonatos et al., “Generating realistic workloads for network intrusion detection systems,” ACM Workshop on Software and Performance, 2004). Consequently, much research has been done recently to improve the efficiency of string matching.
The current implementation of Snort uses an optimized version of the Aho-Corasick automaton provided by A. Aho and M. Corasick in “Efficient string matching: An aid to bibliographic search,” CACM, 18, 6, 1975, 333-340, which is hereby incorporated by reference in its entirety. Snort also uses SFK search, which is the algorithm used for low memory situations, and the Wu-Manber multi-string search algorithm, which is described in “Agrep—a fast algorithm for multi-pattern searching,” Technical Report, Department of Computer Science, University of Arizona (1994) by S. Wu and U. Manber.
The memory required to store the optimized Aho-Corasick and Wu-Manber data structures can be excessive. To reduce the memory requirement of the Aho-Corasick automaton, Tuck et al., in “Deterministic memory efficient string matching algorithms for intrusion detection,” INFOCOM (2004), have proposed starting with the unoptimized Aho-Corasick automaton and using bitmaps and path compression. With these compression methods, Tuck et al. found that the memory required by the compressed unoptimized Aho-Corasick automaton becomes about 1/50 to 1/30 of that required by the optimized automaton and the Wu-Manber structure and is slightly less than that required by SFK search. However, a search requires a large number of additions to be performed at each node and so requires hardware support for efficient implementation. String matching using a purely software implementation of the bitmap and path-compressed Aho-Corasick automaton takes about 10% to 20% more time, on average, than when an optimized Aho-Corasick automaton is used. Hardware and hardware assisted solutions also have been proposed involving the use of TCAMs (ternary content addressable memories) and/or FPGAs (field programmable gate arrays).
The Aho-Corasick automaton for multi-string matching is widely used in IDSs. The method of Aho-Corasick involves constructing a state machine for pattern matching and then using the pattern matching state machine to process a text string in a single pass. There are two versions of this automaton: unoptimized and optimized. While both versions are finite state machines, the unoptimized version has a failure pointer for each state, while in the optimized version no state has a failure pointer. In both versions, each state has success pointers and each success pointer has a label, which is a character from the string alphabet, associated with it. Also, each state has a list of strings/rules (from the string database) that are matched when that state is reached by following a success pointer. This is the list of matched rules. In the unoptimized version, the search starts with the automaton start state designated as the current state and the first character in the text string, S, that is being searched designated as the current character. At each step, a state transition is made by examining the current character of S. If the current state has a success pointer labeled by the current character, a transition to the state pointed at by this success pointer is made and the next character of S becomes the current character. When there is no corresponding success pointer, a transition to the state pointed at by the failure pointer is made and the current character is not changed. Whenever a state is reached by following a success pointer, the rules in the list of matched rules for the reached state are output along with the position in S of the current character. This output is sufficient to identify all occurrences, in S, of all database strings. Aho and Corasick have shown in their paper entitled “Efficient string matching: An aid to bibliographic search” that when their unoptimized automaton is used, the number of state transitions is at most 2n, where n is the length of S.
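The unoptimized search described above can be illustrated with a minimal Python sketch. It is not Snort's implementation: the function names, the dictionary-based success pointers, and the example string set are assumptions chosen for illustration. Following the convention of the original Aho-Corasick paper, the start state simply consumes a character for which it has no success pointer, and each state's list of matched rules also includes the rules of the state its failure pointer leads to.

```python
from collections import deque

def build_unoptimized(strings):
    """Build success pointers, failure pointers, and matched-rule lists."""
    success = [{}]   # success[state][char] -> next state
    rules = [[]]     # rules[state] -> strings matched on reaching state
    for s in strings:
        state = 0
        for ch in s:
            if ch not in success[state]:
                success.append({})
                rules.append([])
                success[state][ch] = len(success) - 1
            state = success[state][ch]
        rules[state].append(s)

    failure = [0] * len(success)
    queue = deque(success[0].values())   # depth-1 states fail to the start state
    while queue:                         # breadth-first, so shallower states are done first
        state = queue.popleft()
        for ch, child in success[state].items():
            f = failure[state]
            while f and ch not in success[f]:
                f = failure[f]
            failure[child] = success[f].get(ch, 0)
            # A match ending at the failure state also ends at this state.
            rules[child] = rules[child] + rules[failure[child]]
            queue.append(child)
    return success, failure, rules

def search(text, success, failure, rules):
    """Return ((position, string) matches, number of state transitions)."""
    matches, transitions = [], 0
    state, i = 0, 0
    while i < len(text):
        ch = text[i]
        transitions += 1
        if ch in success[state]:
            state = success[state][ch]           # success transition, advance in S
            matches.extend((i, s) for s in rules[state])
            i += 1
        elif state == 0:
            i += 1                               # start state consumes the character
        else:
            state = failure[state]               # failure transition, same character
    return matches, transitions

# Hypothetical string database and text, used only for illustration.
success, failure, rules = build_unoptimized(["he", "she", "his", "hers"])
matches, transitions = search("ushers", success, failure, rules)
print(matches, transitions)
```

Each reported position is the 0-based index in S of the last character of the match. On this input the search makes one failure transition, so the transition count stays within the 2n bound.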
In the optimized version, each state has a success pointer for every character in the alphabet and so, there is no failure pointer. Aho and Corasick show how to compute the success pointer for pairs of states and characters for which there is no success pointer in the unoptimized automaton thereby transforming an unoptimized automaton into an optimized one. The number of state transitions made by an optimized automaton when searching for matches in a string of length n is n.
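As a hedged illustration of this transformation, the Python sketch below builds the trie of success pointers and then fills in a transition for every (state, character) pair in breadth-first order, borrowing missing transitions from the state that the failure pointer would have led to. The names `build_optimized` and `search_optimized`, the example strings, and the requirement that every character of the text belong to the given alphabet are assumptions made for this example.

```python
from collections import deque

def build_optimized(strings, alphabet):
    """Return (delta, rules): a full transition table with no failure pointers."""
    # Step 1: trie of success pointers, as in the unoptimized automaton.
    goto = [{}]
    rules = [[]]
    for s in strings:
        state = 0
        for ch in s:
            if ch not in goto[state]:
                goto.append({})
                rules.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        rules[state].append(s)

    # Step 2: give every state a success pointer for every character.
    failure = [0] * len(goto)
    delta = [dict() for _ in goto]
    queue = deque()
    for ch in alphabet:
        nxt = goto[0].get(ch, 0)      # the start state loops on unmatched characters
        delta[0][ch] = nxt
        if nxt:
            queue.append(nxt)
    while queue:
        state = queue.popleft()
        # The failure state is shallower, so its row and rules are already final.
        rules[state] = rules[state] + rules[failure[state]]
        for ch in alphabet:
            if ch in goto[state]:
                child = goto[state][ch]
                failure[child] = delta[failure[state]][ch]
                delta[state][ch] = child
                queue.append(child)
            else:
                delta[state][ch] = delta[failure[state]][ch]
    return delta, rules

def search_optimized(text, delta, rules):
    """Exactly one state transition per character of the text."""
    matches, state = [], 0
    for i, ch in enumerate(text):
        state = delta[state][ch]
        matches.extend((i, s) for s in rules[state])
    return matches

# Hypothetical string set over a 3-letter alphabet, for illustration only.
delta, rules = build_optimized(["abc", "bc", "ca"], "abc")
matches = search_optimized("abca", delta, rules)
print(matches)
```

In the resulting table, the state reached by "ca" is entered from several different states, which shows that the success pointers of an optimized automaton no longer form a trie.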
FIG. 1 shows an example string set drawn from the 3-letter alphabet {a, b, c}. FIG. 2 shows its unoptimized Aho-Corasick automaton, and FIG. 3 shows its optimized Aho-Corasick automaton. For this example, the string alphabet is assumed to consist of these three letters only.
When the failure pointers are removed from an unoptimized Aho-Corasick automaton, the resulting structure is a trie rooted at the automaton start node. However, an optimized automaton has the structure of a graph that may not be a trie. This difference in the structure defined by the success pointers has an impact on the ability to compress unoptimized automata versus optimized automata.
Tuck et al. provide a method to compress the unoptimized automaton. To understand their method, an example is provided assuming that the alphabet size is 256 (e.g., ASCII characters). Although the development generalizes readily to any alphabet size, it is convenient to do the development using a fixed and realistic alphabet size. A natural way to store the Aho-Corasick automaton, for a given database D of strings, in a computer is to represent each state of the unoptimized automaton by a node that has the following fields:
1. Success[0:255], where Success[i] gives the state to transition to when the ASCII code for the current character is i (Success[i] is null in case there is no success pointer for the current state when the current character is i).
2. RuleList: a list of rules that are matched when this state is reached via a success pointer.
3. Failure: the transition to make when there is no success transition, for the current character, from the current state.
For this example, assume that each pointer requires 4 bytes. So, each node requires 1024 bytes for the Success array and 4 bytes for the failure pointer. In keeping with Tuck et al., when accounting for the memory required for RuleList, it can be assumed that only a 4-byte pointer to this list is stored in the node and the memory required by the list itself can be ignored. Hence, the size of a state node for an unoptimized automaton is 1032 bytes. In the optimized version, the Failure field is omitted and the memory required by a node is 1028 bytes. While each node of the optimized automaton requires 4 bytes less than required by each node of the unoptimized automaton, there is little opportunity to compress an optimized node as each of its 256 success pointers is non-null and the automaton does not have a tree structure. However, many of the success pointers in the nodes of an unoptimized automaton are null and the structure defined by the success pointers is a trie. Therefore, there is an opportunity to compress these nodes. Following up on this observation, Tuck et al. proposed two transformations, bitmap compression and path compression, to compress the nodes in an unoptimized automaton:
1. Bitmap Compression. In its simplest form, bitmap compression replaces each 1032-byte node of an unoptimized automaton with a 44-byte node. Of these 44 bytes, 8 are used for the failure and rule list pointers. Another 32 bytes are used to maintain a 256-bit bitmap with the property that bit i of this map is 1 if and only if Success[i]≠null. The nodes corresponding to the non-null success pointers are stored in contiguous memory, and a 4-byte pointer (firstChild) to the first of these is stored in the 44-byte node. To make a state transition when the ASCII code for the current character is i, it is first determined whether Success[i] is null by examining bit i of the map. If this bit is 0, the failure pointer is used. If this bit is 1, the number of bits (popcount or rank) in bitmap positions less than i that are 1 is determined, and then using this count, the size of a node (44 bytes), and the value of the first child pointer, the location of the node to transition to is determined. Since determining the popcount involves examining up to 255 bits, this operation is quite expensive (at least in software). To reduce the cost of determining the popcount, Tuck et al. propose the use of summaries that give the popcount for the first 32*j bits of the bitmap, 1≤j<8. Using these summaries, the popcount for any i may be determined by adding together a summary popcount and up to 31 bit values. Each summary needs to be 8 bits long (the maximum value is 255) and 7 summaries are needed. The size of a bitmap compressed node with summaries is, therefore, 51 bytes. FIG. 4 shows a bitmap node. As shown in FIG. 4, the size of a bitmap node becomes 52 bytes when the node type and failure pointer offset fields that are needed to support path compression are included.
2. Path Compression. Path compression is similar to end-node optimization (see Eatherton et al., “Tree bitmap: hardware/software IP lookups with incremental updates,” Computer Communication Review, 34(2): 97-122, 2004 and W. Lu and S. Sahni, “Succinct representation of static packet classifiers,” IEEE Symposium on Computers and Communications, 2007). An end-node sequence is a sequence of states at the bottom of the automaton (the start state is at the top of the automaton) in which each state has a single non-null success transition (except the last state in the sequence, which has no non-null success transition). States in the same end-node sequence are packed together into one or more path compressed nodes. The number of these states that may be packed into a compressed node is limited by the capacity of a path compressed node. So, for example, if there is an end-node sequence s1, s2, . . . , s6 and if the capacity of a path compressed node is 4 states, then s1, . . . , s4 are packed into one node (for example A) and s5 and s6 into another (for example B). For each si packed into a path compressed node in this way, the 1-byte character for the transition plus the failure and rule list pointers for si need to be stored. Since several automaton states are packed into a single compressed node, a 4-byte failure pointer that points to a compressed node is not sufficient. In addition, an offset value is needed that indicates which state within the compressed node is to be transitioned to. Using 3 bits for the offset, nodes with capacity c≤8 can be handled. Note that now, ⌈3c/8⌉ bytes are needed for the offsets. Hence, a path compressed node whose capacity is c≤8 needs 9c+⌈3c/8⌉ bytes for the state information. Another 4 bytes are needed for a pointer to the next node (if any) in the sequence of path compressed nodes (i.e., a pointer from A to B).
An additional byte is required to identify the node type (bitmap or path compressed) and the size (number of states packed into this compressed node). So, the size of a compressed node is 9c+⌈3c/8⌉+5 bytes. Accordingly, the node type bit and an offset for the failure pointer are now required in the bitmap nodes as well. Accounting for these fields, the size of a bitmap node becomes 52 bytes. Since a compressed node may be a sibling (states/nodes reachable by following a single success pointer from any given state/node are siblings) of a bitmap node, the sizes of both bitmap and path compressed nodes need to be kept the same so that the jth child of a bitmap node can be easily accessed by performing arithmetic on the first child pointer. This requirement limits the capacity to c=5, which gives a path compressed node size of 52 bytes. FIG. 5 shows a path compressed node.
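The bitmap lookup with summaries and the node-size arithmetic above can be checked with a short Python sketch. This is an illustration under stated assumptions rather than the layout used by Tuck et al.: the bitmap is held in a Python integer with bit i at position i, and the helper names (`make_bitmap_node`, `child_index`, `child_address`, `path_compressed_size`) are hypothetical.

```python
def make_bitmap_node(success_codes):
    """Return (bitmap, summaries) for a node whose non-null Success[i] are given."""
    bitmap = 0
    for i in success_codes:
        bitmap |= 1 << i                       # bit i is 1 iff Success[i] is non-null
    # summaries[j-1] = number of 1 bits among the first 32*j bits, 1 <= j < 8
    summaries = [bin(bitmap & ((1 << (32 * j)) - 1)).count("1") for j in range(1, 8)]
    return bitmap, summaries

def child_index(bitmap, summaries, i):
    """Rank of Success[i] among the non-null pointers, or None if it is null."""
    if not (bitmap >> i) & 1:
        return None                            # caller follows the failure pointer
    block = i // 32
    count = summaries[block - 1] if block > 0 else 0
    for pos in range(32 * block, i):           # at most 31 individual bits
        count += (bitmap >> pos) & 1
    return count

def child_address(first_child, index, node_size=52):
    """Children are contiguous, so the target is found by pointer arithmetic."""
    return first_child + index * node_size

def path_compressed_size(c):
    """9c bytes of state data, ceil(3c/8) offset bytes, 4-byte next pointer, 1 type byte."""
    return 9 * c + (3 * c + 7) // 8 + 5

# Failure pointer, rule list pointer, 256-bit bitmap, firstChild pointer, 7 summaries.
BITMAP_NODE_WITH_SUMMARIES = 4 + 4 + 32 + 4 + 7
# One more byte for the node type and failure pointer offset fields.
BITMAP_NODE_FINAL = BITMAP_NODE_WITH_SUMMARIES + 1

# Hypothetical node with four non-null success pointers.
bitmap, summaries = make_bitmap_node([ord("a"), ord("e"), ord("z"), 200])
print(child_index(bitmap, summaries, ord("z")), path_compressed_size(5))
```

With these formulas, c=5 is the largest capacity whose path compressed node fits in the 52 bytes of a bitmap node, since c=6 would need 62 bytes.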
On the 1533-string Snort database of 2003, the memory required by the bitmapped-path compressed automaton using one level of summaries is about 1/50 of that required by the optimized automaton, about 1/27 of that required by the Wu-Manber data structure, and about 10% less than that required by the SFK search data structure. However, the average search time, using a software implementation, is increased by between 10% and 20% relative to that for the optimized automaton and by between 30% and 100% relative to the Wu-Manber algorithm, and is about the same as for SFK search. According to Tuck et al., the real payoff from the Aho-Corasick automaton comes with respect to worst-case search time. The worst-case search time using the Aho-Corasick automaton is between ¼ and ⅓ of that obtained when the Wu-Manber or SFK search algorithms are used. The worst-case search time for the bitmapped-path compressed unoptimized automaton is between 50% and 100% more than for the optimized automaton.
Accordingly, there continues to be a need in the art for improvements to the storage and search cost of NIDS string matching using the Aho-Corasick automaton.