1. Field of the Invention
The present invention relates to the field of packet scanning and more particularly, to a method and a device for distributing patterns to scanning engines for scanning packets in a packet stream.
2. Description of the Related Art
Packet scanning, also known as packet content scanning, is an important part of network security and application monitoring. Packets in a stream are matched against a set of patterns to detect security threats or to gain information about the packet stream. Due to their flexibility, regular expressions are a common way to define such patterns. Finite automata are typically used to implement regular expression scanning or parsing.
In contrast to NFA (Non-Deterministic Finite Automata), DFA (Deterministic Finite Automata) only require one state transition per input value. This yields higher scanning or parsing rates and a smaller parse state which has to be maintained per flow. Therefore, DFA are preferred for Network Intrusion Detection Systems (NIDS) although they usually require more memory than NFA.
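By way of illustration only, the difference in per-input work may be sketched as follows. The pattern ("ab" occurring anywhere in the input), the state numbering, and the function names are hypothetical and chosen solely for this sketch. The NFA scan has to carry a set of active states, whereas the DFA scan carries a single state and performs one table lookup per input value.

```python
# Toy automata (assumed for illustration) for the pattern "ab" anywhere in the input.
# State 2 is the accepting state in both automata.

def nfa_step(active, c):
    """Advance every active NFA state; the result is again a set of states."""
    nxt = set()
    for s in active:
        if s == 0:
            nxt.add(0)              # state 0 loops on every input value
            if c == "a":
                nxt.add(1)
        elif s == 1 and c == "b":
            nxt.add(2)
    return nxt

# DFA transition table; any input value not listed leads back to state 0.
DFA = {0: {"a": 1}, 1: {"a": 1, "b": 2}, 2: {"a": 1}}

def dfa_step(state, c):
    """Exactly one state transition per input value."""
    return DFA[state].get(c, 0)

def scan_nfa(data):
    active, matches = {0}, []
    for i, c in enumerate(data):
        active = nfa_step(active, c)
        if 2 in active:
            matches.append(i)       # index of the last character of a match
    return matches

def scan_dfa(data):
    state, matches = 0, []
    for i, c in enumerate(data):
        state = dfa_step(state, c)
        if state == 2:
            matches.append(i)
    return matches
```

Both scans report the same match positions; the parse state to be kept per flow is a set of states for the NFA and a single integer for the DFA.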
Regarding NIDS, the frequency of network attacks increases every year and the methods of attack are becoming more sophisticated, so NIDS must keep up with these trends. An example of an NIDS is known from “SNORT network intrusion detection system”, http://www.snort.org, referenced as [1]. Such NIDS apply very powerful and flexible content-filtering rules defined using regular expressions. This has triggered a substantial amount of research and product development in the area of hardware-based accelerators for pattern matching, as this seems to be the only viable approach for scanning network data against the increasingly complex regular expressions at wire-speed processing rates of tens of gigabits per second.
Moreover, in typical network environments, the number of open sessions at any given time can be on the order of millions, and the streams are scanned in an interleaved fashion. Therefore, the internal state of the scanning engine needs to be stored and reloaded whenever the input stream is switched.
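A minimal sketch of such storing and reloading is given below, assuming a DFA-based engine whose internal state is a single integer. The transition table, the flow identifiers, and the names flow_state and process are hypothetical and serve only to illustrate the principle.

```python
# Assumed toy DFA for the pattern "ab"; unlisted input values lead back to state 0.
DFA = {0: {"a": 1}, 1: {"a": 1, "b": 2}, 2: {"a": 1}}
ACCEPT = 2

def scan_packet(state, payload):
    """Scan one packet starting from a given state; return the end state and hit count."""
    hits = 0
    for c in payload:
        state = DFA[state].get(c, 0)
        if state == ACCEPT:
            hits += 1
    return state, hits

flow_state = {}  # flow identifier -> saved scanning state (one entry per open session)

def process(flow_id, payload):
    state = flow_state.get(flow_id, 0)        # reload the state of this flow
    state, hits = scan_packet(state, payload)
    flow_state[flow_id] = state               # store it before switching streams
    return hits

# Packets of two flows arrive interleaved; the pattern in flow "f1" spans two packets.
packets = [("f1", "xxa"), ("f2", "b"), ("f1", "bxx")]
results = [process(flow_id, payload) for flow_id, payload in packets]
```

In this sketch the match in flow "f1" is found only because the state reached after its first packet is restored before its second packet is scanned; the intervening packet of flow "f2" does not disturb it.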
To reach higher throughput for complex sets of expressions, a compact representation of the data structures describing the automata is required, so that they can be kept in fast on-chip memories.
If the data structures become too large, so that an off-chip memory needs to be used, the higher latency of such memories limits the rate at which the input stream can be processed. In this regard, S. Kumar, B. Chandrasekaran, J. Turner, and G. Varghese, “Curing regular expressions matching algorithms from insomnia, amnesia, and acalculia”, in ANCS '07, pp. 155-164, ACM, 2007, referenced as [2], show that the size of the data structures can grow exponentially if certain regular expressions are combined into one scanning engine.
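The growth may be illustrated by a toy example, which is not taken from [2]. It assumes k hypothetical patterns of the form ".*x.*y", each over its own pair of letters and each requiring only three DFA states on its own. The sketch counts the reachable states of the combined (product) automaton by breadth-first search.

```python
from collections import deque
from string import ascii_lowercase

def make_dfa(first, second):
    """3-state DFA for '.*first.*second': 0 = nothing seen, 1 = first seen, 2 = matched."""
    def step(state, c):
        if state == 0 and c == first:
            return 1
        if state == 1 and c == second:
            return 2
        return state                 # all other input values leave the state unchanged
    return step

def combined_state_count(k):
    """Count reachable states when k such patterns share one engine (k <= 13)."""
    letters = ascii_lowercase[:2 * k]
    dfas = [make_dfa(letters[2 * i], letters[2 * i + 1]) for i in range(k)]
    start = (0,) * k                 # combined state = one component per pattern
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for c in letters:
            nxt = tuple(step(s, c) for step, s in zip(dfas, state))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen)
```

Kept in separate engines, the k patterns of this example need 3k states in total, whereas the combined automaton has to track the progress of every pattern simultaneously and the count grows as a power of three in k.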
Accordingly, an embodiment of the present invention provides a memory-efficient distribution of patterns to scanning engines.