Digital data communication networks (i.e., packet-switched networks) are ubiquitous, and continue to increase in size and speed. For a variety of reasons, including load balancing, security, and the like, deep packet inspection (DPI) is necessary. DPI involves searching not only packet headers, but also packet payloads, for known data patterns (e.g., “fingerprints” or signatures of known malware, such as viruses). Due to the increasing speed of network communications, and the need to inspect a large portion of, if not all, data packets, software-based DPI is not efficient enough to satisfy the bandwidth requirements. Furthermore, due to the variety and complexity of DPI-targeted data, conventional alphanumeric string comparison is insufficient.
Regular expressions (regex), popularized in UNIX utilities (e.g., ed, grep) and scripting languages (e.g., AWK, Perl), provide a powerful, compact, and very flexible means to match strings of text, including particular characters, words, or patterns of characters. For example, a regex engine would match the regular expression “log” to all of: log, bologna, logarithm, and analog. Regular expressions may include logical operators (e.g., OR), wildcards, repetition specifiers, and the like. The syntax of regular expressions is well known, and documented in numerous texts in the computing arts. See, e.g., Hopcroft, et al., Introduction to Automata Theory, Languages, and Computation, Addison-Wesley; Michael Sipser, Introduction to the Theory of Computation, Chapter 1: Regular Languages, PWS Publishing (ISBN 0-534-94728-X); Tony Stubblebine, Regular Expression Pocket Reference (2003), O'Reilly (ISBN 0-596-00415-X); Goyvaerts, et al., Regular Expressions Cookbook (2009), O'Reilly (ISBN 978-0596520687).
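By way of illustration only, the “log” example above may be reproduced in software with a minimal sketch such as the following, which uses the standard Python re module. The word list, the extra non-matching word “lag”, and the second pattern combining an OR operator with a repetition specifier are assumptions chosen for illustration; they do not appear in the referenced texts and do not represent any particular hardware implementation.

```python
import re

# The pattern "log" matches any string that contains the substring "log".
pattern = re.compile(r"log")

# Illustrative inputs: the four words from the text, plus one non-match.
words = ["log", "bologna", "logarithm", "analog", "lag"]
matched = [w for w in words if pattern.search(w)]

# A pattern with a repetition specifier (+) and a logical OR (|):
# either "l", one or more "o", then "g"; or the literal "lag".
alt_pattern = re.compile(r"lo+g|lag")
alt_matched = [w for w in ["looog", "lag", "lg"] if alt_pattern.search(w)]

print(matched)
print(alt_matched)
```

Note that re.search reports a match anywhere within the input, which corresponds to scanning a packet payload for a signature at an arbitrary offset rather than comparing whole strings.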
A hardware architecture for a regex engine that is able to perform DPI at wire speeds, for multiple expressions in parallel, while making efficient use of resources such as memory bandwidth, remains a challenge in modern digital data communication networks.
The Background section of this document is provided to place embodiments of the present invention in technological and operational context, to assist those of skill in the art in understanding their scope and utility. Unless explicitly identified as such, no statement herein is admitted to be prior art merely by its inclusion in the Background section.