Regular expression search operations are employed in various applications including, for example, intrusion detection systems (IDS), virus protection, policy-based routing functions, internet and text search operations, document comparisons, and so on. A regular expression can simply be a word, a phrase, or a string of characters. For example, a regular expression including the string “gauss” would match data containing gauss, gaussian, degauss, etc. More complex regular expressions include metacharacters that provide certain rules for performing the match. Some common metacharacters are the wildcard “.”, the alternation symbol “|”, and the character class symbol “[ ]”. Regular expressions can also include quantifiers such as “*” to match 0 or more times, “+” to match 1 or more times, “?” to match 0 or 1 times, “{n}” to match exactly n times, “{n,}” to match at least n times, and “{n,m}” to match at least n times but no more than m times. For example, the regular expression “a.{2}b” will match any input string that includes the character “a” followed by exactly 2 instances of any character followed by the character “b” including, for example, the input strings “abbb,” “adgb,” “a7yb,” “aaab,” and so on.
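By way of illustration only, the matching behavior described above can be sketched using Python's standard `re` module. The non-matching inputs (“gas,” “ab,” and “axb”) and the variable names are assumptions added for contrast and do not appear in the description above.

```python
import re

# A plain string used as a regular expression matches anywhere in the input.
gauss = re.compile(r"gauss")
words = ["gauss", "gaussian", "degauss", "gas"]  # "gas" added for contrast
gauss_hits = [w for w in words if gauss.search(w)]
print(gauss_hits)  # ['gauss', 'gaussian', 'degauss']

# "a.{2}b": an "a", then exactly 2 instances of any character, then a "b".
pattern = re.compile(r"a.{2}b")
inputs = ["abbb", "adgb", "a7yb", "aaab", "ab", "axb"]  # last two added for contrast
quantifier_hits = [s for s in inputs if pattern.search(s)]
print(quantifier_hits)  # ['abbb', 'adgb', 'a7yb', 'aaab']
```

The inputs “ab” and “axb” do not match because fewer than two characters separate the “a” from the “b”.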
While regular expressions are helpful in determining whether an input string matches a pattern, it can be difficult, or even impossible, to use regular expressions to identify input strings that do not match certain patterns. For example, access control lists (ACLs) are classification filters that enable network administrators to control the processing functions applied to incoming packets in packet-switched networks (e.g., to permit or deny application of a given feature to an incoming packet). Typically, an ACL is embodied by a number of regular expressions that can be stored in a search engine. During processing of each packet in a data stream, a search key is constructed either from selected fields within the packet header (e.g., source address, destination address, source port, destination port, protocol, etc.) or from the packet payload (e.g., for deep content inspection operations), and is then compared with the regular expressions stored in the search engine to determine what action is to be taken. More specifically, if the search key matches a policy statement (also referred to as an access control entry (ACE)) stored in the search engine, then the action corresponding to the matching entry is taken. Thus, because conventional search engines can only search for matching patterns, conventional search engines deployed in packet classification systems typically store a separate statement or entry for every combination of desired packet header field values associated with a particular action, which in turn consumes significant storage area. Accordingly, it would be desirable to reduce the amount of storage area required to implement search operations using regular expressions (e.g., for packet filtering and classification operations).
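The storage cost described above can be illustrated with a minimal sketch, offered under stated assumptions rather than as an actual search engine implementation. The “|” field separator, the `Ace`, `build_search_key`, and `classify` names, and the four-entry protocol list are all hypothetical. The sketch builds a search key from packet header fields and compares it against stored entries that can only express positive matches, so permitting every protocol except “tcp” requires one entry per permitted protocol.

```python
import re
from typing import List, NamedTuple, Optional


class Ace(NamedTuple):
    """Hypothetical access control entry: a pattern and its associated action."""
    pattern: re.Pattern
    action: str


def build_search_key(src: str, dst: str, sport: int, dport: int, proto: str) -> str:
    # Assumed key layout: header fields joined by "|" in a fixed order.
    return f"{src}|{dst}|{sport}|{dport}|{proto}"


def classify(key: str, acl: List[Ace]) -> Optional[str]:
    # Return the action of the first entry whose pattern matches the whole key.
    for ace in acl:
        if ace.pattern.fullmatch(key):
            return ace.action
    return None


# Assumed universe of protocol values, for illustration only.
PROTOCOLS = ["tcp", "udp", "icmp", "gre"]

# A match-only engine cannot store "anything but tcp" as a single entry,
# so one entry is stored for each permitted protocol value.
acl = [
    Ace(re.compile(rf"[^|]*\|[^|]*\|[^|]*\|[^|]*\|{re.escape(p)}"), "permit")
    for p in PROTOCOLS
    if p != "tcp"
]

udp_key = build_search_key("10.0.0.1", "10.0.0.2", 1234, 53, "udp")
tcp_key = build_search_key("10.0.0.1", "10.0.0.2", 1234, 80, "tcp")
print(len(acl))                # 3
print(classify(udp_key, acl))  # permit
print(classify(tcp_key, acl))  # None
```

In this sketch the number of stored entries grows with the number of permitted values in each field, which is the storage growth noted above.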
Like reference numerals refer to corresponding parts throughout the drawing figures.