Signature-based intrusion detection systems (IDSs) or intrusion prevention systems (IPSs), such as Snort®, Bro, Cisco Security Appliance, Citrix® Application Firewall, and the like, protect a network by examining headers and content of all packets entering or leaving the network. Such systems raise alerts and/or drop packets upon detecting suspicious headers or payloads. In general, a suspicious packet is detected by matching the packet against a database of rules, where each rule represents a particular signature/pattern of a security exploit.
To represent security exploits as accurately and precisely as possible, the IDS/IPS rule syntax should be sufficiently powerful. Otherwise, a large number of good packets may be incorrectly marked as harmful, or harmful packets may go undetected. Moreover, the packet processing rate should keep up with high line speeds without dropping packets or allowing bad packets through. These two goals often conflict because packet processing time grows with the expressiveness and complexity of the rule language. One of the common IDS/IPS rule syntaxes is regular expressions. However, unless the rules are written with care and the underlying pattern matching is implemented carefully, processing of a packet may take a long time to complete. The resulting performance vulnerability may be exploited by an attacker to mount a low-bandwidth denial-of-service (DoS) attack on the IDS/IPS itself. Such a vulnerability can typically be traced back to the backtracking-based pattern matching of regular expressions.
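The backtracking behavior described above can be illustrated with a minimal sketch. The toy matcher below is an assumption made purely for illustration and is not the matching engine of any particular IDS/IPS; the tuple-based pattern representation and the names `_match`, `backtrack_fullmatch`, and `AMBIGUOUS` are hypothetical. It counts one step per visited pattern node, so the cost of a failed match can be observed without relying on wall-clock time.

```python
# Illustrative sketch only: a toy backtracking matcher with a step counter.
# Patterns are nested tuples (a hypothetical representation):
#   ("lit", c), ("cat", p, q), ("alt", p, q), ("star", p)

def _match(node, text, i, k, counter):
    """Try to match `node` at text[i:]; on success call continuation k(j)."""
    counter[0] += 1  # one step per visited pattern node
    kind = node[0]
    if kind == "lit":
        return i < len(text) and text[i] == node[1] and k(i + 1)
    if kind == "cat":
        return _match(node[1], text, i,
                      lambda j: _match(node[2], text, j, k, counter), counter)
    if kind == "alt":
        # Try the left branch; on failure, backtrack and try the right branch.
        return (_match(node[1], text, i, k, counter)
                or _match(node[2], text, i, k, counter))
    if kind == "star":
        # Greedy: try one more iteration (which must consume input), then
        # backtrack to zero further iterations.
        return (_match(node[1], text, i,
                       lambda j: j > i and _match(node, text, j, k, counter),
                       counter)
                or k(i))
    raise ValueError(f"unknown node kind: {kind!r}")


def backtrack_fullmatch(pattern, text):
    """Return (matched, steps) for an anchored match of the whole text."""
    counter = [0]
    matched = _match(pattern, text, 0, lambda j: j == len(text), counter)
    return bool(matched), counter[0]


# (a|a)*b : the two identical alternatives give two ways to consume each 'a'.
AMBIGUOUS = ("cat", ("star", ("alt", ("lit", "a"), ("lit", "a"))), ("lit", "b"))

if __name__ == "__main__":
    for n in (4, 8, 12):
        print(n, backtrack_fullmatch(AMBIGUOUS, "a" * n))
```

For a payload consisting only of the character a, the match fails, and the step count of this sketch roughly doubles with each additional character. A short payload can therefore consume a disproportionate amount of processing time, which is the essence of the low-bandwidth attack.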
When only regular expressions define the rules, this vulnerability may be avoided, for example, by using a deterministic matching process. Increasingly, however, the rules are written using an extension of regular expressions with back-references, known as the full regex syntax, to which known deterministic matching processes cannot be applied. Backtracking-based pattern matching is generally the only option available for matching regular expressions with back-references (backref-regexes). Accordingly, a backref-regex that embeds a regular expression causing the backtracking algorithm to exhibit exponential behavior may introduce a serious performance vulnerability. Further, unrestricted use of regular expressions, and of regular expressions with back-references in particular, makes it challenging to predict worst-case performance and, therefore, to guard in advance against performance attacks.
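The contrast between deterministic matching and back-references can be sketched as follows. This is an illustrative assumption, not a description of any particular IDS/IPS: the hand-built table `DFA` and the helper names `dfa_match` and `backref_match` are hypothetical, and Python's standard `re` module is used only because it supports the `\1` back-reference syntax.

```python
import re

# Hand-built DFA for the regular language a*b.
# State 0: start state, loops on 'a'; state 1: accepting, reached on 'b'.
DFA = {(0, "a"): 0, (0, "b"): 1}
ACCEPTING = {1}


def dfa_match(text):
    """Return (matched, steps); one table lookup per character consumed."""
    state, steps = 0, 0
    for ch in text:
        steps += 1
        state = DFA.get((state, ch))
        if state is None:  # no transition defined: reject immediately
            return False, steps
    return state in ACCEPTING, steps


# A back-reference: \1 must repeat exactly what group 1 captured, so the
# pattern accepts strings of the form a^n b a^n (n >= 1).
BACKREF = re.compile(r"(a+)b\1")


def backref_match(text):
    """Return True if the whole text has the form a^n b a^n."""
    return BACKREF.fullmatch(text) is not None
```

In this sketch, `dfa_match` never performs more steps than there are input characters, regardless of whether the match succeeds. The back-reference pattern, by contrast, describes a language that is not regular, because the number of trailing a characters must equal the number captured earlier. No finite automaton can recognize it, which is why a table-driven approach of this kind does not extend to backref-regexes.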
To minimize the performance vulnerability, various guidelines have been offered, including avoiding back-references and/or backtracking, using a memory-efficient deterministic algorithm for regular expressions, using only well-tested regular expressions, avoiding known patterns that incur exponential behavior, and limiting the time and memory requirements of the matching phase. However, such guidelines are often inapplicable in the context of IDSs/IPSs. In particular, as new security exploits appear, IDS/IPS patterns are continually added and updated, primarily by network managers or security professionals, who are not cognizant of the underlying pattern matching processes. Further, some security exploits, e.g., buffer overflow attacks, are most accurately and precisely expressed with a complex syntax such as full regex. Limiting time or memory may result in failure to detect bad packets or in dropping of harmless packets. Therefore, limiting the IDS/IPS rule syntax, enforcing time or memory restrictions, and/or relying on the prudence of the rule writers are not appropriate for IDS/IPS applications.