This invention relates to an approach to efficient and predictable-time string search and matching, and in particular to the use of such an approach for gate functionality, such as in a firewall device of a data network.
Fast string search and matching is critical for many security applications, in particular those with “gate functionality” such as firewalls, access control applications, and load balancers. In the description below, references to “firewalls” should be understood to be generally applicable to a wide range of gate functionality devices (also referred to below as “gate devices”), and not limited by the use of the word “firewall.” Fast matching is essential to impose and enforce access policies without creating bottlenecks. Firewalls in particular protect networks by monitoring the traffic crossing the network perimeter. The number of packet matching rules a firewall can effectively handle is limited by the matching time and space complexity of the algorithms employed. In principle, the more specific the rules can be made, the more fine-grained the policies that can be enforced to enhance security. In practice, rule bases of 10,000 rules are considered large for firewalls. Network administrators therefore attempt to be acceptably specific while limiting the number of rules through the use of wildcards and overlapping rules, which realize intersection and difference operations on sets of IP addresses or port numbers.
A firewall essentially acts as an access filter between the inside and the outside of almost any organization's networking systems. The filtering should be as specific as possible to maintain a high level of security, and should provide high bandwidth to avoid becoming a bottleneck that leads to unusable and unstable network service. In many implementations, each data packet arriving at a firewall is inspected, and various fields are extracted and checked against a set of predefined patterns. For example, the source IP address of a packet may be compared to a list of known addresses, each of which may be associated with a particular security policy. The list of known values for a field for which policies are defined can be represented as a chain lookup structure, in which the key of each chain is the known pattern (e.g., an IP address), and the value of the chain represents the policy associated with that pattern.
In general, the lookup can involve chains or sequences of objects (such as elements of one or more alphabets, symbols, numbers, words, images, etc.) that are pairwise linked in a key-value scheme: in each pair a key is linked to a value, and a sequence of these pairs forms a chain.
Security policies are not always static; rules are inserted or deleted from time to time. Adding or deleting a rule generally corresponds to modifying a chain lookup structure of patterns by adding or deleting a pattern.
Efficient implementation of a firewall can be enabled by an implementation of a chain search that provides efficient and predictable-time lookup of a (key, value) pair, and preferably one in which a (key, value) pair can be inserted into or deleted from the chain lookup structure, and in which the value associated with a given key can be looked up.
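The three operations named above (insert, delete, and lookup of a (key, value) pair) can be illustrated with a minimal sketch. The class name `PolicyTable`, its method names, and the example addresses and policies are hypothetical and chosen only for illustration. A Python dictionary is used purely as a stand-in for the chain lookup structure: it offers constant time on average but no guaranteed worst-case bound, so it shows the interface only, not the predictable-time behavior sought here.

```python
from typing import Dict, Optional


class PolicyTable:
    """Hypothetical stand-in for a chain lookup structure.

    Maps a known pattern (the key, e.g. an IP address) to the policy
    associated with it (the value).
    """

    def __init__(self) -> None:
        self._entries: Dict[str, str] = {}

    def insert(self, pattern: str, policy: str) -> None:
        """Add a (pattern, policy) pair, replacing any existing policy."""
        self._entries[pattern] = policy

    def delete(self, pattern: str) -> bool:
        """Remove a pattern; return True if it was present."""
        return self._entries.pop(pattern, None) is not None

    def lookup(self, pattern: str) -> Optional[str]:
        """Return the policy for a pattern, or None if it is unknown."""
        return self._entries.get(pattern)


table = PolicyTable()
table.insert("192.0.2.10", "allow")
table.insert("198.51.100.7", "deny")
table.delete("198.51.100.7")  # the rule is withdrawn again
```

After these calls, looking up `192.0.2.10` yields its policy, while the deleted address is no longer known to the table.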
Access control applications such as firewalls are central to Internet and network security. As an access filter between internal networks and outside networks such as the Internet, a firewall should apply selection criteria that are as specific as possible while providing high bandwidth. In practice, both are difficult to achieve at the same time. In most modern firewalls the rule base is a list, and the number of active rules N is carefully observed and limited because the search time increases linearly with N. If N becomes too large, the bandwidth drops and the firewall becomes a bottleneck, leading to slow and unstable network service. The firewall's vulnerability to denial of service attacks also increases as N becomes large. The O(N) matching time complexity therefore effectively sets a limit on the size of firewall rule bases. The N-dependence of the space complexity has to be considered as well and is subject to optimization efforts, but the more stringent limit is set by the time complexity.
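The linear growth of the search time with N can be made concrete with a small sketch that counts comparisons during a list scan. The rule layout (a source address paired with a policy) and the helper names `linear_match` and `make_rules` are assumptions made for illustration; real rule bases match on several fields at once.

```python
from typing import List, Optional, Tuple

# Hypothetical rule layout: (source address pattern, policy).
Rule = Tuple[str, str]


def linear_match(rules: List[Rule], src: str) -> Tuple[Optional[str], int]:
    """Scan the rule list in order.

    Returns the policy of the first matching rule (or None) together
    with the number of comparisons that were needed.
    """
    comparisons = 0
    for pattern, policy in rules:
        comparisons += 1
        if pattern == src:
            return policy, comparisons
    return None, comparisons


def make_rules(n: int) -> List[Rule]:
    """Build n distinct illustrative rules."""
    return [(f"10.0.{i // 256}.{i % 256}", "allow") for i in range(n)]


for n in (100, 1_000, 10_000):
    # An address that matches no rule forces a scan of the whole list.
    _, cost = linear_match(make_rules(n), "203.0.113.1")
    print(n, cost)
```

For a packet that matches no rule, the number of comparisons equals N, so a tenfold larger rule base costs ten times as many comparisons per such packet.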
The matching time is typically reduced by sorting the rules in the firewall rule base so that the most frequently used rules are listed first. Further, most state-of-the-art firewalls apply stateful matching, which ensures that once a first packet has been allowed, all further packets belonging to the same flow are permitted to cross as well. Usually the state lookup algorithm is significantly faster than the packet matching algorithm. Stateful firewalls are therefore particularly well suited for long packet flows, but are less efficient when confronted with a large number of short flows, as in SYN-flood denial of service attacks.
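The difference between long flows and many short flows can be sketched as follows. The class `StatefulFilter`, the five-field flow key, and the example rule function `allow_web` are hypothetical; a real stateful firewall also tracks connection state and timeouts, which this simplified sketch omits.

```python
from typing import Callable, Set, Tuple

# Hypothetical flow key: (src ip, src port, dst ip, dst port, protocol).
FlowKey = Tuple[str, int, str, int, str]


class StatefulFilter:
    """Consult the rule base only for the first packet of each flow."""

    def __init__(self, match_rules: Callable[[FlowKey], bool]) -> None:
        self._match_rules = match_rules          # slow path: rule-base matching
        self._allowed_flows: Set[FlowKey] = set()  # fast path: state table
        self.rule_base_lookups = 0

    def permit(self, flow: FlowKey) -> bool:
        if flow in self._allowed_flows:
            return True  # flow already allowed; no rule matching needed
        self.rule_base_lookups += 1
        if self._match_rules(flow):
            self._allowed_flows.add(flow)
            return True
        return False


def allow_web(flow: FlowKey) -> bool:
    """Illustrative rule base: allow traffic to destination port 443."""
    return flow[3] == 443


# One long flow of 1000 packets.
long_flow = StatefulFilter(allow_web)
for _ in range(1000):
    long_flow.permit(("192.0.2.10", 50000, "198.51.100.7", 443, "tcp"))

# 1000 short flows of one packet each, as in a flood of new connections.
many_flows = StatefulFilter(allow_web)
for offset in range(1000):
    many_flows.permit(("192.0.2.10", 10000 + offset, "198.51.100.7", 443, "tcp"))

print(long_flow.rule_base_lookups, many_flows.rule_base_lookups)
```

In this sketch the long flow triggers a single rule-base lookup, whereas each short flow triggers its own, so the state table gives no benefit in the second case.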
Firewall rules can overlap, i.e., more than one rule can match a packet. Most modern firewalls apply “first match” semantics, i.e., the policy associated with the first matching rule in the list is applied to a packet. To reduce the number of firewall rules while maintaining an acceptable level of specificity, this can be exploited by using wildcards and overlapping rules to realize intersection and difference operations on sets of IP addresses or port numbers.
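As an illustration of first-match semantics with overlapping rules, the sketch below uses address prefixes as wildcards. The rule list `RULES`, the function `first_match`, and the specific prefixes and policies are hypothetical. The first two rules overlap, and their order realizes a set difference: every address in 10.1.0.0/16 is allowed except those in 10.1.2.0/24.

```python
import ipaddress
from typing import List, Optional, Tuple

# Hypothetical rule layout: (source prefix acting as a wildcard, policy).
Rule = Tuple[ipaddress.IPv4Network, str]

RULES: List[Rule] = [
    (ipaddress.ip_network("10.1.2.0/24"), "deny"),   # more specific, listed first
    (ipaddress.ip_network("10.1.0.0/16"), "allow"),  # overlaps the rule above
    (ipaddress.ip_network("0.0.0.0/0"), "deny"),     # catch-all default
]


def first_match(rules: List[Rule], src: str) -> Optional[str]:
    """Return the policy of the first rule whose prefix contains src."""
    addr = ipaddress.ip_address(src)
    for network, policy in rules:
        if addr in network:
            return policy
    return None


print(first_match(RULES, "10.1.2.5"))   # matches rules 1, 2 and 3; rule 1 applies
print(first_match(RULES, "10.1.9.9"))   # matches rules 2 and 3; rule 2 applies
print(first_match(RULES, "192.0.2.1"))  # matches only the catch-all rule
```

Expressing the same policy without overlap would require splitting the /16 prefix into several non-overlapping prefixes that avoid the /24, which is why overlapping rules keep the rule count down.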