In computerized systems and networks, the control of data and packet flow between users in the network and the outside world often provides a first line of defense against malicious nodes, as well as a measure of administrative control. Gateways, firewalls, proxies, network address translators (NATs), and similar devices all allow a network administrator to institute policies and rules that, at least to some extent, control the flow of data and packets through them. The rules that reflect the administrative policy are typically associated with a packet's source address, destination address, port information, and the like, and can be designated as applying to groups of addresses so that a communication policy can be applied to all nodes within a subnet.
Once the administrator of the policy has built the set of rules or regular expressions to be applied to the packet flow through the network device, these rules are typically placed in a database for retrieval and use by the device. As each packet is processed through the gateway, firewall, or similar device, the device must check the database to find every rule that applies to that packet. Such rules may simply relate to the proper routing of that packet, or may be restrictive in nature, disallowing packet flow between certain addresses or from certain nodes. Once the device has found the applicable rules, it applies them to decide what to do with the packet.
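The per-packet lookup described above can be illustrated with a minimal sketch. It assumes a naive linear scan over an in-memory rule list, which is only a baseline for discussion and not the method of any particular device. The names Rule, matching_rules, decide, and RULES, the sample addresses, the first-match ordering, and the default "deny" action are all hypothetical and are not taken from the text.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address, IPv4Network
from typing import List


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule record: source group, destination group, action."""
    src: IPv4Network   # group of source addresses (e.g. a subnet)
    dst: IPv4Network   # group of destination addresses
    action: str        # "allow" or "deny" (assumed vocabulary)


def matching_rules(rules: List[Rule], src: str, dst: str) -> List[Rule]:
    """Return every rule whose source and destination groups contain the packet.

    This is a linear scan: each packet is compared against each rule, so the
    cost per packet grows in proportion to the number of rules.
    """
    s, d = IPv4Address(src), IPv4Address(dst)
    return [r for r in rules if s in r.src and d in r.dst]


def decide(rules: List[Rule], src: str, dst: str, default: str = "deny") -> str:
    """Apply the first matching rule; fall back to an assumed default action."""
    hits = matching_rules(rules, src, dst)
    return hits[0].action if hits else default


# Hypothetical rule set, for illustration only.
RULES = [
    Rule(IPv4Network("10.0.0.0/8"), IPv4Network("0.0.0.0/0"), "allow"),
    Rule(IPv4Network("0.0.0.0/0"), IPv4Network("192.168.1.0/24"), "deny"),
]

if __name__ == "__main__":
    print(decide(RULES, "10.1.2.3", "8.8.8.8"))
    print(decide(RULES, "172.16.0.1", "192.168.1.5"))
```

Because every packet is tested against every rule, the work per packet in this sketch scales with the size of the rule set.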
As the size of the network grows and the throughput of information increases, the processing time required to locate and apply these rules may cause the gateway device to delay the flow of information. This lag in processing time may become unacceptably long, especially as the number of rules and the volume of communication traffic increase. In the firewall example, a major component of this processing lag is the search time associated with finding which rules apply to particular IP addresses. While the current IP address system (the dotted quad) provides for 2^32 unique IP addresses, the search delay component associated with IP addresses will only increase as the IPv6 standard substantially increases this number. If a rule is to take both the source address and the destination address into account, then the number of possible combinations rises to 2^64.
Therefore, there exists a need in the art for a method of efficiently searching a data set of fixed-length words to find a set of matching regular expressions.