In data networks, routers classify data packets to determine the micro-flows that the packets belong to and then apply the classification to the packets accordingly. Flow identification is the essential first step for providing any flow dependent service. A number of network services require packet classification including access-control, firewalls, policy-based routing, provision of integrated/differentiated qualities of service, traffic billing, and secure tunnelling. In each application, the classifier determines which micro-flow an arriving packet belongs to so as to determine whether to forward or filter, where to forward it to, what class of service it should receive, the scheduling tag/state/parameter that it is associated with, or how much should be charged for transporting it. The classifier maintains a set of rules about packet headers for flow classification.
To clarify, a router is a multi-port network device that can receive and transmit data packets from/to each port simultaneously. Data packets typically have a regular format with a uniform header structure. The header structure usually contains data fields such as address or packet type. When a packet is received from a port, the router uses the header information to determine whether the packet is discarded, logged, or forwarded. If a packet is forwarded, the router also determines which output port the packet will be sent to. The router also keeps count of the number of packets of each type passing through. The forwarding decision (where to send the packet) is typically made based on the destination address carried in the packet. In an Internet Protocol router, forwarding involves a lookup process called the Longest Prefix Match (LPM), which is a special case of the general mask-matching process.
The LPM uses a route table that maps a prefix rule (a mask-matching rule with all the wildcard bits located at the contiguous least significant bits) to an output port ID. An example of an LPM route table is given below:
#   32-bit Prefix                       Output Port ID
1   xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx    port 0
2   111100101100xxxxxxxxxxxxxxxxxxxx    port 1
3   110100110001xxxxxxxxxxxxxxxxxxxx    port 3
4   11110010110011000011xxxxxxxxxxxx    port 2
5   00100000000111111111000011010000    port 4
where x is a wild card bit.
An input packet with destination address “1111 0010 1100 1100 0011 1111 1111 1111” should be forwarded to port 2. It matches entries #1, #2, and #4, but #4 has priority because its prefix length (number of non-wildcard bits) is the longest of the three.
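The lookup in this example can be sketched in a few lines of Python. This is a minimal illustration only, assuming prefixes are held as ternary strings and scanned linearly; the names ROUTES, prefix_len, matches, and lpm_lookup are hypothetical and do not describe how a real router stores its route table.

```python
# Route table from the example: (ternary prefix, output port).
ROUTES = [
    ("x" * 32, "port 0"),
    ("111100101100" + "x" * 20, "port 1"),
    ("110100110001" + "x" * 20, "port 3"),
    ("11110010110011000011" + "x" * 12, "port 2"),
    ("00100000000111111111000011010000", "port 4"),
]

def prefix_len(rule):
    """Number of non-wildcard bits in a prefix rule."""
    return len(rule) - rule.count("x")

def matches(rule, key):
    """True if every non-wildcard rule bit equals the corresponding key bit."""
    return all(r in ("x", k) for r, k in zip(rule, key))

def lpm_lookup(routes, key):
    """Return the port of the matching entry with the longest prefix, or None."""
    hits = [(prefix_len(rule), port) for rule, port in routes if matches(rule, key)]
    if not hits:
        return None
    return max(hits, key=lambda hit: hit[0])[1]

# Destination address from the example, with the spaces removed.
address = "1111 0010 1100 1100 0011 1111 1111 1111".replace(" ", "")
print(lpm_lookup(ROUTES, address))
```

The linear scan is chosen for clarity; it does not reflect the lookup speed a hardware or trie-based implementation would achieve.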
A router or firewall will also examine the input packets to determine whether they should be discarded and logged. This is usually done with an Access Control List (ACL) lookup. An ACL can be a user-configurable mask-matching rule set (based on packet header fields) that categorizes certain types of traffic that may be hazardous to the network. Hence, when a packet that matches an ACL entry is received, the router/firewall should take action according to the ACL to discard and log the packet or alert the network administrator.
Such devices as explained above use general multi-layer classification methods in carrying out the device's function. General multi-layer classification requires the examination of arbitrary data/control fields in multiple protocol layers. The prior art grammatical/lexical parser provides flexible solutions to this problem, but the cost of supporting a large rule set is high.
A multi-field classifier is a simple form of classifier that relies on a number of fixed fields in a packet header. A classic example is the 7-dimensional classification, which examines the SA/DA/TOS/Protocol in the IP header, and the SPORT/DPORT/PROTOCOL_FLAG in the TCP/UDP header. Because a multi-field classifier deals with fixed fields, parsing is not required. Instead of dealing with variable length packets, the multi-field classifier does classification on fixed-size search keys. The search key is a data structure of the extracted packet data fields. The multi-field classifier assumes the search keys are extracted from the packet before being presented to the classifier.
The problem of multiple field classification can be transformed into the problem of condition matching in multi-dimensional search key space, where each dimension represents one of the data fields the classifier needs to examine. A classification rule specifies conditions to be matched in all dimensions.
The classification rules specify value requirements on several fixed common data fields. Previous study shows that a majority of existing applications require up to 8 fields to be specified: source/destination Network-layer address (32 bits each for IPv4), source/destination Transport-layer port numbers (16 bits each for TCP and UDP), Type-of-service (TOS) field (8 bits), Protocol field (8 bits), and Transport-layer protocol flags (8 bits), for a total of 120 bits. The number of fields and total width of the fields may increase for future applications.
Rules can be represented in a number of ways including exact number match, prefix match, range match, and wildcard match. Wildcard match was chosen as the only method of rule representation, which does not sacrifice generality because any other form of matching can be translated into one or more wildcard match rules. A wildcard match rule is defined as a ternary string, where each bit can take one of three possible values: ‘1’, ‘0’, or ‘x’. A bit of ‘1’ or ‘0’ in the rule requires the search key bit in the corresponding position to have exactly the same value, and an ‘x’ bit in the rule can match either ‘0’ or ‘1’ in the search key.
An example of a rule specification on a 16-bit field is given below:
The classifier wants to match: 1111 0000 xx1x 0xx1
The mask is:                   1111 1111 0010 1001
The target value is:           1111 0000 0010 0001
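The relationship between the ternary rule, the mask, and the target value can be checked with a short Python sketch. The function names compile_rule and rule_matches are hypothetical; the only assumption is that a search key matches when its masked bits equal the target value.

```python
def compile_rule(rule):
    """Convert a ternary string such as '1111 0000 xx1x 0xx1' to (mask, target)."""
    bits = rule.replace(" ", "")
    # Mask bit is 1 where the rule cares about the key bit, 0 where it is 'x'.
    mask = int("".join("0" if b == "x" else "1" for b in bits), 2)
    # Target holds the required values; wildcard positions are stored as 0.
    target = int("".join("1" if b == "1" else "0" for b in bits), 2)
    return mask, target

def rule_matches(key, mask, target):
    """A key matches when its cared-about bits equal the target value."""
    return (key & mask) == target

mask, target = compile_rule("1111 0000 xx1x 0xx1")
print(format(mask, "016b"), format(target, "016b"))
print(rule_matches(0b1111000011110111, mask, target))  # wildcard bits set to 1
print(rule_matches(0b1111000000000001, mask, target))  # required '1' bit is 0
```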
Prefix match rules can be represented in wildcard rules naturally by contiguous ‘x’ bits in the rules. However, the don't-care bits in a general wildcard rule do not have to be contiguous. Ranges or multiple disjoint point values may be defined by using multiple masked matching rules. For example, an 8-bit range such as 16 to 27 must be broken into two masked matching rules, ‘00010xxx’ and ‘000110xx’. Even with this limitation, the masked matching form is still considered to be an efficient representation, because most of the ranges in use can be broken down into a small number of mask rules. A compiler can handle the task of breaking down a user rule specification written in a convenient syntax, so the complexity can be hidden from the user.
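One way such a compiler step might break a range into masked matching rules is sketched below. This is an assumed, commonly used greedy decomposition into aligned power-of-two blocks, not a method taken from the text; the name range_to_rules is hypothetical. For instance, it splits the 8-bit range 16 to 27 into ‘00010xxx’ and ‘000110xx’.

```python
def range_to_rules(lo, hi, width):
    """Split the inclusive integer range [lo, hi] into ternary prefix rules."""
    rules = []
    while lo <= hi:
        # Largest power-of-two block aligned at lo (the whole space if lo is 0).
        size = (lo & -lo) if lo else (1 << width)
        # Shrink the block until it fits inside what remains of the range.
        while size > hi - lo + 1:
            size >>= 1
        wild = size.bit_length() - 1      # number of trailing 'x' bits
        fixed_bits = width - wild         # number of leading exact bits
        fixed = format(lo >> wild, f"0{fixed_bits}b") if fixed_bits else ""
        rules.append(fixed + "x" * wild)
        lo += size
    return rules

print(range_to_rules(16, 27, 8))
```

Ranges whose end points are poorly aligned produce more rules; the range 1 to 6, for example, needs four.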
Each rule represents a region in the multi-dimensional space. Each search key (representing a packet to be classified) defines a point in this space. Points that fall into one region are classified as members of the associated class. Ambiguity arises when multiple regions overlap each other. A single priority order is defined among the rules to resolve the ambiguity. The rules are numbered from 0 to N−1, and the rule index defines the priority: a lower index means a higher priority. The region with higher priority covers the region with lower priority. In other words, if a packet satisfies both rule[i] and rule[j], it is classified into class[i] if i<j, and into class[j] otherwise.
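The priority rule can be illustrated with a short Python sketch. It assumes rules are kept as ternary strings in a list whose order is the priority order; the names ternary_match and classify and the sample rule list are hypothetical.

```python
def ternary_match(rule, key):
    """True if the key bit string satisfies the ternary rule string."""
    return all(r in ("x", k) for r, k in zip(rule, key))

def classify(rules, key):
    """Return the index of the lowest-numbered matching rule, or None."""
    for index, rule in enumerate(rules):
        if ternary_match(rule, key):
            return index   # first hit has the highest priority
    return None

# Overlapping regions: the key '1000' satisfies all three rules.
rules = ["100x", "1x00", "xxxx"]
print(classify(rules, "1000"))   # rule 0 wins over rules 1 and 2
print(classify(rules, "1100"))   # only rules 1 and 2 match, so rule 1 wins
```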
One advantage of mask matching is its dimension independence. Multiple concatenated fields can be classified with the same method as if they were one wide field. This is accomplished by concatenating the masks and the target strings of the individual fields.
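A minimal sketch of this dimension independence follows. The two fields chosen (an 8-bit protocol field and a 16-bit port field), their widths, and the helper name concat_fields are assumptions made for illustration only.

```python
def concat_fields(fields):
    """Pack (value, width) pairs, most significant field first, into one integer."""
    packed = 0
    for value, width in fields:
        packed = (packed << width) | (value & ((1 << width) - 1))
    return packed

# Per-field rules: protocol must equal 6; port must lie in 80..95 (0000 0000 0101 xxxx).
proto_mask, proto_target = 0xFF, 0x06
port_mask, port_target = 0xFFF0, 0x0050

# Concatenate the masks and the targets exactly as the key fields are concatenated.
wide_mask = concat_fields([(proto_mask, 8), (port_mask, 16)])
wide_target = concat_fields([(proto_target, 8), (port_target, 16)])

def wide_match(proto, port):
    """Match a two-field key against the single wide rule."""
    key = concat_fields([(proto, 8), (port, 16)])
    return (key & wide_mask) == wide_target

print(wide_match(6, 85))    # both fields satisfy their rules
print(wide_match(17, 85))   # protocol field fails
```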
The prior solutions can be grouped into the following categories:
Sequential Match
For each arriving packet, this approach evaluates each rule sequentially until a rule is found that matches all the fields of the search key. While this approach is simple and efficient in use of memory (memory size grows linearly as the size of the rule set increases), this approach is unsuitable for high-speed implementation. The time required to perform a lookup grows linearly with rule set size.
Grid of Tries
The ‘Grid of Tries’ (or Tuple Space Search) uses an extension of the trie data structure to support two fields per search key. This is a good solution for a two-dimensional rule set, but it is not easy to extend the concept to more fields.
The cross-producing scheme is an extension of the ‘Grid of Tries’ that requires a linear search of the database to find the best matching filter. Hence the effectiveness of cross-producing is not clear. The grid of tries approach also requires intensive precomputation time, and its rule update speed is slow.
A scheme based on tries is presented by Douceur et al. in U.S. Pat. Nos. 5,995,971 and 5,956,721. This method utilizes a trie-indexed hierarchy forest (“Rhizome”) that accommodates wildcards for retrieving, given a specific input key, a pattern stored in the forest that is identical to or subsumes the key. This approach has the weakness of not supporting “conflict” between patterns (as stated in lines 21 to 26, column 22 of U.S. Pat. No. 5,995,971). Patterns that partially overlap but do not subsume one another (e.g., patterns “100x” and “1x00”) are in “conflict” and may not be stored in the rhizome defined by the patent, since no defined hierarchical relationship holds for these patterns. In networking applications, such conflicts are widespread in router access lists and firewall policies. This weakness limits the use of this classification scheme.
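The notion of “conflict” can be made concrete with a small Python check on ternary patterns. This sketch only tests the overlap and subsumption relations described above; it does not model the rhizome data structure itself, and the function names are hypothetical.

```python
def overlaps(a, b):
    """True if at least one key matches both ternary patterns."""
    return all(x == y or "x" in (x, y) for x, y in zip(a, b))

def subsumes(a, b):
    """True if every key matching pattern b also matches pattern a."""
    return all(x == "x" or x == y for x, y in zip(a, b))

def in_conflict(a, b):
    """Patterns conflict when they overlap but neither subsumes the other."""
    return overlaps(a, b) and not subsumes(a, b) and not subsumes(b, a)

print(in_conflict("100x", "1x00"))   # both match key 1000, neither contains the other
print(in_conflict("10xx", "100x"))   # '10xx' subsumes '100x', so no conflict
```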
Concurrent Cross Producing
T. V. Lakshman, in “High Speed Policy-Based Packet Forwarding Using Efficient Multi-Dimensional Range Matching”, Proceedings of ACM SIGCOMM'98 Conference, September 1998, presented a hardware mechanism for concurrent matching of multiple fields. For matching in each dimension, this scheme performs a binary search on the projections of regions onto that dimension to find the best matching region. A bit-level parallelism scheme is used to solve the cross-producing problem among dimensions. The memory size required by this scheme grows quadratically and the memory bandwidth grows linearly with the size of the rule set. Because of the computational complexity of the cross-producing operation, this scheme scales poorly. This scheme also requires a time-consuming data structure generation process, hence the rule update speed is slow.
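The bit-level parallelism idea can be sketched roughly as follows: each dimension yields a bit vector marking the rules that match in that dimension, and the vectors are ANDed together. This is a simplified illustration, not the cited mechanism. As assumptions made for brevity, each per-dimension vector is computed here by a linear scan over ternary strings, whereas the cited scheme uses a binary search over range projections; all names are hypothetical.

```python
def dimension_bitmap(field_rules, field_key):
    """Bit i is set if rule i matches the key in this one dimension."""
    bitmap = 0
    for i, rule in enumerate(field_rules):
        if all(r in ("x", k) for r, k in zip(rule, field_key)):
            bitmap |= 1 << i
    return bitmap

def classify_bitparallel(rules, key_fields):
    """rules[i] is a tuple of per-field ternary strings; return best index or None."""
    combined = (1 << len(rules)) - 1          # start with every rule as a candidate
    for d, field_key in enumerate(key_fields):
        combined &= dimension_bitmap([rule[d] for rule in rules], field_key)
    if combined == 0:
        return None
    # The lowest set bit is the lowest rule index, i.e. the highest priority.
    return (combined & -combined).bit_length() - 1

# Three two-field rules, listed in priority order.
rules_2d = [("10xx", "0001"), ("1xxx", "xxxx"), ("xxxx", "00xx")]
print(classify_bitparallel(rules_2d, ("1011", "0001")))
print(classify_bitparallel(rules_2d, ("1100", "0010")))
```

Each bit vector has one bit per rule, which is one reason the memory required by this style of scheme grows quickly with the rule set.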
Ternary CAM
Hardware Ternary CAMs (Content Addressable Memories) can be used for classification. Ternary CAMs store three-valued digits: ‘0’, ‘1’ or ‘X’ (wildcard). The CAMs have good lookup performance and fast rule update time, but the hardware cost (silicon area) and power consumption are high. Moreover, the CAMs require full-custom physical design, which prevents easy migration between different IC technologies. For these reasons, currently available CAMs are typically small.
Recursive Flow Classification
The recursive flow classifier (RFC), as discussed in Pankaj Gupta and Nick McKeown, “Packet Classification on Multiple Fields”, Sigcomm, September 1999, Harvard University, and Pankaj Gupta and Nick McKeown, “Packet Classification using Hierarchical Intelligent Cuttings”, Proc. Hot Interconnects VII, August 99, Stanford, exploits heuristics based on the structure of typical router policy databases (router microflow classifiers, access lists, firewall rules). RFC uses multiple reduction phases, each consisting of a set of parallel memory lookups. Each lookup is a reduction in the sense that the value returned by the memory lookup is shorter (is expressed in fewer bits) than the index of the memory access. The algorithm can support very high lookup speed at a relatively low memory bandwidth requirement. Since it relies on the policy database structure, in the worst case little reduction can be achieved at each step, and the performance becomes nondeterministic. In the normal case, the lookup performance gain is achieved at the cost of large memory size and very long precomputation time. For a large rule set (16K), the RFC precomputation time exceeds the practical limit of a few seconds. In general, RFC is suitable for small classifiers with static rules.
All of the above methods involve, in one form or another, large computation or lookup times. Whichever method is implemented, the cost in time and complexity eventually increases to unacceptable levels. Furthermore, whichever method is used, the whole search space of possible candidate bit patterns is searched for a match with the target bit patterns. What is required is a method which reduces the search space for whichever mask matching method is implemented.