Packet classification is the core mechanism that enables networking devices such as routers and firewalls to perform services such as packet filtering, quality of service, virtual private networks (VPNs), network address translation (NAT), load balancing, traffic accounting and monitoring, and differentiated services (Diffserv). The fundamental problem is to compare each packet with a list of predefined rules, which we call a packet classifier, and find the first (i.e., highest priority) rule that the packet matches. Table 1 shows an example packet classifier of two rules. The format of these rules is based upon the format used in Access Control Lists (ACLs) on Cisco routers. In this paper we use the terms packet classifiers, ACLs, rule lists, and lookup tables interchangeably.
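The first-match semantics described above can be sketched in software as follows. This is only an illustration using rules r1 and r2 of Table 1; the names `Rule`, `classify`, and the packet dictionary layout are hypothetical. It also assumes that a wildcard `*` corresponds to the prefix 0.0.0.0/0, the full port range, or any protocol, and that a single destination address is written as a /32 prefix.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional, Tuple

FULL_PORTS = (0, 65535)  # assumed meaning of "*" for a 16-bit port field


@dataclass(frozen=True)
class Rule:
    name: str
    src: str                # source IP prefix in CIDR form
    dst: str                # destination IP prefix in CIDR form
    sport: Tuple[int, int]  # inclusive source port range
    dport: Tuple[int, int]  # inclusive destination port range
    proto: Optional[str]    # None stands for the wildcard "*"
    action: str

    def matches(self, pkt: dict) -> bool:
        """True if every field of the packet falls inside this rule."""
        return (
            ip_address(pkt["src"]) in ip_network(self.src)
            and ip_address(pkt["dst"]) in ip_network(self.dst)
            and self.sport[0] <= pkt["sport"] <= self.sport[1]
            and self.dport[0] <= pkt["dport"] <= self.dport[1]
            and self.proto in (None, pkt["proto"])
        )


def classify(rules: List[Rule], pkt: dict) -> Optional[str]:
    """Return the action of the first (highest priority) matching rule."""
    for rule in rules:
        if rule.matches(pkt):
            return rule.action
    return None  # no rule matched


# Rules r1 and r2 as read from Table 1.
CLASSIFIER = [
    Rule("r1", "3.2.1.0/24", "192.168.0.1/32", (1, 65534), (1, 65534), "TCP", "accept"),
    Rule("r2", "0.0.0.0/0", "0.0.0.0/0", FULL_PORTS, FULL_PORTS, None, "discard"),
]

PACKET = {"src": "3.2.1.77", "dst": "192.168.0.1", "sport": 1234, "dport": 80, "proto": "TCP"}
print(classify(CLASSIFIER, PACKET))                    # accept (matches r1)
print(classify(CLASSIFIER, {**PACKET, "sport": 0}))    # discard (falls through to r2)
```

The linear scan takes time proportional to the number of rules, which is the cost that a hardware lookup avoids.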
Hardware-based classification using Ternary Content Addressable Memories (TCAMs) is the de facto industry standard. Whereas a traditional random access memory chip receives an address and returns the content of the memory at that address, a TCAM chip does the converse: it receives content and, in constant time (i.e., a few clock cycles), returns the address of the first entry in the TCAM that matches that content. Exploiting this hardware feature, TCAM-based packet classification stores a rule in a TCAM entry as an array of 0's, 1's, or *'s (don't-care values). A packet header (i.e., a search key) matches an entry if and only if their corresponding 0's and 1's match. Given a search key to a TCAM, the circuits compare the key with all its occupied entries in parallel and return the index (or the content, depending on the chip architecture and configuration) of the first matching entry. TCAM-based classification is widely used because of its high speed. Although software-based classification has been extensively studied, these schemes cannot match the wire speed performance of TCAM-based packet classification systems.
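The ternary matching rule can be modeled in a few lines of software. The sketch below is a behavioral model only: the function names and the sample table are hypothetical, entries are written as strings over `0`, `1`, and `*`, and an unoccupied entry is represented by `None`. A real TCAM compares all entries in parallel, whereas this model scans them one by one.

```python
from typing import List, Optional


def ternary_match(entry: str, key: str) -> bool:
    """True if every 0/1 bit of the entry equals the key bit; '*' matches either."""
    if len(entry) != len(key):
        raise ValueError("entry and key must have the same width")
    return all(e == "*" or e == k for e, k in zip(entry, key))


def tcam_lookup(entries: List[Optional[str]], key: str) -> Optional[int]:
    """Return the index of the first occupied entry that matches the key."""
    for index, entry in enumerate(entries):
        if entry is not None and ternary_match(entry, key):
            return index
    return None  # no occupied entry matches


# A hypothetical 4-bit table; index 2 is unoccupied.
TABLE = ["1010", "10**", None, "****"]

print(tcam_lookup(TABLE, "1010"))  # 0: exact entry comes first
print(tcam_lookup(TABLE, "1011"))  # 1: matches 10**
print(tcam_lookup(TABLE, "0111"))  # 3: only the all-wildcard entry matches
```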
Although TCAM-based packet classification is the de facto industry standard because packets can be classified in constant time, the speed and power efficiency of each memory access decrease significantly as TCAM chip capacity increases. Packet classification with a single TCAM lookup is possible because of the parallel search and priority match circuits in a TCAM chip. Unfortunately, because the capacity of the TCAM chip determines the amount and depth of circuitry active during each parallel priority search, there is a significant tradeoff between the capacity of a TCAM chip and the resulting speed and power efficiency of that chip. For example, based on the detailed TCAM power model disclosed by B. Agrawal and T. Sherwood in “Modeling TCAM power for next generation network devices,” in Proc. IEEE International Symposium on Performance Analysis of Systems and Software (2006), a single search on a 36 megabit (Mb) TCAM chip, the largest available, takes 483.4 nanojoules (nJ) and 46.9 nanoseconds (ns), whereas the same search on a 1 Mb TCAM chip takes 17.8 nJ and 2.1 ns.
Building an efficient TCAM-based packet classification system requires careful optimization of the size, speed, and power of TCAM chips. On one hand, there is pressure to use smaller capacity TCAM chips because small TCAM chips consume less power, generate less heat, occupy less line card space, have a lower price, and support faster lookups. TCAM chips consume a large amount of power due to their parallel searching. The power consumed by a TCAM chip is about 1.85 Watts per megabit (Mb), which is roughly 30 times that of a comparably sized SRAM chip. The high power consumption consequently causes TCAM chips to generate a huge amount of heat. TCAM chips also have large die areas: a TCAM chip occupies 6 times (or more) the board space of an equivalent-capacity SRAM chip. The large die area makes TCAM chips very expensive, often costing more than network processors. Although the limited market size may contribute to TCAM's high price, it is not the main reason. Finally, as we noted earlier, smaller TCAM chips support much faster lookups than larger TCAM chips.
On the other hand, there is pressure to use large capacity TCAM chips. The first reason is that encoding packet classification rules into TCAM rules often results in an explosion in the number of rules, which is referred to as the range expansion problem. In a typical classification rule, the fields of source and destination IP addresses and protocol type are specified as prefixes, so they can be directly stored in a TCAM. However, the fields of source and destination port numbers are specified in ranges, which need to be converted to one or more prefixes before being stored in a TCAM. This can lead to a significant increase in the number of TCAM entries needed to encode a rule. For example, 30 prefixes are needed to represent the single range [1, 65534], so 30 × 30 = 900 TCAM entries are required to represent the single rule r1 in Table 1 below.
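The prefix counts quoted above can be reproduced with a short sketch of range-to-prefix conversion. The function name `range_to_prefixes` is hypothetical, and the method shown is a common greedy split of the range into aligned power-of-two blocks, offered as one way to perform the conversion rather than as the procedure used by any particular device.

```python
from typing import List


def range_to_prefixes(lo: int, hi: int, width: int) -> List[str]:
    """Split the inclusive range [lo, hi] of a width-bit field into ternary prefixes."""
    if not 0 <= lo <= hi < (1 << width):
        raise ValueError("range must lie inside the field's domain")
    prefixes = []
    while lo <= hi:
        # Largest aligned block that can start at lo (bounded by lo's trailing zeros).
        size = (lo & -lo) if lo else (1 << width)
        # Shrink the block until it fits inside what is left of the range.
        while size > hi - lo + 1:
            size >>= 1
        stars = size.bit_length() - 1  # number of don't-care bits
        fixed = width - stars          # number of specified leading bits
        bits = format(lo >> stars, "b").zfill(fixed) if fixed else ""
        prefixes.append(bits + "*" * stars)
        lo += size
    return prefixes


port_prefixes = range_to_prefixes(1, 65534, 16)
print(len(port_prefixes))        # 30 prefixes for one port field
print(len(port_prefixes) ** 2)   # 900 entries when both port fields use [1, 65534]
print(range_to_prefixes(0, 65535, 16))  # the full range is a single all-wildcard prefix
```

The product arises because every prefix of the source port range must be paired with every prefix of the destination port range.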
Table 1
Rule  Source IP   Dest. IP     Source Port  Dest. Port  Protocol  Action
r1    3.2.1.0/24  192.168.0.1  [1, 65534]   [1, 65534]  TCP       accept
r2    *           *            *            *           *         discard

The second reason to use large TCAM chips is that packet classifiers are growing rapidly in length and width due to several causes. First, the deployment of new Internet services and the rise of new security threats lead to larger and more complex packet classification rule sets. While traditional packet classification rules usually examine the five standard header fields, new classification applications examine additional fields such as classifier-id, protocol flags, ToS (type of service), switch port numbers, security tags, etc. Second, with the increasing adoption of IPv6, the number of bits required to represent source and destination IP addresses will grow from 64 to 256. The growth of packet classifier length and width puts more demand on TCAM capacity, power consumption, and heat dissipation.
Range reencoding schemes have been proposed to improve the scalability of TCAMs, primarily by mitigating the effect of range expansion. The basic idea is to first reencode a classifier into another classifier that requires less TCAM space and then reencode each packet correspondingly, such that the decision made by the reencoded classifier for the reencoded packet is the same as the decision made by the original classifier for the original packet. Range reencoding has two possible benefits: rule width compression, so that narrower TCAM entries can be used, and rule number compression, so that fewer TCAM entries can be used.
In another aspect of this disclosure, we observe that all previous reencoding schemes suffer from one fundamental limitation: they all ignore the decision associated with each rule and thus the classifier's decision for each packet. Disregarding classifier semantics leads all previous techniques to miss significant opportunities for space compression. Fundamentally different from prior work, we view reencoding as a topological transformation process from one colored hyperrectangle to another, where the color is the decision associated with a given packet. Topological transformation allows us to reencode the entire classifier as opposed to reencoding only the ranges in a classifier. Furthermore, we also view reencoding as a classification process that can be implemented with small TCAM tables, which enables fast packet reencoding. We present two orthogonal, yet composable, reencoding approaches: domain compression and prefix alignment. In domain compression, we transform a given colored hyperrectangle, which represents the semantics of a given classifier, to the smallest possible “equivalent” colored hyperrectangle. This leads to both optimal rule width compression and rule number compression. In prefix alignment, on the other hand, we strive for rule number compression only, by transforming a colored hyperrectangle to an equivalent “prefix-friendly” colored hyperrectangle where the ranges align well with prefix boundaries, minimizing the costs of range expansion.
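To make the idea of a decision-preserving transformation concrete, the following toy sketch compresses the domain of a single 4-bit field. It is not the construction of this disclosure: it assumes a one-dimensional classifier whose last rule covers the whole domain, and it collapses each maximal run of consecutive values that share a decision into one point. All names (`decide`, `compress_domain`, `ORIGINAL`) are hypothetical.

```python
from typing import Dict, List, Tuple

Rule = Tuple[int, int, str]  # inclusive (lo, hi) range and its decision


def decide(rules: List[Rule], value: int) -> str:
    """First-match decision for a single-field packet."""
    for lo, hi, decision in rules:
        if lo <= value <= hi:
            return decision
    raise ValueError("classifier does not cover this value")


def compress_domain(rules: List[Rule], width: int):
    """Collapse each maximal run of equal decisions into one point.

    Returns (mapping, new_rules, new_width), where mapping[v] is the
    reencoded value of the original field value v.
    """
    mapping: Dict[int, int] = {}
    point = -1
    previous = None
    for value in range(1 << width):  # brute force; acceptable for a toy domain
        decision = decide(rules, value)
        if decision != previous:
            point += 1
            previous = decision
        mapping[value] = point
    # Reencode each rule by mapping its endpoints; rule order is preserved.
    new_rules = [(mapping[lo], mapping[hi], d) for lo, hi, d in rules]
    new_width = max(1, point.bit_length())  # bits needed for points 0..point
    return mapping, new_rules, new_width


ORIGINAL = [(3, 5, "accept"), (10, 12, "accept"), (0, 15, "discard")]
MAPPING, COMPRESSED, NEW_WIDTH = compress_domain(ORIGINAL, width=4)

print(COMPRESSED)  # [(1, 1, 'accept'), (3, 3, 'accept'), (0, 4, 'discard')]
print(NEW_WIDTH)   # 3 bits instead of 4

# The reencoded classifier decides every reencoded packet as the original did.
assert all(decide(ORIGINAL, v) == decide(COMPRESSED, MAPPING[v]) for v in range(16))
```

In this toy case the sixteen original values collapse to five points, so the field narrows from 4 bits to 3 while every packet keeps its decision. The packet-side mapping (`MAPPING` here) plays the role of the reencoding lookup that the text describes as a small classification step of its own.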
This section provides background information related to the present disclosure which is not necessarily prior art.