1. Field of the Invention
This invention relates to computer networking. More particularly, the invention relates to the use of network search engines (NSEs) for packet classification and forwarding.
2. Description of the Related Art
Computer networking is generally recognized as the communication of packets across an interconnected network of computers. One objective of networking is to quickly forward the packets from a source to a destination. Thus, one or more forwarding devices may be placed within the network for performing such a function. As used herein, the term “forwarding devices” can be used interchangeably to refer to gateways, bridges, switches, or routers.
A forwarding device typically includes a lookup table (or “routing table”) containing a representation of at least a portion of the network topology, as well as current information about the best known paths (or “routes”) from the forwarding device to one or more destination addresses. For example, a forwarding device may store address prefixes (or “prefix entries”) and next hop identifiers in a routing table. The prefix entries generally represent a group of destination addresses that are accessible through the forwarding device, whereas next hop identifiers represent the next device along the path to a particular destination address. Other information may be stored within the routing table, such as the outgoing port number, paths associated with a given route, time out values and one or more statistics about each route.
When an incoming address is received by a forwarding device, the address is compared to the prefix entries stored within the routing table. If a match occurs, the packet of information associated with the address is sent to an appropriate output port of the forwarding device. As links within the network change, routing protocols sent between the forwarding devices change the prefix entries within the corresponding routing tables. This change will not only modify the prefix entries within the routing table, but also the next-hop identifiers pointed to by those prefix entries. Thus, routing through the forwarding devices can be dynamically changed (i.e., updated) as links go down and come back up in various parts of the network.
The Internet Protocol (IP) is the protocol standard most widely used for packet communication to and from the Internet. Internet Protocol (IP) addresses associated with a packet generally comprise a network field (for identifying a particular network) and a host field (for identifying a particular host on that network). All hosts on the same network will have the same network field but different host fields. The number of bits dedicated to the network and host fields may vary from class to class in a class-based Internet addressing architecture. With the advent of Classless Inter-Domain Routing (CIDR), a classless addressing architecture, the boundary between the network field and the host field may also vary.
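The movable boundary between the network field and the host field can be illustrated with a short sketch. The function name `split_fields` and the sample address are assumptions chosen for illustration only; they are not part of any routing standard or of the invention.

```python
import ipaddress

def split_fields(address: str, prefix_len: int):
    """Return (network_field, host_field) of an IPv4 address as integers.

    prefix_len is the number of leading bits treated as the network field;
    the remaining (32 - prefix_len) bits form the host field.
    """
    addr = int(ipaddress.IPv4Address(address))
    host_bits = 32 - prefix_len
    network_field = addr >> host_bits
    host_field = addr & ((1 << host_bits) - 1)
    return network_field, host_field

# The same address split at two different boundaries.
print(split_fields("192.2.8.64", 16))  # (49154, 2112)
print(split_fields("192.2.8.64", 24))  # (12583432, 64)
```

With a 16-bit boundary the network field covers 192.2 and the host field covers 8.64; with a 24-bit boundary the network field covers 192.2.8 and the host field covers 64.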
In addition to class-based and classless addressing architectures, there are currently several versions of IP addressing. For instance, IP version 4 (IPv4) uses 32-bit addresses, whereas IP version 6 (IPv6) uses 128-bit addresses. If, for example, IPv4 addressing is used, the forwarding device might only consider the first 8, 16 or 24 bits of the 32-bit address field in determining the next hop. The number of bits considered by the forwarding device may be referred to herein as the prefix length (p).
A popular way to determine the next hop is to use a technique known as longest-matching prefix. In this technique, a 32-bit IP address of, for example, 192.2.8.64 is compared against the prefix entries (or “prefixes”) within the routing table. Both 192.0.0.0/8 and 192.2.0.0/16 match this address, but the prefix 192.2.0.0/16 is the longer matching prefix, because its prefix length is 16 bits whereas the prefix length of 192.0.0.0/8 is only 8 bits. When employing the longest-matching prefix technique, the forwarding device will initially consider the first two bytes of the address (192.2*) to determine the next hop to which the packet is sent.
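As a minimal sketch of the longest-matching prefix technique, the following linear scan selects the matching prefix with the greatest prefix length. The table contents, the next hop names (`hop-A`, `hop-B`, `hop-C`) and the function name are hypothetical; a real forwarding device would use a far more efficient structure than a linear scan.

```python
import ipaddress

# Hypothetical routing table: prefix entry -> next hop identifier.
ROUTES = {
    "192.0.0.0/8": "hop-A",
    "192.2.0.0/16": "hop-B",
    "10.0.0.0/8": "hop-C",
}

def longest_prefix_match(address, routes):
    """Return (prefix, next_hop) for the longest matching prefix, or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, next_hop in routes.items():
        net = ipaddress.ip_network(prefix)
        # Keep the entry only if it matches and is longer than the best so far.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best

match = longest_prefix_match("192.2.8.64", ROUTES)
print(match[0], match[1])  # 192.2.0.0/16 hop-B
```

Both 192.0.0.0/8 and 192.2.0.0/16 match the address, and the 16-bit entry wins because its prefix length is greater.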
There are many ways to perform a longest-matching prefix comparison. For example, pointers or hashes may be used to divide the routing table into a plurality of sub-databases, each representing a different route through the network. To locate individual sub-databases, the first few bits of a binary prefix entry can be stored as a pointer within a pointer table. Each pointer entry keeps track of the prefixes within a particular sub-database, and points to subsequent binary entries needed to complete the longest prefix match. Unfortunately, many routes (empty routes) pointed to by the pointer entry may never be used (i.e., never compared with the incoming address). Moreover, while some routes (sparse routes) might seldom be used, other routes (dense routes) are used more often. While pointers will point to possibly hundreds of prefixes within the sub-databases, many sub-databases may be empty or sparse of any prefix entries matching the incoming addresses. Dividing a database of prefixes using precursor pointers, while heuristic, does not assure that the databases will be optimally divided.
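The pointer-based division described above can be sketched as follows, assuming prefixes are written as binary strings and that the first two bits serve as the pointer. The names `build_pointer_table` and `lookup`, the pointer width and the sample prefixes are all illustrative assumptions. The sketch shows how a fixed pointer width can leave some sub-databases empty while others become dense.

```python
POINTER_BITS = 2  # assumed number of leading bits used as the pointer

def build_pointer_table(prefixes, pointer_bits=POINTER_BITS):
    """Group binary prefixes into sub-databases keyed by their leading bits.

    Every prefix is assumed to be at least pointer_bits long.
    """
    table = {format(i, f"0{pointer_bits}b"): [] for i in range(2 ** pointer_bits)}
    for prefix in prefixes:
        table[prefix[:pointer_bits]].append(prefix)
    return table

def lookup(table, address_bits, pointer_bits=POINTER_BITS):
    """Search only the sub-database selected by the pointer bits."""
    candidates = table[address_bits[:pointer_bits]]
    matches = [p for p in candidates if address_bits.startswith(p)]
    return max(matches, key=len, default=None)

prefixes = ["0010", "00110", "001", "0001", "00001", "1011"]
table = build_pointer_table(prefixes)
for pointer, sub_db in table.items():
    print(pointer, len(sub_db))      # sub-database sizes: 5, 0, 1, 0
print(lookup(table, "00110101"))     # 00110
```

Here the "00" sub-database holds five of the six prefixes while "01" and "11" hold none, which is the uneven apportionment the paragraph above describes.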
Another technique used to divide a database may involve the use of a tree (or “trie”) structure. There are many different tree configurations. A simple tree is often referred to as a binary tree, with more complex trees being compressed forms of the binary tree. To search for an address within a tree, the search begins at a root node. Extending from the root node, a “1” pointer or a “0” pointer is followed to the next node, or the next binary bit position, within the tree. If, for example, the address begins with 001*, then the search begins at the root node and proceeds downward to each vertex node, beginning along the “0” branch pointer to the next “0” branch pointer, and finally to the “1” branch pointer. The search will continue until a leaf node is reached or a failure occurs. In some cases, the binary tree may be compressed to enhance the search operation. A Patricia tree is one form of compression, used to shorten branches that have relatively few leaf nodes.
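The bit-by-bit descent described above may be sketched with an uncompressed binary trie. The class and function names, and the next hop labels "A", "B" and "C", are hypothetical. The sketch remembers the last prefix seen along the path, so that a failure deeper in the tree still yields the longest match found so far.

```python
class TrieNode:
    def __init__(self):
        self.children = {}     # "0" or "1" -> TrieNode
        self.next_hop = None   # set when a stored prefix ends at this node

def insert(root, prefix_bits, next_hop):
    """Store a binary prefix (e.g. "001") with its next hop."""
    node = root
    for bit in prefix_bits:
        node = node.children.setdefault(bit, TrieNode())
    node.next_hop = next_hop

def search(root, address_bits):
    """Follow "0"/"1" pointers from the root; return the longest match."""
    node = root
    best = root.next_hop
    for bit in address_bits:
        node = node.children.get(bit)
        if node is None:       # failure: no branch for this bit
            break
        if node.next_hop is not None:
            best = node.next_hop
    return best

root = TrieNode()
insert(root, "0", "B")
insert(root, "001", "A")
insert(root, "0011", "C")
print(search(root, "00100000"))  # A
print(search(root, "00110000"))  # C
```

Each step of the loop must fetch the current node before the next can be located, so a long branch costs one memory access per bit position.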
One disadvantage of the longest-matching prefix search techniques described above is that their algorithms do not take into account that certain sub-databases or branches may rarely be searched while others are predominantly searched. While a tree proves helpful in locating prefixes within the leaf nodes, a precondition of searching a tree is that before the next node can be fetched, the previous nodes must be retrieved. Empty or sparse routes may, therefore, result in a relatively slow search, and thus, a relatively slow lookup operation.
The speed with which a search or lookup operation is performed could be increased if the prefix entries within each node (or searchable sub-database) were more optimally apportioned. Co-pending application Ser. No. 10/402,887 describes a system and method for configuring sub-databases within the overall forwarding database of the routing table. Generally speaking, the co-pending application describes how a forwarding database may be optimally apportioned by placing bounds on the number of prefixes within each sub-database, and bounds on the number of sub-databases within the routing table. By controlling the number of sub-databases and the sizes of the sub-databases, lookup operations are more deterministic, and worst-case lookup times can be guaranteed. Moreover, the bounded number of sub-databases can be more optimally apportioned to a physical device, such as a memory, with dedicated portions of the memory appropriately sized to accommodate a corresponding sub-database. This may ultimately lessen the amount of power consumed by the lookup operation since only one sub-database need be accessed during a particular lookup.
Routing protocols, such as the Border Gateway Protocol (BGP) or the Open Shortest Path First (OSPF) protocol, compute routing tables on the basis of the network topology—e.g., the routers forming the network, the connectivity graph of the intervening links, and the distance between the routers in terms of the number of hops. As used herein, the term ‘routers’ will also be interpreted to include ‘switches’ and any other devices deemed to be “forwarding devices”. Since routing tables are intended to reflect current network conditions, routing tables must be changed or updated as the network topology changes, which happens, e.g., when routers and links fail or come back up. These changes are usually incremental modifications (e.g., adds or withdrawals) to the current routing table at an affected router, and are referred to herein as “route updates”.
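Incremental route updates of the kind described above can be sketched as a sequence of adds and withdrawals applied to an existing table. The tuple layout `(action, prefix, next_hop)`, the function name and the sample routes are assumptions made for illustration; they do not reflect the message format of BGP or OSPF.

```python
def apply_route_updates(table, updates):
    """Apply incremental updates in order and return the modified table.

    table   : dict mapping prefix -> next hop identifier
    updates : iterable of (action, prefix, next_hop) tuples, where action
              is "add" or "withdraw" (next_hop is ignored for withdrawals)
    """
    for action, prefix, next_hop in updates:
        if action == "add":
            table[prefix] = next_hop   # new route, or new next hop for a route
        elif action == "withdraw":
            table.pop(prefix, None)    # withdrawing an unknown route is a no-op
        else:
            raise ValueError(f"unknown update action: {action}")
    return table

routing_table = {"192.0.0.0/8": "hop-A", "192.2.0.0/16": "hop-B"}
updates = [
    ("withdraw", "192.2.0.0/16", None),  # e.g. a link goes down
    ("add", "10.0.0.0/8", "hop-C"),      # e.g. a new route is learned
]
print(apply_route_updates(routing_table, updates))
# {'192.0.0.0/8': 'hop-A', '10.0.0.0/8': 'hop-C'}
```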
To reflect a change in network topology, the following steps may be performed by an affected router (or another “forwarding device”). In a first step, the routing protocol (such as BGP or OSPF) is used to recompute the affected routes. This recomputation is performed by protocol software in the control plane of the affected router(s), and typically uses a shortest path routing algorithm. However, the recomputation may take a substantial amount of time to “converge” (i.e., to return the best match). For example, the performance of the first step may depend on the exact change in network topology and the routing protocol under deployment.
Most modern routers use a different version of the routing table, called a “forwarding table”, which is computed from the routing table by the forwarding software in the control plane, and then downloaded to hardware components in the data plane for faster processing of data packets. Therefore, any changes made to the routing table need to be reflected in the forwarding table in the router hardware. This constitutes a second step in the update process. Data packets passing through the affected router can then use the new routes in the updated forwarding tables.
The performance of the second step generally depends on the mechanism by which the forwarding table is computed and updated from the routing table, and is directly determined from the particular forwarding solution being used. A variety of forwarding solutions are currently used to store and search for routes in the forwarding table. For example, a network search engine (NSE), such as a TCAM-based (Ternary Content Addressable Memory) search engine, may be used for storing and searching through the forwarding table. Other network search engines may be implemented as off-chip memory with either (i) on-chip custom-designed logic, or (ii) software running on a specialized packet processor for implementing one or more forwarding algorithms. An off-the-shelf search engine may also be used for running one or more forwarding algorithms and may include embedded memory for storing routes.
Conventional architectures used for NSEs allow a system designer to trade off certain parameters, such as power consumption, throughput, capacity, update rate and latency, when tailoring the search engine to a particular application. Because these parameters are traded against each other, however, conventional architectures do not permit a system designer to achieve desirable values (such as, e.g., low power consumption, high throughput, high capacity, high update rates and fixed latency) for all of the parameters simultaneously. For example, a conventional TCAM-based NSE may demonstrate relatively high throughput, high update rates and fixed latency, but may also consume large amounts of power in doing so. Likewise, a conventional trie-based algorithmic NSE may have to sacrifice capacity to maintain high update rates and fixed latency. None of the conventional methods (whether algorithmic or not) is able to achieve high performance in all of the above-mentioned parameters simultaneously.
It would be desirable, therefore, to provide an NSE architecture that could simultaneously achieve low power, high capacity (including, e.g., high worst case capacity for specific applications of interest), high search throughput, high update rates and fixed search latency (for all search key widths). Conventional methods simply cannot achieve desirable values for all of the parameters mentioned above.