Electronic communication networks comprise network routers that are capable of sending and receiving electronic data in packets. Each network router directs an incoming information packet to the next neighboring router that is on a route of the intended destination of the packet. Each network router has to perform prefix lookup operations on a routing table in order to determine the appropriate “next hop” address for the packet according to the destination IP (Internet Protocol) prefix of the packet.
The prefix lookup operations are done either by a network processor or, more commonly, by a separate device that is referred to as a Network Search Engine. The task of a network processor or a Network Search Engine is to maintain and perform searches on a routing table that consists of destination prefixes and their associated “next hop” information. An exemplary prior art router system 100 is shown in FIG. 1. A packet enters an Ingress Unit 110 and is passed to a Network Processing Unit (NPU) 120. NPU 120 is coupled to Back Plane Unit 130 and to Network Search Engine (NSE) 140. NPU 120 sends a search key for the packet to NSE 140. NSE 140 performs a search of a routing table (not shown) within NSE 140 and returns the “next hop” information to NPU 120. NPU 120 then sends the packet to its “next hop” destination through Egress Unit 150.
Various types of network search engines exist that are capable of performing the task of searching a routing table. The present invention is directed to improvements in a network search engine of the type that uses pipelined hardware employing multiple banks of memories to implement bitmapped multi-bit trie search algorithms. FIG. 2 illustrates a block diagram of a prior art pipelined hardware bitmapped multi-bit trie algorithmic network search engine 200. Network search engine 200 comprises an input interface 210, an initial logic unit 220, a plurality of pipelined logic units (230, 240), a plurality of memory banks (250, 260, 270) and an output interface 280.
In a typical pipelined hardware network search engine such as that shown in FIG. 2, a search on a search key is done in stages. The pipeline logic in each stage of the pipelined logic units (230, 240) processes some portion of the search key. As the bits in the search key are examined, a decision is made (1) to terminate the search because either a final match is found or no match is found, or (2) to continue to the next stage. The search is continued to the next stage by generating and sending an address to an associated memory bank and performing a memory read.
If the decision is to continue the search, the data that is read from the memory is sent to the pipeline logic unit of the next stage and the next portion of the search key is processed. The search process continues until either a final match is found or no match is found.
Various software algorithms have been developed to reduce the amount of memory that must be used to store the routing tables and to reduce the number of memory accesses that must be made during lookup operations. A “trie” is a digital search tree data structure and algorithm that represents binary strings in which the bits in a string determine the direction of the branches of the search tree. The term “trie” is taken from the middle four letters of the word “retrieval.” A trie algorithm hierarchically organizes the destination IP prefixes (according to the numeric value of the prefixes) into an easily searchable tree structure.
Each node of a binary trie has at most two branches, while a multi-bit trie consumes multiple bits at a time and has several branches per node. Each branch of a multi-bit trie leads to the next level. The number of bits consumed at each level of a multi-bit trie is referred to as a “stride.” A uniform width stride trie is a trie in which all of the strides have the same width (except for the last stride, whose width is the remainder of the prefix length divided by the stride width). A multi-bit trie algorithm works by storing and retrieving the prefixes in a uniform width stride trie or in a variable width stride trie.
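The division of a prefix into uniform width strides can be illustrated with a minimal sketch. The function name `split_into_strides` and the representation of a prefix as a string of '0' and '1' characters are assumptions made only for this illustration and do not appear in the prior art described here.

```python
def split_into_strides(prefix_bits: str, stride_width: int) -> list:
    """Split a prefix into uniform width strides.

    prefix_bits is a string of '0'/'1' characters, most significant bit first
    (an illustrative representation). Every stride has stride_width bits except
    possibly the last one, which holds whatever bits remain.
    """
    if stride_width <= 0:
        raise ValueError("stride_width must be positive")
    if any(bit not in "01" for bit in prefix_bits):
        raise ValueError("prefix_bits must contain only '0' and '1'")
    return [
        prefix_bits[i:i + stride_width]
        for i in range(0, len(prefix_bits), stride_width)
    ]


# A 10-bit prefix with a stride width of 4 yields two full strides
# and a final 2-bit stride.
print(split_into_strides("1100101011", 4))  # ['1100', '1010', '11']
```

A variable width stride trie would replace the single `stride_width` argument with one width per level; that case is not shown here.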
The multi-bit trie bitmap algorithm groups all branches in the same level with the same “parent” stride value in a table. This is called a “trie table.” If the prefix is divided into an array of n-bit strides, the maximum possible entries in the next level trie table is 2^n. The next level trie table is sometimes referred to as a “child” trie table. The algorithm encodes all next level stride values from the same parent into a 2^n-bit data field and stores it in the entry in the parent trie table, along with the base address of the next level (“child”) trie table. The data structure storing this information is called a “trie-node.”
Table compression is achieved by allocating memory for the actual number of entries that exist, instead of the maximum size of 2^n. For the last stride of each prefix, a similar type of data structure is used, except in this case the pointer is pointing to a table containing “next hop” information, instead of a next level trie table. This type of entry is called an “end-node.”
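The bitmap encoding and the resulting table compression can be sketched as follows. This is a minimal illustration under stated assumptions: bit v of the data field is taken to represent stride value v, and the position of an entry in the compressed child table is taken to be the number of set bits below v. Neither convention is specified in the description above, and the names `TrieNode`, `make_bitmap`, and `child_index` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class TrieNode:
    bitmap: int      # 2^n-bit data field; bit v set means stride value v exists
    child_base: int  # base address of the compressed child trie table


def make_bitmap(stride_values: Iterable[int], stride_width: int) -> int:
    """Encode the next level stride values that exist into one bitmap."""
    bitmap = 0
    for value in stride_values:
        if not 0 <= value < (1 << stride_width):
            raise ValueError("stride value out of range")
        bitmap |= 1 << value
    return bitmap


def child_index(bitmap: int, value: int) -> Optional[int]:
    """Return the offset of 'value' in the compressed child table, or None.

    Assumed convention: the offset is the count of set bits below 'value',
    so only entries that actually exist occupy memory.
    """
    if not (bitmap >> value) & 1:
        return None
    below = bitmap & ((1 << value) - 1)
    return bin(below).count("1")


# With 4-bit strides the child table could hold 16 entries, but only the
# three stride values that exist are allocated, at offsets 0, 1 and 2.
node = TrieNode(bitmap=make_bitmap([1, 5, 12], 4), child_base=0x100)
print([child_index(node.bitmap, v) for v in (1, 5, 12, 3)])  # [0, 1, 2, None]
```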
Routing table lookup is also performed in same-width strides. The value in the next level stride is decoded and processed with the data field in its parent table entry. If it is determined that a stored route with the same stride value exists, an index is calculated from that data field. The table pointer and this index form an address leading to the next level trie table entry and the search continues. If a match is not found, the search terminates without success. If a search reaches an end-node and a match is found, the search is a success and the associated “next hop” information is read from the “next hop” table.
In a typical routing table, other prefixes with common high order bits (strides) will share parent trie tables. This reduces the amount of memory required to store the prefixes. Also, in a typical routing table there are sometimes many single entry intermediate trie tables (non-leaf tables). This happens when a number of prefixes share a string of high order bits (strides). During lookup operations, this series of trie tables is accessed, one table after another, until a matching leaf entry is found or until an end of the link is encountered.
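The lookup steps described above can be sketched in software as a loop over the strides of the search key. This is an illustration only and not the hardware pipeline: the `Entry` type, the `lookup` function, the use of Python lists as the trie table memory and the “next hop” table, and the set-bit-count index calculation are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    bitmap: int                # which next level stride values exist
    base: int                  # base of the child table, or of the next hop table
    is_end_node: bool = False  # True when 'base' refers to the next hop table


def offset_of(bitmap: int, value: int) -> int:
    """Assumed index calculation: count of set bits below 'value'."""
    return bin(bitmap & ((1 << value) - 1)).count("1")


def lookup(root: Entry, entries: List[Entry], next_hops: List[str],
           strides: List[int]) -> Optional[str]:
    """Walk one stride per level; return next hop information or None."""
    entry = root
    for value in strides:
        if not (entry.bitmap >> value) & 1:
            return None                      # no stored route: search fails
        addr = entry.base + offset_of(entry.bitmap, value)
        if entry.is_end_node:
            return next_hops[addr]           # match at an end-node: success
        entry = entries[addr]                # continue to the next level
    return None                              # key exhausted before an end-node


# Two routes with 2-bit strides: strides (2, 1) -> hop-A and (2, 3) -> hop-B.
root = Entry(bitmap=1 << 2, base=0)
entries = [Entry(bitmap=(1 << 1) | (1 << 3), base=0, is_end_node=True)]
next_hops = ["hop-A", "hop-B"]
print(lookup(root, entries, next_hops, [2, 3]))  # hop-B
print(lookup(root, entries, next_hops, [1, 1]))  # None
```

In a pipelined device each iteration of the loop would correspond to one pipeline stage and one memory read, rather than to a sequential software step.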
In prior art software-based packet lookup devices such as a network processor unit, a method called “path compression” is used to reduce the number of levels of trie tables that must be accessed in the case of single entry trie tables. Path compression works by replacing a series of non-leaf single entry trie tables with the actual prefix patterns (strides) that the bitmaps in the trie tables represent, and placing those patterns into one location along with their binary lengths (skip counts). During the search, the network processor unit can therefore perform a single memory access to retrieve the prefix pattern and determine the matching status of a multiple “stride-ful” of bits, instead of performing multiple memory accesses to a series of trie tables to check the same number of strides. This approach, however, has not been used in a pipelined hardware based device because of its apparent inconsistency with the normal pipeline flow.
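The idea of path compression can be illustrated with a minimal software sketch. The functions `compress_path` and `matches_compressed`, and the representation of the search key as an integer with an explicit bit length, are assumptions introduced for this example; they are not taken from any particular prior art device.

```python
from typing import List, Tuple


def compress_path(stride_values: List[int], stride_width: int) -> Tuple[int, int]:
    """Pack a chain of single entry strides into (pattern, skip_count).

    The pattern is the concatenation of the stride values, and the skip
    count is its length in bits.
    """
    pattern = 0
    for value in stride_values:
        if not 0 <= value < (1 << stride_width):
            raise ValueError("stride value out of range")
        pattern = (pattern << stride_width) | value
    return pattern, len(stride_values) * stride_width


def matches_compressed(key: int, key_len: int, offset: int,
                       pattern: int, skip_count: int) -> bool:
    """Compare skip_count key bits, starting 'offset' bits from the most
    significant bit, against the stored pattern in a single step."""
    if offset + skip_count > key_len:
        return False
    shift = key_len - offset - skip_count
    segment = (key >> shift) & ((1 << skip_count) - 1)
    return segment == pattern


# Three single entry 4-bit strides collapse into one 12-bit pattern, so one
# comparison replaces three trie table accesses.
pattern, skip = compress_path([0xA, 0x3, 0xF], 4)
print(hex(pattern), skip)                                  # 0xa3f 12
print(matches_compressed(0xCA3F5, 20, 4, pattern, skip))   # True
print(matches_compressed(0xCA2F5, 20, 4, pattern, skip))   # False
```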
Consider a prior art pipelined hardware search engine that does not use “path compression.” The method employed by such a search engine for handling single entry trie tables wastes memory space, memory bandwidth, and power. For example, consider that the content of a trie table typically comprises a table header and one or more table entries. The table header holds a backtrack pointer (an address of the parent trie table entry). In the case in which a single entry trie table exists, the parent entry that points to a single entry table holds a bitmap with only one set-bit. The child trie table consists of one table header and one data entry. In a series of single entry trie tables, each subsequent child table's bitmap also has one set-bit. A prior art pipelined hardware search engine uses memory space, memory bandwidth, and power to handle these types of single entry trie tables. If these tables could be eliminated, the memory accesses to the otherwise single entry tables would also be eliminated. This would result in a significant savings in memory space, memory bandwidth, and power.
Therefore, there is a need in the art for an apparatus and method for optimizing path compression of single entry trie tables. There is a need in the art for an apparatus and method for saving memory space occupied by single entry trie tables and for saving memory bandwidth and power associated with accessing single entry trie tables in pipelined hardware network search engines.