Linear sorting techniques may be utilized to arrange a plurality of search prefixes (a/k/a search “keys”) within an integrated circuit search engine device. One such linear sorting technique is based on the starting address of a prefix range associated with each search prefix. In the event a plurality of the search prefixes have the same starting address but different prefix lengths, then a search prefix with a shorter prefix length may be treated as being “less than” a search prefix with a longer prefix length. One example of a plurality of 8-bit search prefixes is illustrated by TABLE 1.
The search prefixes in TABLE 1 may be sorted linearly by prefix value and prefix length, as shown in FIG. 1, with the smallest search prefix (e.g., A:0/0) located on the left side of the array 10 and the largest search prefix (e.g., M:248/5) located on the right side of the array 10. To perform a linear search (i.e., lookup) operation, an applied search key is compared with every search prefix in the array 10, starting with the search prefix on the left side of the array 10, until a search prefix is found with a start address that is greater than the applied search key. Each search prefix in the array 10 that matches the applied search key is a potential longest prefix match. Once the search operation terminates at the right side of the array 10 (or at a search prefix with a start address that is greater than the applied search key), the rightmost search prefix that matches the search key is treated as the longest prefix match (LPM).
TABLE 1
ID   KEY      ID   KEY      ID   KEY
A    0/0      I    240/4    Q    168/6
B    0/1      J    128/2    R    170/8
C    0/2      K    208/5    S    120/5
D    0/3      L    128/1    T    0/5
E    0/4      M    248/5    U    192/2
F    144/4    N    160/4    V    64/2
G    192/3    O    96/3
H    224/3    P    112/4
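The linear lookup described above can be sketched in software as follows. This is only an illustrative model, not the circuit of FIG. 1: the names `parse`, `matches` and `linear_lpm` are hypothetical, and the sketch assumes that each prefix value in TABLE 1 is also the start address of its range.

```python
# Search prefixes of TABLE 1 written as "ID:value/length" (8-bit values).
TABLE_1 = ("A:0/0 B:0/1 C:0/2 D:0/3 E:0/4 F:144/4 G:192/3 H:224/3 "
           "I:240/4 J:128/2 K:208/5 L:128/1 M:248/5 N:160/4 O:96/3 P:112/4 "
           "Q:168/6 R:170/8 S:120/5 T:0/5 U:192/2 V:64/2")
WIDTH = 8  # key width in bits


def parse(table):
    """Return (value, length, id) tuples sorted by start address, then length."""
    entries = []
    for item in table.split():
        ident, key = item.split(":")
        value, length = key.split("/")
        entries.append((int(value), int(length), ident))
    return sorted(entries)


def matches(key, value, length):
    """True if the top `length` bits of the key equal those of the prefix."""
    shift = WIDTH - length
    return (key >> shift) == (value >> shift)


def linear_lpm(array, key):
    """Scan left to right and stop at the first start address above the key."""
    best = None
    for value, length, ident in array:
        if value > key:
            break
        if matches(key, value, length):
            best = ident  # rightmost match seen so far
    return best


ARRAY = parse(TABLE_1)
print(linear_lpm(ARRAY, 171))
```

Because shorter prefixes sort before longer prefixes that share a start address, the last match recorded before the scan stops is also the longest one.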
This search operation is an iterative process, with each search prefix being compared in sequence with the applied search key. As illustrated by FIG. 2, this process can also be implemented in a hardware-based array 20, by simultaneously comparing the applied search key (e.g., 171) to all of the search prefixes within the array 20, using a plurality of comparators 22 that generate match and non-match signals. In particular, each match between the applied search key and a search prefix results in the generation of a match signal (M) and each non-match results in the generation of a “less than” signal (LT) or a “greater than” signal (GT). The comparators 22 may generate these signals as two-bit binary signals (e.g., M=11b, LT=01b, and GT=10b). The longest prefix match is represented by the search prefix associated with the rightmost match signal M, which in FIG. 2 is represented by the search prefix Q:168/6. This longest prefix match may be identified using a priority encoder (not shown) that is configured to receive the signals generated by the comparators 22.
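A minimal software model of the comparator array and priority encoder might look like the following. It is a sketch under stated assumptions: the function names are hypothetical, the list order is the sorted order of TABLE 1, and LT is assumed to mean that the prefix start address is less than the applied key (the text does not say which operand the signal refers to).

```python
WIDTH = 8                        # key width in bits
M, LT, GT = 0b11, 0b01, 0b10     # two-bit signal encodings given in the text

# TABLE 1 prefixes in sorted (start address, length) order.
SORTED = ("A:0/0 B:0/1 C:0/2 D:0/3 E:0/4 T:0/5 V:64/2 O:96/3 P:112/4 "
          "S:120/5 L:128/1 J:128/2 F:144/4 N:160/4 Q:168/6 R:170/8 "
          "U:192/2 G:192/3 K:208/5 H:224/3 I:240/4 M:248/5")


def load(text):
    """Parse "ID:value/length" items into (id, value, length) tuples."""
    out = []
    for item in text.split():
        ident, key = item.split(":")
        value, length = key.split("/")
        out.append((ident, int(value), int(length)))
    return out


def comparator(key, value, length):
    """Model one comparator: M on a prefix match, else LT or GT (assumed sense)."""
    shift = WIDTH - length
    if (key >> shift) == (value >> shift):
        return M
    return LT if value < key else GT


def parallel_lpm(array, key):
    """Evaluate every comparator, then pick the rightmost M like a priority encoder."""
    signals = [comparator(key, value, length) for _, value, length in array]
    for idx in range(len(signals) - 1, -1, -1):
        if signals[idx] == M:
            return array[idx][0], signals
    return None, signals


winner, signals = parallel_lpm(load(SORTED), 171)
print(winner, [format(s, "02b") for s in signals])
```

The list comprehension stands in for comparisons that the hardware performs at the same time; only the final encoder step depends on position within the array.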
Conventional network routing applications may also utilize tree data structures to support search operations within an integrated circuit device. These tree data structures may include b−tree structures that are kept balanced to prevent one or more branches of the tree from becoming longer than other branches of the tree and thereby increasing search latency. FIG. 3 illustrates a three-level b−tree data structure 30 containing the search prefixes of TABLE 1 and the array 20 of FIG. 2. This b−tree 30 is illustrated as including six leaf nodes at Level 2 (i.e., Nodes 2-0, 2-1, 2-2, 2-4, 2-5 and 2-6), two nodes at Level 1 (Node 1-0 and 1-1) and a root node at Level 0 (Node 0-0).
As illustrated by the highlighted search path, a search of the b−tree using 171 as an applied search key begins at Node 0-0. The search prefix J at Node 0-0 represents a match with the search key 171 because 171 (i.e., 10101011b) is a match with 128/2 (i.e., 10XXXXXX) where X represents a “don't-care” value. The search then proceeds to Node 1-1 (i.e., along a right-side branch from Node 0-0 to Node 1-1) because 171 is greater than 128. No matches are present at Node 1-1 because the search key 171 (i.e., 10101011b) does not match either the search prefix R:170/8 (10101010b) or the search prefix H:224/3 (i.e., 111XXXXX). Because the search key 171 is greater than 170 and less than 224, the search then proceeds to and terminates at Node 2-5, which is a leaf node of the b−tree 30. None of the search prefixes U:192/2, G:192/3 or K:208/5 at Node 2-5 represent a match with the search key 171. Thus, based on the illustrated search path, which traverses Nodes 0-0, 1-1 and 2-5 of the b−tree 30, only search prefix J:128/2 represents a matching entry with the search key 171. However, as illustrated best by FIG. 2, the search prefix Q:168/6, which resides at Node 2-4 of FIG. 3, actually represents the longest prefix match with the search key 171, yet this search prefix was not within the search path and was not detected during the search operation. Moreover, the search prefixes A:0/0, L:128/1 and N:160/4 also represent matches that were not within the search path. This means that the conventional sorting of prefixes within the b−tree 30 of FIG. 3 will not yield correct results for all applied search keys.
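The shortfall can be checked with a short sketch that compares the prefixes named on the search path with the matching prefixes named off the path. The node contents below include only the prefixes that the text mentions, so they are an assumption about FIG. 3 rather than a full copy of it, and `matches` and `longest` are hypothetical helper names.

```python
WIDTH = 8  # key width in bits


def matches(key, value, length):
    """True if the top `length` bits of the key equal those of the prefix."""
    shift = WIDTH - length
    return (key >> shift) == (value >> shift)


# Prefixes visited on the highlighted path, as named in the text.
PATH = {
    "Node 0-0": [("J", 128, 2)],
    "Node 1-1": [("R", 170, 8), ("H", 224, 3)],
    "Node 2-5": [("U", 192, 2), ("G", 192, 3), ("K", 208, 5)],
}
# Matching prefixes that the text says lie outside the path.
OFF_PATH = [("A", 0, 0), ("L", 128, 1), ("N", 160, 4), ("Q", 168, 6)]


def longest(prefixes, key):
    """Return the matching prefix with the greatest length, or None."""
    hits = [p for p in prefixes if matches(key, p[1], p[2])]
    return max(hits, key=lambda p: p[2], default=None)


on_path = [p for node in PATH.values() for p in node]
found = longest(on_path, 171)              # what the tree search reports
actual = longest(on_path + OFF_PATH, 171)  # what a full comparison reports
print(found, actual)
```

The two results differ, which is the incorrect outcome the paragraph above attributes to conventional sorting of prefixes within the b−tree.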
Another example of a b−tree data structure is described in U.S. Pat. No. 6,490,592, which is assigned to Nortel Networks Limited. As described at Col. 1 of the '592 patent, conventional b−tree data structures may not be well suited for search operations that require identification of longest prefix matches (LPMs) within the b−tree data structure. To address this limitation, the '592 patent describes a modified b−tree data structure that is arranged so that data elements stored therein, which have no overlapping prefixes, are arranged in a standard b−tree structure. However, other data elements that have overlapping prefixes are arranged in a modified structure so that the prefix of such a data element contains the prefixes of all such data elements that succeed it in the b−tree. This modified structure is referred to as an L-structure. FIG. 3 of the '592 patent shows portions 300 and 340 that include a b−tree into which an L-structure 320 is inserted. Unfortunately, the use of L-structures within a b−tree may represent a form of prefix nesting that reduces the likelihood of achieving ideal b−tree properties that typically reduce search latency and result in efficient utilization of memory space. In particular, for a fixed memory capacity and latency, which is related to tree height, the number of search prefixes that can be supported within the b−tree of the '592 patent is statistically dependent on the degree of nesting within the prefix data set supported by the b−tree. Accordingly, prefix data sets that require a high degree of nesting may result in an inefficient utilization of the memory space that is required to maintain the b−tree.
A network address processor that supports longest prefix match lookup operations is disclosed in U.S. Pat. No. 7,047,317 to Huie et al. In particular, FIGS. 2-3 of the '317 patent illustrate a lookup engine that supports an M-way tree data structure. This data structure includes a plurality of lookup tables, with each lower stage table providing an index to a key within a next higher stage table.
An additional type of b−tree data structure includes a b*tree data structure, which can require non-root nodes to be at least ⅔ full at all times. To maintain this fill requirement, a sibling node is not immediately split whenever it is full. Instead, keys are first shared between sibling nodes before node splitting is performed. Only when all sibling nodes within a group are full does a node splitting operation occur upon insertion of a new search key. FIG. 12 illustrates a conventional three-level b*tree data structure. These three levels are illustrated as L0, L1 and L2, where L0 is treated as the root level and L2 is treated as a leaf level. Level L1 is an intermediate level, which is a child relative to the root level and a parent relative to the leaf level. As will be understood by those skilled in the art, a b*tree of type N:(N+1) (i.e., 2:3, 3:4, 4:5, . . . ) requires all non-root nodes to be between N/(N+1) and 100% capacity (i.e., 67%, 75%, 80%, . . . up to 100%) before and after an insert or delete operation has been fully performed. The b*tree of FIG. 12 is a 3:4 tree, with four key locations per node (i.e., M=4).
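The fill requirement for an N:(N+1) b*tree reduces to simple arithmetic, sketched below. The function names are hypothetical, and rounding the minimum key count up to a whole key is an assumption for node sizes that do not divide evenly.

```python
def min_fill(n):
    """Minimum fill fraction for a b*tree of type N:(N+1)."""
    return n / (n + 1)


def min_keys(n, slots):
    """Fewest keys allowed in a non-root node with `slots` key locations (rounded up)."""
    return -(-slots * n // (n + 1))  # integer ceiling of slots * n / (n + 1)


for n in (2, 3, 4):
    print(f"{n}:{n + 1} tree -> at least {round(100 * min_fill(n))}% full")

print(min_keys(3, 4))  # the 3:4 tree with M=4 key locations per node
```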
FIG. 13A illustrates a portion of a b*tree with excess capacity having three sibling nodes at a leaf level and a parent node (at the root level) containing the search keys A-K, which represent numeric search key values. The leftmost sibling node contains the search keys A, B and C, the middle sibling node contains the search keys E, F and G and the rightmost sibling node contains the search keys I, J and K. The parent node contains the search keys D and H. These sibling nodes are at 75% capacity, which meets the requirement that all non-root nodes be between N/(N+1) and 100% capacity for a 3:4 type b*tree, where N=3. As illustrated by FIG. 13B, an insertion of the key L into the b*tree of FIG. 13A increases the rightmost sibling node to full capacity without affecting the other two sibling nodes. The additional insertion of key M into the rightmost sibling node in the b*tree of FIG. 13B causes the transfer of key I to the parent node and the transfer of key H from the parent node to the middle sibling node, as illustrated by FIG. 13C.
FIG. 13D illustrates the further insertion of key N into the rightmost sibling node, which causes an overflow that ripples through the parent and middle sibling nodes into the leftmost sibling node, which is now at full capacity. In FIG. 13E, a split between the sibling nodes and an increase in population of the parent node occurs in response to the further insertion of key O into the rightmost sibling node. This split from three to four sibling nodes is necessary to maintain a capacity of all non-root nodes in a range from 75% to 100% capacity, for N=3.
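The key sharing of FIGS. 13A-13D can be modeled with a small sketch in which an overflowing node passes one key through the parent to a neighboring sibling. The function name and the rule of preferring a leftward ripple are assumptions made for illustration; the sketch returns False when every sibling is already full, which is the condition that calls for a split as in FIG. 13E.

```python
from bisect import insort

CAPACITY = 4  # key locations per node (M=4 in the 3:4 example)


def insert_with_sharing(siblings, parent, key):
    """Insert a key, rippling any overflow through the parent to a sibling.

    Returns True if the siblings absorb the key, or False if all are full.
    """
    if all(len(node) == CAPACITY for node in siblings):
        return False
    # Choose the sibling whose key range covers the new key.
    i = 0
    while i < len(parent) and key > parent[i]:
        i += 1
    insort(siblings[i], key)
    while len(siblings[i]) > CAPACITY:
        if any(len(node) < CAPACITY for node in siblings[:i]):
            # Ripple left: separator drops into the left sibling,
            # and this node's smallest key replaces it in the parent.
            siblings[i - 1].append(parent[i - 1])
            parent[i - 1] = siblings[i].pop(0)
            i -= 1
        else:
            # Ripple right: separator drops into the right sibling,
            # and this node's largest key replaces it in the parent.
            siblings[i + 1].insert(0, parent[i])
            parent[i] = siblings[i].pop()
            i += 1
    return True


# State of FIG. 13A as described in the text.
siblings = [list("ABC"), list("EFG"), list("IJK")]
parent = list("DH")
for k in "LMN":
    insert_with_sharing(siblings, parent, k)
print(siblings, parent)
```

After keys L, M and N are inserted, all three siblings are full, so a further call with key O returns False rather than modifying the nodes.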
FIGS. 14A-14D illustrate three insertion examples that result in the splitting of sibling nodes having no excess capacity. As illustrated by FIG. 14A, the insertion of any additional key (#) into a b*tree with sibling nodes at full capacity results in a split among the sibling nodes and a repopulation of these nodes at equivalent levels (shown at 75%). In FIG. 14B, the insertion of key D+ into the leftmost sibling node results in a split that causes keys D, G and K to move to the parent node (displacing keys E and J) and a grouping of keys D+, E and F together in a sibling node. In FIG. 14C, the insertion of key I+ into the middle sibling node results in a split that causes keys D, H and K to move to the parent node (displacing keys E and J) and a grouping of keys I, I+ and J together in a sibling node. Finally, in FIG. 14D, the insertion of key N+ into the rightmost sibling node results in a split that causes keys D, H and L to move to the parent node (displacing keys E and J) and a grouping of keys M, N and N+ together in a rightmost sibling node. Thus, as illustrated by FIGS. 14B-14D, the value of the search key to be inserted into sibling nodes having no excess capacity influences the nature of the overflow and regrouping of keys during an operation to split the sibling nodes. This means that conventional hardware to perform insert operations may need to account for every possible insert location amongst the plurality of sibling nodes.
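The regrouping described for FIGS. 14B-14D is consistent with an even redistribution of all keys across one additional sibling node, as sketched below. This is a software illustration only, not the hardware referred to above: the function name is hypothetical, and the starting state (siblings A-D, F-I and K-N with parent keys E and J) is inferred from the displaced keys named in the text.

```python
def split_full_siblings(siblings, parent, key):
    """Split full siblings into one more node and spread the keys evenly.

    Returns (new_siblings, new_parent); the inputs are left unmodified.
    """
    keys = sorted([k for node in siblings for k in node] + parent + [key])
    count = len(siblings) + 1                     # siblings after the split
    per_node = (len(keys) - (count - 1)) // count
    new_siblings, new_parent = [], []
    pos = 0
    for j in range(count):
        new_siblings.append(keys[pos:pos + per_node])
        pos += per_node
        if j < count - 1:
            new_parent.append(keys[pos])          # separator moves to the parent
            pos += 1
    return new_siblings, new_parent


FULL = [list("ABCD"), list("FGHI"), list("KLMN")]  # assumed starting siblings
PARENT = ["E", "J"]

for new_key in ("D+", "I+", "N+"):
    nodes, seps = split_full_siblings(FULL, PARENT, new_key)
    print(new_key, seps, nodes)
```

In this model, the parent keys and sibling groupings come out different for each of the three insert positions, which illustrates why the insert location matters during a split.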