Linear sorting techniques may be utilized to arrange a plurality of search prefixes (a/k/a search “keys”) within an integrated circuit search engine device. One such linear sorting technique is based on the starting address of a prefix range associated with each search prefix. In the event a plurality of the search prefixes have the same starting address but different prefix lengths, then a search prefix with a shorter prefix length may be treated as “less than” a search prefix with a longer prefix length. One example of a plurality of 8-bit search prefixes is illustrated by TABLE 1.
The search prefixes in TABLE 1 may be sorted linearly as shown in FIG. 1, with the smallest search prefix having the shortest prefix length (e.g., A:0/0) located on the left side of the array 10 and the largest search prefix having the longest prefix length (e.g., M:240/5) located on the right side of the array 10. To perform a linear search (i.e., lookup) operation, an applied search key is compared with every search prefix in the array 10, starting with the search prefix on the left side of the array 10, until a search prefix is found with a start address that is greater than the applied search key. Each search prefix in the array 10 that matches the applied search key is a potential longest prefix match. Once the search operation terminates at the right side of the array 10 (or at a search prefix with a start address that is greater than the applied search key), the rightmost search prefix that matches the search key is treated as the longest prefix match (LPM).
TABLE 1
ID    KEY
A     0/0
B     0/1
C     0/2
D     0/3
E     0/4
F     144/4
G     192/3
H     224/3
I     240/4
J     128/2
K     208/5
L     128/1
M     248/5
N     160/4
O     96/3
P     112/4
Q     168/6
R     170/8
S     120/5
T     0/5
U     192/2
V     64/2
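The sorting rule and the linear search described above can be illustrated with a minimal Python sketch. It is not taken from any described device; the names `TABLE_1`, `matches`, `sort_prefixes` and `linear_lpm` are hypothetical, and the prefix values are those listed in TABLE 1. Each prefix is held as a (start address, prefix length) pair, so ordinary tuple ordering sorts by start address first and treats a shorter prefix length as "less than" a longer one when start addresses are equal.

```python
WIDTH = 8  # the example prefixes are 8 bits wide

# (start address, prefix length) pairs as listed in TABLE 1
TABLE_1 = {
    "A": (0, 0), "B": (0, 1), "C": (0, 2), "D": (0, 3), "E": (0, 4),
    "F": (144, 4), "G": (192, 3), "H": (224, 3), "I": (240, 4),
    "J": (128, 2), "K": (208, 5), "L": (128, 1), "M": (248, 5),
    "N": (160, 4), "O": (96, 3), "P": (112, 4), "Q": (168, 6),
    "R": (170, 8), "S": (120, 5), "T": (0, 5), "U": (192, 2), "V": (64, 2),
}

def matches(key, prefix):
    """True if the top `length` bits of key equal those of the prefix."""
    value, length = prefix
    mask = (0xFF << (WIDTH - length)) & 0xFF  # length 0 gives an all-zero mask
    return (key & mask) == (value & mask)

def sort_prefixes(table):
    """Sort by start address, then by prefix length (shorter first)."""
    return sorted(table.items(), key=lambda item: item[1])

def linear_lpm(key, ordered):
    """Scan left to right; the rightmost match seen is the longest prefix match."""
    best = None
    for name, prefix in ordered:
        if prefix[0] > key:  # start address exceeds the key: stop scanning
            break
        if matches(key, prefix):
            best = name
    return best

ordered = sort_prefixes(TABLE_1)
print(linear_lpm(171, ordered))  # prints Q
```

For the search key 171, the scan stops at the first prefix whose start address is 192, and the rightmost match encountered before that point is Q:168/6.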
This search operation is an iterative process, with each search prefix being compared in sequence with the applied search key. As illustrated by FIG. 2, this process can also be implemented in a hardware-based array 20, by simultaneously comparing the applied search key (e.g., 171) to all of the search prefixes within the array 20, using a plurality of comparators 22 that generate match and non-match signals. In particular, each match between the applied search key and a search prefix results in the generation of a match signal (M) and each non-match results in the generation of a “less than” signal (LT) or a “greater than” signal (GT). The comparators 22 may generate these signals as two-bit binary signals (e.g., M=11b, LT=01b, and GT=10b). The longest prefix match is represented by the search prefix associated with the rightmost match signal M, which in FIG. 2 is represented by the search prefix Q:168/6. This longest prefix match may be identified using a priority encoder (not shown) that is configured to receive the signals generated by the comparators 22.
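The parallel comparison may be modeled in software as follows. This is only a behavioral sketch under stated assumptions: the text does not say which operand the LT and GT signals refer to, so the sketch assumes LT means the prefix range lies below the search key and GT means it lies above. The names `compare` and `priority_encode` are hypothetical, and the array lists the TABLE 1 prefixes in the sorted order described earlier.

```python
M, LT, GT = 0b11, 0b01, 0b10  # two-bit signal encodings from the text

# TABLE 1 prefixes in sorted order: (name, start address, prefix length)
ARRAY = [
    ("A", 0, 0), ("B", 0, 1), ("C", 0, 2), ("D", 0, 3), ("E", 0, 4),
    ("T", 0, 5), ("V", 64, 2), ("O", 96, 3), ("P", 112, 4), ("S", 120, 5),
    ("L", 128, 1), ("J", 128, 2), ("F", 144, 4), ("N", 160, 4),
    ("Q", 168, 6), ("R", 170, 8), ("U", 192, 2), ("G", 192, 3),
    ("K", 208, 5), ("H", 224, 3), ("I", 240, 4), ("M", 248, 5),
]

def compare(key, start, length, width=8):
    """Model of one comparator: M, LT or GT for a single prefix."""
    end = start + (1 << (width - length)) - 1  # last address in the prefix range
    if key < start:
        return GT  # assumption: prefix range is greater than the key
    if key > end:
        return LT  # assumption: prefix range is less than the key
    return M

def priority_encode(signals):
    """Return the index of the rightmost match signal, or None if no match."""
    for i in range(len(signals) - 1, -1, -1):
        if signals[i] == M:
            return i
    return None

# In hardware all comparisons occur at once; here they are simply enumerated.
signals = [compare(171, start, length) for _, start, length in ARRAY]
winner = priority_encode(signals)
print(ARRAY[winner][0])  # prints Q
```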
Conventional network routing applications may also utilize tree data structures to support search operations within an integrated circuit device. These tree data structures may include b-tree structures that are kept balanced to prevent one or more branches of the tree from becoming longer than other branches of the tree and thereby increasing search latency. FIG. 3 illustrates a three-level b-tree data structure 30 containing the search prefixes of TABLE 1 and the array 20 of FIG. 2. This b-tree 30 is illustrated as including six leaf nodes at Level 2 (i.e., Nodes 2-0, 2-1, 2-2, 2-4, 2-5 and 2-6), two intermediate nodes at Level 1 (Node 1-0 and 1-1) and a root node at Level 0 (Node 0-0).
As illustrated by the highlighted search path, a search of the b-tree using 171 as a search key begins at Node 0-0. The search prefix J at Node 0-0 represents a match with the search key 171 because 171 (i.e., 10101011b) is a match with 128/2 (i.e., 10XXXXXX), where X represents a “don't-care” value. The search then proceeds to Node 1-1 (i.e., along a right-side branch from Node 0-0 to Node 1-1) because 171 is greater than 128. No matches are present at Node 1-1 because the search key 171 (i.e., 10101011b) does not match either the search prefix R:170/8 (10101010b) or the search prefix H:224/3 (i.e., 111XXXXX). Because the search key 171 is greater than 170 and less than 224, the search then proceeds to and terminates at Node 2-5, which is a leaf node of the b-tree 30. None of the search prefixes U:192/2, G:192/3 or K:208/5 at Node 2-5 represent a match with the search key 171. Thus, based on the illustrated search path, which traverses Nodes 0-0, 1-1 and 2-5 of the b-tree 30, only search prefix J:128/2 represents a matching entry with the search key 171. However, as illustrated best by FIG. 2, the search prefix Q:168/6, which resides at Node 2-4 of FIG. 3, actually represents the longest prefix match with the search key 171, yet this search prefix was not within the search path and was not detected during the search operation. Moreover, the search prefixes A:0/0, L:128/1 and N:160/4 also represent matches that were not within the search path. This means that the conventional sorting of prefixes within the b-tree 30 of FIG. 3 will not yield correct results for all search keys. To address this limitation associated with the b-tree 30 of FIG. 3, span prefix masks have been used to support accurate longest prefix match search operations. These masks are described more fully in commonly assigned U.S. application Ser. No. 11/184,243, filed Jul. 19, 2005, the disclosure of which is hereby incorporated herein by reference.
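The missed match can be checked with a short sketch that uses only the prefixes named in the preceding paragraph. The names `SEARCH_PATH` and `OFF_PATH` are hypothetical groupings introduced for illustration; the full contents of the b-tree 30 are not reproduced here.

```python
def matches(key, start, length, width=8):
    """True if the top `length` bits of key equal those of the prefix."""
    mask = ((1 << width) - 1) & ~((1 << (width - length)) - 1)
    return (key & mask) == (start & mask)

# Prefixes at the nodes visited for search key 171, per the text
SEARCH_PATH = {
    "0-0": [("J", 128, 2)],
    "1-1": [("R", 170, 8), ("H", 224, 3)],
    "2-5": [("U", 192, 2), ("G", 192, 3), ("K", 208, 5)],
}
# Matching prefixes that the text identifies as lying outside the search path
OFF_PATH = [("A", 0, 0), ("L", 128, 1), ("N", 160, 4), ("Q", 168, 6)]

key = 171
found = [name for node in SEARCH_PATH.values()
         for name, start, length in node if matches(key, start, length)]
missed = [(name, length) for name, start, length in OFF_PATH
          if matches(key, start, length)]
longest_missed = max(missed, key=lambda item: item[1])[0]

print(found)           # prints ['J']
print(longest_missed)  # prints Q
```

The path yields only J:128/2 (prefix length 2), whereas Q:168/6 (prefix length 6) is a longer match that the traversal never visits.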
Another example of a b-tree data structure is described in U.S. Pat. No. 6,490,592, which is assigned to Nortel Networks Limited. As described at Col. 1 of the '592 patent, conventional b-tree data structures may not be well suited for search operations that require identification of longest prefix matches (LPMs) within the b-tree data structure. To address this limitation, the '592 patent describes a modified b-tree data structure that is arranged so that data elements stored therein, which have no overlapping prefixes, are arranged in a standard b-tree structure. However, other data elements that have overlapping prefixes are arranged in a modified structure so that the prefix of such a data element contains the prefixes of all such data elements that succeed it in the b-tree. This modified structure is referred to as an L-structure. FIG. 3 of the '592 patent shows portions 300 and 340 that comprise a b-tree into which an L-structure 320 is inserted. Unfortunately, the use of L-structures within a b-tree may represent a form of prefix nesting that reduces a likelihood of achieving preferred b-tree properties that can reduce search latency and result in efficient utilization of memory space. In particular, for a fixed memory capacity and latency, which is related to tree height, the number of search prefixes that can be supported within the b-tree of the '592 patent is statistically dependent on the degree of nesting within the prefix data set supported by the b-tree. Accordingly, prefix data sets that require a high degree of nesting may result in an inefficient utilization of the memory space that is required to maintain the b-tree.
An additional type of b-tree data structure includes a b*tree data structure, which can require non-root nodes to be at least ⅔ full at all times. To maintain this fill requirement, a sibling node is not immediately split whenever it is full. Instead, keys are first shared between sibling nodes before node splitting is performed. Only when all sibling nodes within a group are full does a node splitting operation occur upon insertion of a new search key. FIG. 4 illustrates a conventional three-level b*tree data structure of ¾ efficiency (i.e., N/(N+1)=3/4), having four key locations per node (i.e., M=4). These three levels are illustrated as L0, L1 and L2, where L0 is treated as the root level and L2 is treated as a leaf level. Level L1 is an intermediate level, which is a child relative to the root level and a parent relative to the leaf level. As will be understood by those skilled in the art, a b*tree of type N-(N+1) (i.e., 2-3, 3-4, 4-5, . . . ) requires all non-root nodes to be between N/(N+1) and 100% capacity (i.e., 67%, 75%, 80%, . . . up to 100%) before and after an insert or delete operation has been fully performed.
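The fill requirement reduces to simple arithmetic, sketched below. The function names `min_fill` and `min_keys` are hypothetical; M denotes the number of key locations per node and N the b*tree type parameter, as in the text.

```python
import math
from fractions import Fraction

def min_fill(n):
    """Minimum fill fraction N/(N+1) for an N-(N+1) type b*tree."""
    return Fraction(n, n + 1)

def min_keys(m, n):
    """Fewest keys a non-root node with m key locations may hold."""
    return math.ceil(m * min_fill(n))

for n in (2, 3, 4):
    print(f"{n}-{n + 1} b*tree: {round(float(min_fill(n)) * 100)}% minimum fill")

print(min_keys(4, 3))  # prints 3
```

For the 3-4 type b*tree of FIG. 4 with M=4, each non-root node therefore holds either three or four keys.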
FIG. 5A illustrates a portion of a b*tree with excess capacity having three sibling nodes at a leaf level and a parent node at the root level containing the search keys A-K, which represent numeric search key values. The leftmost sibling node contains the search keys A, B and C, the middle sibling node contains the search keys E, F and G and the rightmost sibling node contains the search keys I, J and K. The parent node contains the search keys D and H. These sibling nodes are at 75% capacity, which meets the requirement that all non-root nodes be between N/(N+1) and 100% capacity for a 3-4 type b*tree, where N=3. As illustrated by FIG. 5B, an insertion of the key L into the b*tree of FIG. 5A increases the rightmost sibling node to full capacity without affecting the other two sibling nodes. The additional insertion of key M into the rightmost sibling node in the b*tree of FIG. 5B causes the transfer of key I to the parent node and the transfer of key H from the parent node to the middle sibling node, as illustrated by FIG. 5C.
FIG. 5D illustrates the further insertion of key N into the rightmost sibling node, which causes an overflow that ripples through the parent and middle sibling nodes into the leftmost sibling node, which is now at full capacity. In FIG. 5E, a split between the sibling nodes and an increase in population of the parent node occurs in response to the further insertion of key O into the rightmost sibling node. This split from three to four sibling nodes is necessary to maintain a capacity of all non-root nodes in a range from 75% to 100% capacity, for N=3.
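The key sharing of FIGS. 5B-5D can be traced with the following sketch. It is an illustrative model only, not a description of any hardware: the helper names are hypothetical, single letters stand in for numeric key values, and the policy of preferring the nearest left sibling with spare room is an assumption chosen to reproduce the figures as described. The sketch only detects the condition of FIG. 5E (all siblings full) and leaves the split itself to the caller.

```python
import bisect

CAPACITY = 4  # key locations per node (M = 4)

def shift_left(leaves, parent, i):
    """Move the smallest key of leaf i up to the parent, and the displaced
    parent key down into leaf i-1."""
    leaves[i - 1].append(parent[i - 1])
    parent[i - 1] = leaves[i].pop(0)

def shift_right(leaves, parent, i):
    """Move the largest key of leaf i up to the parent, and the displaced
    parent key down into leaf i+1."""
    leaves[i + 1].insert(0, parent[i])
    parent[i] = leaves[i].pop()

def insert(leaves, parent, key):
    """Insert key into a sibling group; return False if a split is required."""
    if all(len(leaf) == CAPACITY for leaf in leaves):
        return False  # no sibling has room: the group must be split
    i = bisect.bisect_left(parent, key)  # leaf that should receive the key
    bisect.insort(leaves[i], key)
    if len(leaves[i]) > CAPACITY:
        left = [j for j in range(i) if len(leaves[j]) < CAPACITY]
        if left:  # ripple the overflow toward the nearest left sibling with room
            for j in range(i, left[-1], -1):
                shift_left(leaves, parent, j)
        else:     # otherwise ripple toward the nearest right sibling with room
            right = next(j for j in range(i + 1, len(leaves))
                         if len(leaves[j]) < CAPACITY)
            for j in range(i, right):
                shift_right(leaves, parent, j)
    return True

# Starting state of FIG. 5A
leaves = [list("ABC"), list("EFG"), list("IJK")]
parent = list("DH")
for key in "LMNO":
    ok = insert(leaves, parent, key)
    print(key, ok, parent, leaves)
```

In this trace, keys L, M and N are absorbed by sharing among the siblings, and key O is rejected because every sibling is full, which corresponds to the split of FIG. 5E.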
FIGS. 6A-6D illustrate three insertion examples that result in the splitting of sibling nodes having no excess capacity. As illustrated by FIG. 6A, the insertion of any additional key (#) into a b*tree with sibling nodes at full capacity results in a split among the sibling nodes and a repopulation of these nodes at equivalent levels (shown at 75%). In FIG. 6B, the insertion of key D+ into the leftmost sibling node results in a split that causes keys D, G and K to move to the parent node (displacing keys E and J) and a grouping of keys D+, E and F together in a sibling node. In FIG. 6C, the insertion of key I+ into the middle sibling node results in a split that causes keys D, H and K to move to the parent node (displacing keys E and J) and a grouping of keys I, I+ and J together in a sibling node. Finally, in FIG. 6D, the insertion of key N+ into the rightmost sibling node results in a split that causes keys D, H and L to move to the parent node (displacing keys E and J) and a grouping of keys M, N and N+ together in a rightmost sibling node. Thus, as illustrated by FIGS. 6B-6D, the value of the search key to be inserted into sibling nodes having no excess capacity influences the nature of the overflow and regrouping of keys during an operation to split the sibling nodes. This means that conventional hardware to perform insert operations may need to account for every possible insert location that may occur amongst the plurality of sibling nodes.
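The dependence of the regrouping on the inserted key value can be seen in a small sketch of the split. It assumes, consistent with the 75% repopulation described above, that the fifteen keys are redistributed evenly over four sibling nodes and three parent keys; the function name `split_full_group` is hypothetical, and string ordering of the labels (e.g., "D" < "D+" < "E") stands in for numeric key order.

```python
def split_full_group(keys, new_key, nodes_after=4):
    """Redistribute a full sibling group plus one new key over nodes_after
    sibling nodes; return (parent_keys, sibling_nodes)."""
    ordered = sorted(keys + [new_key])
    per_node = (len(ordered) - (nodes_after - 1)) // nodes_after
    parent, siblings, pos = [], [], 0
    for i in range(nodes_after):
        siblings.append(ordered[pos:pos + per_node])
        pos += per_node
        if i < nodes_after - 1:
            parent.append(ordered[pos])  # separator key promoted to the parent
            pos += 1
    return parent, siblings

# Three full siblings (A-D, F-I, K-N) plus the parent keys E and J
FULL_GROUP = list("ABCDEFGHIJKLMN")

for new_key in ("D+", "I+", "N+"):
    parent, siblings = split_full_group(FULL_GROUP, new_key)
    print(new_key, parent, siblings)
```

The three insertions yield parent keys D, G, K; D, H, K; and D, H, L respectively, so the keys promoted to the parent differ with the insert location even though the starting tree is identical.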