This invention relates to search engines for searching large tables of data, and particularly to search engines used for perfect matching to keys and addresses.
Lookup procedures are a major source of bottlenecks in high performance compilers and routers. One type of lookup procedure is known as the perfect match lookup technique. Perfect match lookup techniques are used in various compilers, such as compilers used in designing semiconductor and integrated circuit chips, and in networking applications such as Internet address (URL) lookup in high performance data routers where large search tables or databases are employed. Searching large tables or databases requires increased time or hardware requirements, or both, resulting in more expensive systems in terms of increased search delays and larger memories. Problems associated with such lookups increase with the size of the search tables or databases, increases in traffic, and introduction of higher speed links. Moreover, perfect key or address matching is particularly challenging where large tables must be searched for the perfect match of the key or address.
In the past, perfect match lookups were performed using hash tables. The principal disadvantage of the hashing approach is the unpredictability of the delay in performing a seek operation. Typically, longer addresses require more time than shorter addresses, rendering the seek delay unpredictable. Moreover, as the table size increases, the delay increases. While the delay may be minimized by employing larger memory for the hashing operation, the delay is nevertheless unpredictable.
More recently, certain data structures, such as content addressable memory (CAM), have been used because of their capability to handle lookup techniques. A search table containing entries of keys (addresses) and data is used with a mask such that an input key or query operates on the mask to look up the associated key (address) of the sought-for data. While this technique is quite effective, hardware and processing requirements limit expansion of this technique as tables increase in size, or as traffic increases.
Balanced binary search tree architecture has been proposed to establish a predictable delay in connection with the searching operation. One particularly attractive balanced tree architecture is the red-black tree described by T. H. Cormen et al. in "Introduction to Algorithms", published by The MIT Press, McGraw-Hill Book Company, 1989, in which an extra bit is added to each node to "balance" the tree so that no path is more than twice as long as any other. The red-black balanced binary tree architecture is particularly attractive because the worst-case time required for basic dynamic set operations is O(log n), where n is the number of nodes or vertices in the tree. The principal difficulty with balanced tree approaches is that complex rotations and other operations are required to insert or delete an entry to or from the tree to assure the tree complies with the balancing rules after insertion or deletion.
The present invention is directed to a data structure and a sorted binary search tree that inherits the favorable attributes of the balanced binary search tree, but provides simpler solutions for the insertion and deletion functions.
According to one aspect of the present invention, a binary search tree is structured so that keys associated with data are arranged in a predetermined order in the vertices of each level of the tree. The tree has a plurality of levels, with a plurality of vertices in the bottom level and in at least one hierarchy level. A top level contains a root vertex defining an input to the tree. The keys are distributed through the vertices of each level in a predetermined order.
In one form of the tree, the keys are arranged in order of value, and the hierarchy and root vertices contain the one key from each respective child vertex having a minimum value. Thus, the bottom vertices are arranged in an order ascending from 1 to V, where V is an integer equal to the number of bottom vertices, and the keys in the bottom vertices are arranged so that values of the keys in any one bottom vertex are greater than values of the keys in all lower-ordered bottom vertices and are smaller than values of keys in all higher-ordered bottom vertices. The keys in the hierarchy vertices are similarly arranged.
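The ordering described above can be illustrated with a minimal sketch. It assumes, purely for illustration, a two-level tree in which each bottom vertex is a sorted Python list and a single parent vertex holds the minimum key of each child; the names `is_ordered`, `parent_keys`, and `lookup` are hypothetical and do not come from the specification.

```python
from bisect import bisect_right

def is_ordered(bottom):
    """Check the ordering rule: keys are sorted inside each bottom vertex, and
    every key in a vertex is smaller than every key in higher-ordered vertices."""
    inside = all(v == sorted(v) for v in bottom)
    across = all(a[-1] < b[0] for a, b in zip(bottom, bottom[1:]))
    return inside and across

def parent_keys(bottom):
    """Build the parent vertex: the minimum key of each child vertex."""
    return [vertex[0] for vertex in bottom]

def lookup(bottom, key):
    """Perfect-match search: return the index of the bottom vertex holding
    `key`, or None if the key is absent."""
    mins = parent_keys(bottom)
    # The last child whose minimum is <= key is the only possible holder.
    i = bisect_right(mins, key) - 1
    if i < 0:
        return None
    return i if key in bottom[i] else None

# Illustrative data only (bottom vertices 1..V with V = 3).
bottom = [[3, 5, 8], [11, 14], [20, 21, 27]]
```

Because the parent holds only child minima, one comparison pass at the parent level selects the single bottom vertex that can contain the key.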
Another aspect of the invention resides in a process for altering the binary search tree. The number of keys in at least one bottom level vertex is altered, such as by deleting or inserting a key. The number of keys remaining in the altered bottom level vertex is identified. If the number of remaining keys is less than k, where k is an integer ≥2, such as where a key was deleted, the location of keys among the bottom vertices is adjusted until all bottom vertices contain no less than k keys. If the number of remaining keys is greater than 2k−1, such as where a key was inserted, a key is transferred from the adjusted bottom vertex to another bottom vertex until all bottom vertices contain no more than 2k−1 keys.
Where a key is deleted from a vertex to leave less than k keys, a neighboring bottom vertex is identified that contains more than k keys and a key is transferred from the identified neighboring bottom vertex to the adjusted bottom vertex. If no bottom vertex is identified as containing more than k keys, the keys remaining in the adjusted bottom vertex are transferred to at least one neighboring bottom vertex so that each neighboring bottom vertex contains no more than 2k−1 keys. The bottom vertex from which the key was deleted is then itself deleted. Similarly, the locations of keys among the hierarchy vertices are adjusted until all hierarchy vertices contain no less than K keys, where K is an integer ≥2. While K or k is constant for all vertices in a given level, they may be different for vertices of different levels.
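The deletion procedure can be sketched as follows, for the bottom level only. This is a simplified illustration under stated assumptions, not the claimed implementation: bottom vertices are modeled as sorted Python lists, only the immediate left and right neighbors are examined, and the function name `delete_key` is hypothetical.

```python
def delete_key(bottom, key, k):
    """Delete `key` from the bottom level and restore the rule that every
    bottom vertex holds at least k keys. Returns False if the key is absent."""
    for i, vertex in enumerate(bottom):
        if key in vertex:
            break
    else:
        return False
    vertex.remove(key)
    # Nothing to repair if the vertex is still full enough, or if it is the
    # only bottom vertex (assumption: a lone vertex may hold fewer than k).
    if len(vertex) >= k or len(bottom) == 1:
        return True
    left = bottom[i - 1] if i > 0 else None
    right = bottom[i + 1] if i + 1 < len(bottom) else None
    if left is not None and len(left) > k:
        # Borrow the largest key of the left neighbor.
        vertex.insert(0, left.pop())
    elif right is not None and len(right) > k:
        # Borrow the smallest key of the right neighbor.
        vertex.append(right.pop(0))
    else:
        # No neighbor can spare a key: move the k-1 remaining keys into a
        # neighbor holding exactly k keys (2k-1 in total), then drop the vertex.
        if left is not None:
            left.extend(vertex)
        else:
            right[0:0] = vertex
        del bottom[i]
    return True

# Illustrative run with k = 2.
tree = [[1, 2], [3, 4, 5], [6, 7]]
delete_key(tree, 1, 2)   # borrows 3 from the right neighbor
delete_key(tree, 2, 2)   # merges the remaining key into the right neighbor
```

Borrowing always takes the key adjacent in value to the depleted vertex, so the ordering of keys across bottom vertices is preserved without any rotation.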
Where a key is inserted into a bottom vertex causing the receiving vertex to contain more than 2k−1 keys, a neighboring bottom vertex is identified that contains less than 2k−1 keys and a key is transferred to the neighboring bottom vertex from the bottom vertex containing the inserted key. If all neighboring bottom vertices contain 2k−1 keys, a new bottom vertex is created and keys are transferred to the new bottom vertex from the bottom vertex containing the inserted key until the number of keys in the bottom vertices is between k and 2k−1. A similar process is employed to add new hierarchy vertices.
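The insertion procedure admits a similar bottom-level sketch. As before, this is an illustration under assumptions rather than the claimed implementation: bottom vertices are sorted Python lists, keys are assumed unique, only immediate neighbors are examined, and the name `insert_key` is hypothetical.

```python
from bisect import bisect_right, insort

def insert_key(bottom, key, k):
    """Insert `key` into the bottom level and restore the rule that every
    bottom vertex holds at most 2k-1 keys."""
    cap = 2 * k - 1
    if not bottom:
        bottom.append([key])
        return
    # Pick the last vertex whose minimum is <= key (or the first vertex).
    mins = [v[0] for v in bottom]
    i = max(bisect_right(mins, key) - 1, 0)
    vertex = bottom[i]
    if key in vertex:
        return  # assumption: duplicate keys are ignored
    insort(vertex, key)
    if len(vertex) <= cap:
        return
    left = bottom[i - 1] if i > 0 else None
    right = bottom[i + 1] if i + 1 < len(bottom) else None
    if right is not None and len(right) < cap:
        # Hand the largest key to the right neighbor.
        right.insert(0, vertex.pop())
    elif left is not None and len(left) < cap:
        # Hand the smallest key to the left neighbor.
        left.append(vertex.pop(0))
    else:
        # Both neighbors are full: split the 2k keys into two vertices of k.
        new_vertex = vertex[k:]
        del vertex[k:]
        bottom.insert(i + 1, new_vertex)

# Illustrative run with k = 2 (at most 3 keys per vertex).
tree = [[1, 2, 3], [5, 6, 7]]
insert_key(tree, 4, 2)   # neighbors full: a new vertex is created
insert_key(tree, 8, 2)   # overflow key moves to the left neighbor
```

Splitting an overfull vertex of 2k keys yields two vertices of exactly k keys each, so both halves satisfy the k to 2k−1 bound immediately.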
According to another aspect of the invention, a computer useable medium contains a computer readable program comprising code that defines the structured binary search tree and causes the computer to reconstruct the tree upon insertion and deletion of keys.