1. Technical Field
The invention relates generally to PATRICIA tries. More specifically, the invention relates to the addition of special annotation nodes to the tries.
2. Discussion of the Prior Art
The Practical Algorithm To Retrieve Information Coded In Alphanumeric, or PATRICIA, is a trie introduced by D. R. Morrison in 1968. It is well known in the industry as a compact indexing structure and is commonly used in databases as well as in networking technologies. In a PATRICIA implementation, trie nodes that have only one child are eliminated. Each remaining node is labeled with a character position number that indicates the node's depth in the uncompressed trie. FIG. 1 shows an example of such an implementation of a PATRICIA trie for an alphabetical case. The words to be stored are “greenbeans”, “greentea”, “grass”, “corn”, and “cow”. The first three words differ from the last two words in the first letter, i.e. three begin with the letter “g” while the other two begin with the letter “c”. Hence, there is a difference at the first position, and therefore a node at depth ‘0’ separates the “g” words from the “c” words.
Moving down the “g” side, the next difference is found in the third position, where two words have an “e” while one word has an “a”. Therefore, the node at that level indicates a depth of ‘2’.
Continuing down the left path, the next differing letter is found at the sixth position, where one word has a “b” while the other has a “t”. Therefore, there is a node at depth ‘5’.
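The walk-through above can be illustrated with a minimal sketch that reproduces the node depths for the five example words. The names `Leaf`, `Node`, `first_difference`, and `build` are hypothetical and chosen only for illustration, and the recursive partitioning shown here is one possible way to arrive at the structure of FIG. 1, not necessarily how any particular implementation constructs it.

```python
from dataclasses import dataclass, field

END = ""  # hypothetical marker for a key that ends before the branching position


@dataclass
class Leaf:
    key: str  # the full key is kept in the leaf


@dataclass
class Node:
    depth: int  # zero-based character position at which the keys below diverge
    children: dict = field(default_factory=dict)  # branching character -> subtree


def first_difference(keys):
    """Return the first character position at which the keys do not all agree."""
    pos = 0
    while True:
        symbols = {k[pos] if pos < len(k) else END for k in keys}
        if len(symbols) > 1:
            return pos
        pos += 1


def build(keys):
    """Build a trie for a non-empty collection of keys; single-child nodes never arise."""
    keys = sorted(set(keys))
    if len(keys) == 1:
        return Leaf(keys[0])
    depth = first_difference(keys)
    groups = {}
    for k in keys:
        symbol = k[depth] if depth < len(k) else END
        groups.setdefault(symbol, []).append(k)
    return Node(depth, {sym: build(group) for sym, group in groups.items()})


root = build(["greenbeans", "greentea", "grass", "corn", "cow"])
print(root.depth)                              # depth of the node separating "g" from "c"
print(root.children["g"].depth)                # depth of the node separating "e" from "a"
print(root.children["g"].children["e"].depth)  # depth of the node separating "b" from "t"
```

In this sketch, the positions at which all keys in a group agree are skipped entirely, which is the effect of eliminating single-child nodes.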
A drawback of this implementation is that keys are no longer uniquely specified by the search path, because the characters at the skipped positions are never examined during a search, and hence the key itself has to be stored in the appropriate leaf. An advantage of this PATRICIA implementation is that only about s*n bits of storage are required, where s is the size of the alphabet and n is the number of leaves.
An alphabet is a group of symbols, where the size of an alphabet is determined by the number of symbols in the group. That is, an alphabet with s=2 is a binary alphabet having only two symbols, for example ‘0’ and ‘1’. FIG. 2 shows an implementation for such an alphabet. A binary alphabet makes it possible to overcome the restriction of storing only string values in a trie because other data types may be represented as a string of bits.
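As a brief illustration of how other data types may be represented as a string of bits, the following sketch encodes an unsigned integer and a text string over the binary alphabet. The function names `int_to_bits` and `text_to_bits`, the fixed integer width, and the choice of UTF-8 are assumptions made for this example only.

```python
def int_to_bits(value: int, width: int = 16) -> str:
    """Encode a non-negative integer as a fixed-width string of '0' and '1'."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the requested width")
    return format(value, f"0{width}b")


def text_to_bits(text: str) -> str:
    """Encode text as the concatenated 8-bit patterns of its UTF-8 bytes."""
    return "".join(format(byte, "08b") for byte in text.encode("utf-8"))


print(int_to_bits(5, 8))
print(text_to_bits("a"))
```

With a fixed width, the bit strings of unsigned integers compare in the same order as the integers themselves, which is one reason such an encoding is convenient for keys.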
A PATRICIA trie is either a leaf L(k) containing a key k or a node N(d, l, r) containing a bit offset d≧0 along with a left sub-tree l and a right sub-tree r. This is a recursive description of the nodes of a PATRICIA trie, and leaves descending from a node N(d, l, r) must agree on the first d−1 bits. A description of PATRICIA tries may be found in Bumbulis, Bowman, A Compact B-Tree, Proceedings of the 2002 ACM SIGMOD international conference on management of data, pages 533-541, which document is incorporated herein in its entirety.
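The recursive description above can be sketched as two small types and a lookup routine. This is a minimal illustration under stated assumptions: keys are distinct bit strings of equal length, the bit ‘0’ selects the left sub-tree and ‘1’ the right, and the helper names `build` and `contains` are hypothetical. The lookup also shows why the key must be stored in the leaf: bits at skipped offsets are never tested on the way down, so a final comparison is needed.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class L:
    key: str  # the full key, as a string of '0' and '1'


@dataclass
class N:
    d: int      # zero-based bit offset tested at this node
    l: "Trie"   # sub-tree for keys whose bit at offset d is '0'
    r: "Trie"   # sub-tree for keys whose bit at offset d is '1'


Trie = Union[L, N]


def build(keys):
    """Build a trie from a non-empty collection of equal-length bit strings."""
    keys = sorted(set(keys))
    if len(keys) == 1:
        return L(keys[0])
    # First offset at which the keys in this group do not all agree.
    d = next(i for i in range(len(keys[0])) if len({k[i] for k in keys}) > 1)
    zeros = [k for k in keys if k[d] == "0"]
    ones = [k for k in keys if k[d] == "1"]
    return N(d, build(zeros), build(ones))


def contains(trie, key):
    """Follow the tested bits down to a leaf, then compare against the stored key."""
    while isinstance(trie, N):
        if trie.d >= len(key):
            return False
        trie = trie.l if key[trie.d] == "0" else trie.r
    return trie.key == key


trie = build(["0010", "0011", "1000", "1011"])
print(contains(trie, "0011"))  # a stored key
print(contains(trie, "0111"))  # reaches the same leaf as "0011" but fails the comparison
```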
Using the PATRICIA trie architecture, a block of references may be prepared that points to the data stored in permanent storage, for example disk-based data tables. A prefix-based PATRICIA trie structure may be used for expressing hierarchical relations between the data elements. Given sample hierarchies A→B→C, A→B→D, E→F→G, and E→F→H, one may construct compound keys by concatenating the keys of the respective elements: A∥B∥C, A∥B∥D, E∥F∥G, and E∥F∥H. Inserting the resulting compound keys in a PATRICIA trie produces a trie that reflects the hierarchy of the original data elements, as depicted in FIG. 3.
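The construction of compound keys can be sketched as follows. The sketch assumes, for illustration only, that each element key is a single character so that plain concatenation is unambiguous; the names `compound_key` and `keys_under` are hypothetical. It shows that all keys belonging to one branch of the hierarchy share a common prefix, which is the property a prefix-based trie exposes.

```python
def compound_key(path):
    """Concatenate the element keys along one path of the hierarchy."""
    return "".join(path)


def keys_under(prefix_path, keys):
    """Return the compound keys that descend from the given partial path."""
    prefix = compound_key(prefix_path)
    return [k for k in keys if k.startswith(prefix)]


hierarchies = [("A", "B", "C"), ("A", "B", "D"), ("E", "F", "G"), ("E", "F", "H")]
keys = sorted(compound_key(path) for path in hierarchies)

print(keys)
print(keys_under(("A", "B"), keys))  # the keys grouped under A -> B
print(keys_under(("E",), keys))      # the keys grouped under E
```

If element keys had variable length, a separator or fixed-width encoding would be needed to keep the concatenation unambiguous; that choice is outside what the text above specifies.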
It would therefore be beneficial to take advantage of the inherent hierarchical nature of the PATRICIA trie and to provide a method and apparatus for associating annotations of a particular value or function with a group of keys that belong to a hierarchy.