Computer processors and associated memory components continue to increase in speed. As hardware approaches physical speed limitations, however, other methods for generating appreciable decreases in data access times are required. One method is via effective data management, achieved by the appropriate choice of data structure and related storage and retrieval algorithms. For example, various prior art data structures and related storage and retrieval algorithms have been developed for data management including arrays, hashing, binary trees, AVL trees (height-balanced binary trees), b-trees, and skiplists. An enhanced storage structure is further described in pending U.S. patent application Ser. No. 09/457,164, filed Dec. 8, 1999, entitled “A Fast, Efficient, Adaptive, Hybrid Tree”, assigned in common with the instant application and incorporated herein by reference in its entirety.
While such data structures accommodate storage, searching and retrieval of data, they do not readily support other operations including, for example, providing a count of a number of data entries or keys (e.g., indices) falling within selected ranges or “expanses” of the structure. Nor do such constructs readily support identification of keys or indices based on their ordinal value among the stored data, or the retrieval of data based on an ordinal value or range of ordinal values of their associated keys or indices. Instead, these prior art structures require partial or complete traversal of the data to provide a count of values satisfying specified ordinal criteria. Accordingly, a need exists for a data structure which supports identification of keys, indices, and data based on ordinal values within a set and further provides a count of values based on ranges of key (or index) values.
The present invention is a data structure and related data storage and retrieval method that rapidly provides a count of elements stored or referenced by a hierarchical structure of ordered elements (e.g., a tree), access to elements based on their ordinal value in the structure, and identification of the ordinality of elements. In an ordered tree implementation of the invention, a count of elements stored in each subtree is stored, i.e., the cardinality of each subtree is stored either at or associated with a higher level node pointing to that subtree or at or associated with the head node of the subtree. In addition to data structure specific requirements (e.g., creation of a new node, reassignment of pointers, balancing, etc.), data insertion and deletion include steps of updating affected counts. Elements may be target data itself (e.g., data samples, prime numbers); keys or indices associated with target data (e.g., social security numbers of employees, product numbers and codes, etc., used to reference associated data records); or internal memory pointers to keys and/or data stored outside the data structure. While the invention is applicable to varied hierarchical storage structures including, for example, binary trees, AVL trees (height-balanced binary trees), b-trees, etc. (population-based structures) and digital trees (i.e., tries, which are expanse-based structures), a preferred embodiment of the invention incorporates a hybrid tree structure as described and set forth in the above-referenced U.S. patent application.
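The count-keeping described above can be illustrated with a minimal sketch. It assumes a plain (unbalanced) binary search tree of unique integer keys rather than the preferred hybrid tree, and stores each subtree's cardinality at the head node of that subtree. The names `Node`, `size`, `contains`, `insert`, and `select` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    count: int = 1  # cardinality of the subtree headed by this node


def size(node: Optional[Node]) -> int:
    """Number of keys in a subtree, read from the stored count."""
    return node.count if node else 0


def contains(node: Optional[Node], key: int) -> bool:
    while node:
        if key == node.key:
            return True
        node = node.left if key < node.key else node.right
    return False


def insert(root: Optional[Node], key: int) -> Node:
    """Insert a unique key and update every affected count on the path."""
    if contains(root, key):
        return root  # keys are unique, so no count changes
    new = Node(key)
    if root is None:
        return new
    node = root
    while True:
        node.count += 1  # the new key lands somewhere below this node
        if key < node.key:
            if node.left is None:
                node.left = new
                break
            node = node.left
        else:
            if node.right is None:
                node.right = new
                break
            node = node.right
    return root


def select(node: Optional[Node], ordinal: int) -> int:
    """Return the key with the given 1-based ordinal, skipping whole subtrees."""
    while node:
        left = size(node.left)
        if ordinal <= left:
            node = node.left
        elif ordinal == left + 1:
            return node.key
        else:
            ordinal -= left + 1  # skip the left subtree and this node
            node = node.right
    raise IndexError("ordinal out of range")
```

Because each step of `select` consults a stored count instead of visiting the keys it skips, the work is proportional to the height of the tree rather than to the number of stored keys. Deletion would decrement the same counts along the path; it is omitted here for brevity.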
According to an aspect of the invention, a computer memory is configured to store data for access by an application program being executed on a data processing system. Stored in memory is a hierarchical data structure, the data structure storing an ordered set of keys. The structure includes a root node and a plurality of first level data structures, a subset of the ordered set of keys uniquely associated with respective ones of the first level data structures. Each of the first level data structures has associated therewith a count value representing a number of entries of an associated one of the subsets. The entries may correspond to the keys, particularly in those structures wherein keys must be unique.
According to an aspect of the invention, the hierarchical data structure may be a digital tree, or “trie”, or a similar “expanse” based data storage structure. Alternatively, according to a feature of the invention, the hierarchical data structure may be a “population” based structure, such as a b-tree or one of the various types of binary trees.
According to another feature of the invention, the count values are stored in memory in association with the root node, the root node including addresses of each of the first level data structures. Each of the addresses may be stored in memory in association with the root node as a pointer originating at the root node and terminating at a respective one of the first level data structures.
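As a rough illustration of a root node that holds both the references to its first level data structures and their count values, the following sketch assumes an expanse-based layout in which 8-bit integer keys are divided among four first level structures by their two high-order bits. The class name `Root`, the constants `KEY_BITS`, `FANOUT`, and `SHIFT`, and the use of sorted Python lists as first level structures are all assumptions made for the example.

```python
from bisect import bisect_left

KEY_BITS = 8               # assumed key width
FANOUT = 4                 # assumed number of first level data structures
SHIFT = KEY_BITS - 2       # two high-order bits select the first level structure


class Root:
    def __init__(self) -> None:
        # One reference per first level structure (here, a sorted list of keys).
        self.children = [[] for _ in range(FANOUT)]
        # Count value stored at the root in association with each reference.
        self.counts = [0] * FANOUT

    def insert(self, key: int) -> bool:
        """Store a unique key and update the count kept beside its reference."""
        if not 0 <= key < (1 << KEY_BITS):
            raise ValueError("key outside the assumed 8-bit expanse")
        slot = key >> SHIFT
        leaf = self.children[slot]
        pos = bisect_left(leaf, key)
        if pos < len(leaf) and leaf[pos] == key:
            return False  # already present; counts unchanged
        leaf.insert(pos, key)
        self.counts[slot] += 1
        return True

    def total(self) -> int:
        """Population of the whole structure, read from the root alone."""
        return sum(self.counts)

    def count_in_expanses(self, first_slot: int, last_slot: int) -> int:
        """Keys falling in a run of whole expanses, without visiting any key."""
        return sum(self.counts[first_slot:last_slot + 1])
```

In this arrangement, a count over whole expanses touches only the root node; the first level structures themselves are never traversed.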
According to another feature of the invention, each of the first level structures further includes a plurality of directors (e.g., pointers or directed edges) to respective second level data structures and/or nodes. The first level data structures may further include interior nodes referencing other nodes and leaf nodes containing or referencing the keys.
According to another aspect of the invention, a computer memory for storing data for access by an application program being executed on a data processing system includes a hierarchical data structure stored in memory. The data structure stores an ordered set of keys and includes a head node addressing each of a first plurality of first level data structures. Each of the first level data structures, in turn, addresses respective second level data structures. First level nodes of the ordered set of keys are uniquely associated with respective ones of the first level data structures while second level nodes are uniquely associated with respective ones of the second level data structures. Each of the first and second level data structures has associated therewith a count representing a number of the keys stored in respective ones of the structures.
According to a feature of the invention, each of the first level nodes includes references to at least two of the second level nodes. Further, the counts may be associated with a number of the keys referenced by respective ones of the references.
According to another feature of the invention, the references include addresses of the second level nodes in the memory. For example, the references may be in the form of pointers to the second level nodes.
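A two-level arrangement of this kind might be sketched as follows. The sketch assumes that second level nodes hold sorted keys directly, that each first level node keeps one count per reference to a second level node, and that the head node keeps one count per first level node. The names `Leaf`, `Branch`, `Head`, `build`, `pick`, and `select` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Leaf:                 # second level node holding keys in order
    keys: List[int]


@dataclass
class Branch:               # first level node
    children: List[Leaf]    # references to second level nodes
    counts: List[int]       # counts[i] is the number of keys under children[i]


@dataclass
class Head:                 # head node
    children: List[Branch]  # references to first level nodes
    counts: List[int]       # counts[i] is the number of keys under children[i]


def build(groups: List[List[List[int]]]) -> Head:
    """Build the structure from already ordered groups of keys."""
    branches = []
    for group in groups:
        leaves = [Leaf(list(keys)) for keys in group]
        branches.append(Branch(leaves, [len(leaf.keys) for leaf in leaves]))
    return Head(branches, [sum(branch.counts) for branch in branches])


def pick(counts: List[int], ordinal: int) -> Tuple[int, int]:
    """Find the reference holding a 1-based ordinal by skipping whole counts."""
    for slot, count in enumerate(counts):
        if ordinal <= count:
            return slot, ordinal
        ordinal -= count
    raise IndexError("ordinal exceeds the stored population")


def select(head: Head, ordinal: int) -> int:
    """Retrieve the key having the given 1-based ordinal."""
    if ordinal < 1:
        raise IndexError("ordinals start at 1")
    b, ordinal = pick(head.counts, ordinal)
    branch = head.children[b]
    leaf_slot, ordinal = pick(branch.counts, ordinal)
    return branch.children[leaf_slot].keys[ordinal - 1]
```

For example, with `head = build([[[1, 4], [9]], [[12, 15, 18], [21, 30]]])`, the head node's counts are `[3, 5]`, so a request for the fourth key skips the first branch entirely and descends only into the second.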
According to another aspect of the invention, a method of storing data in a computer memory includes storing ordered sets of keys into a plurality of data structures. Addresses of the data structures are stored in a root node, and counts of the keys stored in each of the data structures are stored in association with each of the addresses.
According to a feature of a method according to the invention, a step of determining an ordinality of one of the keys includes adding at least one of the counts to an ordinality of the key with respect to others of the keys commonly stored in one of the data structures.
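The ordinality computation just described can be sketched as a sum of two parts: the counts of all data structures that precede the one holding the key, plus the key's ordinality among the keys stored with it. The sketch below assumes the same expanse-based layout as a four-way split of 8-bit keys on their two high-order bits; the function name `ordinality` and the constant `SHIFT` are hypothetical.

```python
from bisect import bisect_left
from typing import List

SHIFT = 6  # assumed: 8-bit keys, two high-order bits select the data structure


def ordinality(children: List[List[int]], counts: List[int], key: int) -> int:
    """Return the 1-based ordinality of a stored key."""
    slot = key >> SHIFT
    leaf = children[slot]
    pos = bisect_left(leaf, key)          # position among keys stored together
    if pos == len(leaf) or leaf[pos] != key:
        raise KeyError(key)
    # Counts of the preceding structures stand in for the keys they hold.
    return sum(counts[:slot]) + pos + 1


children = [[3], [65, 70], [130], [200]]
counts = [len(leaf) for leaf in children]
```

With these sample values, `ordinality(children, counts, 70)` adds the count of the first structure (1) to the position of 70 within its own structure (2), giving 3.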
According to another aspect of the invention wherein the data structures include at least one first level data structure referencing a plurality of second level data structures, the method further includes a step of distributing the keys among the plurality of second level data structures and storing in the first level data structure counts of the keys in each of the second level data structures.
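One way the distributing step might look is sketched below. It assumes that the first level data structure covers the keys 64 through 127 of an 8-bit key space and that the next two bits of each key select one of four second level data structures. The names `FirstLevel` and `distribute` and the parameters `shift` and `fanout` are hypothetical and not taken from the source.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class FirstLevel:
    children: List[List[int]]  # references to second level data structures
    counts: List[int]          # counts[i] is the number of keys in children[i]


def distribute(keys: Iterable[int], shift: int = 4, fanout: int = 4) -> FirstLevel:
    """Spread keys over second level structures and record each population."""
    children: List[List[int]] = [[] for _ in range(fanout)]
    for key in sorted(set(keys)):              # keys are kept unique and ordered
        children[(key >> shift) % fanout].append(key)
    counts = [len(child) for child in children]
    return FirstLevel(children, counts)


node = distribute([70, 65, 100, 127, 90])
```

Here `node.children` becomes `[[65, 70], [90], [100], [127]]` and `node.counts` becomes `[2, 1, 1, 1]`, so later count and ordinal queries can be answered from the first level data structure without opening the second level structures.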