Computer systems typically include memory and disk. Generally, accessing data stored in memory is much faster than accessing data stored on disk. Memory, however, is much more expensive than disk. So, for a given computer system, memory will typically be much smaller than disk. Thus, data storage and access algorithms have been developed to manage the interaction between memory and disk.
A b-tree is one type of data structure that is commonly used to store and retrieve records held partly in memory and partly on disk. A b-tree index includes a root node at the top of the tree, leaf nodes at the bottom of the tree, and intermediate nodes between the root and leaf nodes. Records are stored in the leaves of the tree. A record is found by starting at the root and passing down through the intermediate nodes until the leaf holding the record is reached.
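The descent from root to leaf described above can be illustrated with a minimal sketch. The names Node, make_leaf, and search are hypothetical and chosen only for illustration, as is the convention that a key equal to a separator key is found in the child to the right of that separator. The tree is built by hand and held entirely in memory; insertion, node splitting, and disk storage are omitted.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Node:
    # Separator keys for a root/intermediate node, record keys for a leaf.
    keys: list
    # Child nodes; empty for a leaf. A non-leaf has len(keys) + 1 children.
    children: list = field(default_factory=list)
    # Records stored in a leaf, parallel to keys; empty for non-leaf nodes.
    records: list = field(default_factory=list)

    @property
    def is_leaf(self):
        return not self.children


def make_leaf(keys):
    # Hypothetical helper: each record is just a labelled string.
    return Node(keys=keys, records=[f"rec{k}" for k in keys])


def search(root, key):
    """Return (record or None, number of nodes visited)."""
    node, visited = root, 1
    while not node.is_leaf:
        # Assumed convention: child i holds keys k with
        # keys[i-1] <= k < keys[i].
        node = node.children[bisect_right(node.keys, key)]
        visited += 1
    for k, rec in zip(node.keys, node.records):
        if k == key:
            return rec, visited
    return None, visited


# A three-level tree: root, two intermediate nodes, four leaves.
left = Node(keys=[5], children=[make_leaf([1, 3]), make_leaf([5, 7])])
right = Node(keys=[13], children=[make_leaf([9, 11]), make_leaf([13, 15])])
root = Node(keys=[9], children=[left, right])

print(search(root, 7))   # record found after visiting one node per level
print(search(root, 4))   # no such key; the descent still reaches a leaf
```

Every lookup, successful or not, visits one node per level of the tree, which is why the number of levels matters when some of those nodes reside on disk.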
In a typical b-tree application, there can be thousands, millions, or even billions of records. The depth of a b-tree grows with the number of records (logarithmically, for a balanced tree). In a b-tree implementation, the upper portion of the tree may be kept in memory, and traversing each remaining layer of the tree can result in a disk access. A b-tree becomes disk bound when its upper levels no longer fit in available memory. As discussed above, accessing disk is time consuming; a disk has mechanical parts that must move to locate and access data. The overhead can be particularly acute during, for example, backup operations, when the whole filesystem may be traversed.
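The relationship between record count, tree depth, and disk accesses can be made concrete with a rough estimate. The sketch below is an idealized model, not a measurement: it assumes every node is completely full, that each node has the same fanout, and that exactly the top cached_levels layers are held in memory. The function names and the fanout of 100 are hypothetical values chosen for illustration.

```python
def tree_levels(num_records, fanout):
    """Fewest levels able to index num_records, assuming full nodes."""
    levels, capacity = 1, fanout
    while capacity < num_records:
        capacity *= fanout   # each extra level multiplies capacity by fanout
        levels += 1
    return levels


def disk_reads_per_lookup(num_records, fanout, cached_levels):
    """Levels that must be read from disk when the top ones are in memory."""
    return max(0, tree_levels(num_records, fanout) - cached_levels)


# Assumed: fanout of 100 and the top two levels cached in memory.
for n in (10**3, 10**6, 10**9):
    print(n, tree_levels(n, 100), disk_reads_per_lookup(n, 100, 2))
```

Under these assumptions, a thousand records need no disk reads per lookup, a million need one, and a billion need three. A traversal that performs many lookups, such as a backup, multiplies that per-lookup cost accordingly.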
Thus, there is a need for a more efficient data structure and method of data access that requires little memory and few disk accesses.