A database management system controls access to databases containing many records of data. Many applications need to search such data for a record matching a key, and a sorted index built using a tree structure can be used for this purpose. Two of the tree structures commonly used are the B-tree and the digital tree. A general introduction to the B-tree and the digital tree can be found in "The Art of Computer Programming", Knuth, volume 3, "Sorting and Searching", pages 473-499.
In a B-tree, nodes are connected by a series of pointers. Each node holds a series of ordered key numbers, and the nodes at any one level are themselves ordered. Searching starts at the root node at the top of the tree: the search key is compared with the key numbers at the node, and the result selects a pointer to follow to the next lower level in the hierarchy. This is repeated until a leaf node (or descriptor), containing just keys corresponding to individual records, is reached.
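The B-tree search described above can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the names `Node`, `btree_search`, and `root`, the sample keys and records, and the convention that a key equal to a separator is found in the subtree to its right are all assumptions made for the example.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical node layout used only for this sketch."""
    keys: List[int]                                        # ordered key numbers
    children: List["Node"] = field(default_factory=list)   # empty for a leaf
    records: List[str] = field(default_factory=list)       # leaf only, parallel to keys


def btree_search(root: Node, key: int) -> Optional[str]:
    """Descend from the root to a leaf, comparing the key at each node."""
    node = root
    while node.children:
        # Compare the search key with the keys at this node to pick a pointer.
        # Assumed convention: a key equal to a separator lives to its right.
        node = node.children[bisect_right(node.keys, key)]
    # At the leaf, look for an exact match among the record keys.
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node.records[i]
    return None


# A two-level example tree with made-up keys and records.
root = Node(
    keys=[12, 20],
    children=[
        Node(keys=[3, 7], records=["r3", "r7"]),
        Node(keys=[12, 15], records=["r12", "r15"]),
        Node(keys=[20, 31], records=["r20", "r31"]),
    ],
)

print(btree_search(root, 15))  # record stored under key 15
print(btree_search(root, 8))   # no such key
```

Because each node is searched with a binary search and only one pointer is followed per level, the number of nodes visited equals the height of the tree.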
In a digital tree, the key is represented by a sequence of digits which are tested at successively lower-ranking nodes. As in the B-tree, nodes are connected by pointers, but each node has a leading key value, made up of the digits of the key tested at higher-level nodes, and pointers to lower levels corresponding to each possible value of the digit currently being tested. In a B-tree, re-arrangement of the tree may be necessary after insertion or deletion of a node, since all nodes must have between n/2 and n keys associated with them, where n is the order of the B-tree. This re-arrangement may ripple all the way up the tree when a change is made at a leaf node. Re-arrangement in a digital tree is only necessary when an entry at a node is the only entry left at that node, and the re-arrangement is then a very simple one. A digital tree does, however, use more storage for a given set of key-referenced data entries than a B-tree.
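The digital tree lookup described above can be illustrated with a short sketch. It assumes keys are strings of decimal digits (so `RADIX` is 10); the names `DigitalNode`, `insert`, and `search`, and the choice to store a record directly on the node reached by the last digit, are hypothetical choices for the example and are not taken from the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional

RADIX = 10  # assumption: keys are sequences of decimal digits


@dataclass
class DigitalNode:
    """Hypothetical node: leading key value plus one pointer per digit value."""
    prefix: str = ""  # digits already tested at higher-level nodes
    slots: List[Optional["DigitalNode"]] = field(
        default_factory=lambda: [None] * RADIX
    )
    record: Optional[str] = None  # set only where a complete key ends


def insert(root: DigitalNode, key: str, record: str) -> None:
    """Add a key, creating one node per digit where none exists yet."""
    node = root
    for depth, ch in enumerate(key):
        digit = int(ch)
        if node.slots[digit] is None:
            node.slots[digit] = DigitalNode(prefix=key[: depth + 1])
        node = node.slots[digit]
    node.record = record


def search(root: DigitalNode, key: str) -> Optional[str]:
    """Test one digit of the key at each successively lower node."""
    node = root
    for ch in key:
        node = node.slots[int(ch)]
        if node is None:
            return None  # no entry for this digit value
    return node.record


root = DigitalNode()
insert(root, "1234", "record A")
insert(root, "1299", "record B")

print(search(root, "1234"))  # record stored under 1234
print(search(root, "1200"))  # no such key
```

Inserting a key only ever adds nodes along a single path and never moves existing entries. Each node reserves `RADIX` pointer slots whether or not they are used, which is one way to see why a digital tree tends to need more storage than a B-tree for the same entries.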
Concurrent access to the records of data has, in the past, been achieved by a number of techniques, such as locking the database to prevent information from being changed while it is being retrieved, or causing a process to wait until the database is free. However, previous techniques in this area have generally suffered from at least one of the following limitations:
1. Only random access to specific keyed information identified by a full length key was supported (for example using a hash table).
2. Reading and writing tasks had to use some form of locking technique to prevent information from being changed while it was being retrieved. This violates the requirement of read-only storage access for reading processes, and means that a reading process has to have some method of suspending itself to wait for information to become available.
3. The index structure was similar to a list or unbalanced binary tree and could only provide efficient access to small amounts of data, because the overheads increased approximately linearly with the number of entries in the index.