1. Technical Field
The present disclosure relates to b-trees and more specifically to concurrent access of b-trees.
2. Introduction
Computers are relied upon to store and provide access to large amounts of data. Accordingly, accessing that data faster and more efficiently is an ongoing goal of modern developers, and a variety of data structures have been developed toward that end.
One data structure that has commonly been used to manage data is a binary tree. Binary trees store data in nodes connected in a tree structure. Each tree begins with a single root node, which stores a single data element and can have no more than two child nodes. The child nodes are commonly referred to as the left child and the right child. Each child node can likewise store one data element and have up to two child nodes of its own. Data is stored in a binary tree using the value in each node as a key. For example, if a binary tree holds integers as values, the tree is organized such that every integer stored to the left of a node is smaller than the integer in that node, and every integer stored to the right of the node is larger. In this way, the contents of each node can be used as a key to quickly traverse the tree and find data.
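By way of illustration only, the ordering described above can be sketched as follows. The names Node, insert, and contains are hypothetical and chosen for this example; the sketch ignores duplicate values and makes no attempt to keep the tree balanced.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One binary-tree node: a single value and up to two children."""
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], value: int) -> Node:
    """Insert value, keeping smaller values to the left and larger to the right."""
    if root is None:
        return Node(value)
    if value < root.value:
        root.left = insert(root.left, value)
    elif value > root.value:
        root.right = insert(root.right, value)
    return root  # an equal value is already present and is ignored here


def contains(root: Optional[Node], value: int) -> bool:
    """Walk down from the root, using each node's value as the key."""
    node = root
    while node is not None:
        if value == node.value:
            return True
        node = node.left if value < node.value else node.right
    return False


root = None
for v in (8, 3, 10, 1, 6):
    root = insert(root, v)
```

Each comparison in contains discards one whole subtree, which is why a lookup visits only the nodes on a single path from the root.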
Many variations of the binary tree have been developed. One variation that is commonly used when managing very large amounts of data is a b-tree. As known to a person of ordinary skill in the art, a B-tree (referred to as “b-tree” throughout this document) is a variation of the binary tree that allows for multiple keys and children per node. Some b-trees can be configured to have hundreds of keys and children per node, so that a tree containing millions of nodes still has a fairly short depth. These types of large b-trees are commonly used by file systems to represent files and directories.
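By way of illustration only, a node holding several keys and children, and a lookup over such nodes, might be sketched as follows. The names BTreeNode and search are hypothetical, the tree is built by hand rather than through insertion, and the sketch assumes that an internal node with n keys has n + 1 children.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List


@dataclass
class BTreeNode:
    """A node with several sorted keys; children is empty for a leaf."""
    keys: List[int] = field(default_factory=list)
    children: List["BTreeNode"] = field(default_factory=list)


def search(node: BTreeNode, key: int) -> bool:
    """Look for key, choosing one child per level from the sorted keys."""
    while True:
        # Index of the first key that is >= the key being sought.
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:
            return False  # reached a leaf without finding the key
        node = node.children[i]


# A two-level tree: the root's keys separate the ranges of its three children.
root = BTreeNode(
    keys=[10, 20],
    children=[BTreeNode([1, 5]), BTreeNode([12, 17]), BTreeNode([25, 30])],
)
```

Because each node narrows the search to one of many children rather than one of two, the number of levels visited grows slowly as the tree grows.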
One variation of a b-tree was developed by Ohad Rodeh and is disclosed in his paper B-trees, Shadowing, and Clones (ACM Transactions on Computational Logic, Vol. V, No. N, August 2007), which is incorporated herein by reference in its entirety.
To increase the speed and efficiency of accessing data within a b-tree, concurrent access to a tree can be granted to multiple functions. Granting concurrent access, however, can lead to errors if a node is modified by one function while being accessed by another function. To alleviate this problem, multiple locking schemes have been used to limit errors while allowing as much concurrent access as possible.
In the past, different types of locks, providing different levels of protection, have been used. For example, functions that modify the tree (write functions) have been distinguished from functions that only request data from the tree (read functions), and different types of locks have been configured to restrict access by certain types of functions attempting to access a node. For example, when a read function accesses a node, a read lock can be placed on the node, which allows other read functions to access the node concurrently but restricts access by all write functions. When a write function accesses a node, a write lock can be placed on the node, which restricts access by all other functions, both read and write.
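By way of illustration only, the read lock and write lock behavior described above can be sketched with a small lock class. The class ReadWriteLock and its method names are hypothetical; the Python standard library does not provide such a lock, so this sketch builds one on threading.Condition. It makes no attempt to prevent a steady stream of readers from delaying a writer indefinitely.

```python
import threading


class ReadWriteLock:
    """Allows many concurrent readers, or exactly one writer."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._readers = 0      # number of read locks currently held
        self._writer = False   # True while a write lock is held

    def acquire_read(self, blocking: bool = True) -> bool:
        with self._cond:
            while self._writer:          # readers wait only for a writer
                if not blocking:
                    return False
                self._cond.wait()
            self._readers += 1
            return True

    def release_read(self) -> None:
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()  # a waiting writer may now proceed

    def acquire_write(self, blocking: bool = True) -> bool:
        with self._cond:
            while self._writer or self._readers > 0:  # writers wait for everyone
                if not blocking:
                    return False
                self._cond.wait()
            self._writer = True
            return True

    def release_write(self) -> None:
        with self._cond:
            self._writer = False
            self._cond.notify_all()


lock = ReadWriteLock()
first_reader = lock.acquire_read()
second_reader = lock.acquire_read(blocking=False)        # readers share the lock
writer_during_read = lock.acquire_write(blocking=False)  # refused while readers hold it
lock.release_read()
lock.release_read()
writer_alone = lock.acquire_write(blocking=False)        # granted once readers are gone
reader_during_write = lock.acquire_read(blocking=False)  # refused while a writer holds it
lock.release_write()
```

The non-blocking calls are used here only so that the example can show which requests would be refused without starting additional threads.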
One common solution for allowing concurrent access to a b-tree is to lock each node as it is traversed by a function. For example, in some embodiments, a read lock is applied to every node traversed by a read function. Conversely, a write lock can be applied to every node traversed by a write function. This solution, although effective, is inefficient and slow because each function must enter the tree at the root node, and so the root node must always be locked when the tree is accessed by a function. In the case of a write function, the result is that no other function can access the tree until the write function has completed and the lock is removed.
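By way of illustration only, the approach of locking every traversed node can be sketched as follows. The names Node, lock_whole_path, and unlock_path are hypothetical, and a plain threading.Lock stands in for the write lock, which is an assumption made to keep the example short.

```python
import threading
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    keys: List[int]
    children: List["Node"] = field(default_factory=list)
    # A plain mutex stands in for a write lock in this sketch.
    lock: threading.Lock = field(
        default_factory=threading.Lock, repr=False, compare=False
    )


def lock_whole_path(root: Node, key: int) -> List[Node]:
    """Lock every node from the root down to the leaf that would hold key."""
    path = []
    node = root
    while True:
        node.lock.acquire()
        path.append(node)
        if not node.children:
            return path  # every node on the path is still locked
        node = node.children[bisect_left(node.keys, key)]


def unlock_path(path: List[Node]) -> None:
    """Release the locks only after the whole operation has finished."""
    for node in reversed(path):
        node.lock.release()


root = Node([10, 20], [Node([1, 5]), Node([12, 17]), Node([25, 30])])
path = lock_whole_path(root, 17)
root_locked_during = root.lock.locked()  # the root stays locked throughout
unlock_path(path)
root_locked_after = root.lock.locked()
```

Because the root remains locked for the duration of the operation, any other function that starts at the root has to wait, which is the inefficiency noted above.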
To remedy this problem, the b-tree disclosed by Ohad Rodeh incorporates a lock coupling technique wherein individual nodes are locked as they are traversed and can then be released after the appropriate child node is locked if it is determined that the parent node will not be modified. Similar to the method described above, a read lock is used when the tree is traversed by a read function while a write lock is used when the tree is traversed by a write function.
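By way of illustration only, a generic form of lock coupling can be sketched as follows. This is not the implementation from the cited paper. The names Node, descend_coupled, release_all, has_room, and MAX_KEYS are hypothetical, a plain threading.Lock stands in for both read and write locks, and the test used to decide that a parent will not be modified (the child still has room for another key) is an assumption chosen for the example.

```python
import threading
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import Callable, List

MAX_KEYS = 2  # hypothetical node capacity for this sketch


@dataclass
class Node:
    keys: List[int]
    children: List["Node"] = field(default_factory=list)
    lock: threading.Lock = field(
        default_factory=threading.Lock, repr=False, compare=False
    )


def descend_coupled(
    root: Node, key: int, is_safe: Callable[[Node], bool] = lambda node: True
) -> List[Node]:
    """Descend toward key; return the nodes still locked at the leaf."""
    root.lock.acquire()
    held = [root]
    node = root
    while node.children:
        child = node.children[bisect_left(node.keys, key)]
        child.lock.acquire()  # lock the child before letting go of the parent
        if is_safe(child):
            # The child cannot push a change upward, so ancestors are released.
            for ancestor in held:
                ancestor.lock.release()
            held.clear()
        held.append(child)
        node = child
    return held


def release_all(held: List[Node]) -> None:
    for node in reversed(held):
        node.lock.release()


def has_room(node: Node) -> bool:
    """Assumed safety test for a write: an insert here would not split the node."""
    return len(node.keys) < MAX_KEYS


root = Node([10, 20], [Node([1]), Node([12, 17]), Node([25, 30])])

# Read-style descent: the parent is never modified, so it is always released.
held = descend_coupled(root, 17)
reader_held = [n.keys for n in held]
root_locked_for_reader = root.lock.locked()
release_all(held)

# Write-style descent into a full leaf: the parent might change, so it stays locked.
held = descend_coupled(root, 17, is_safe=has_room)
writer_full_held = [n.keys for n in held]
release_all(held)

# Write-style descent into a leaf with room: the parent is released.
held = descend_coupled(root, 3, is_safe=has_room)
writer_room_held = [n.keys for n in held]
release_all(held)
```

In the first and third descents only the leaf remains locked, so other functions can enter at the root while the operation completes.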
The lock coupling method is beneficial because a node is not locked unless it is being accessed or is likely to be modified. The root node, therefore, is often not locked while the tree is accessed by a function, and the tree can therefore be accessed by both read and write functions concurrently.
Although lock coupling provides a more efficient system for allowing concurrent access to a tree, the restrictive nature of a write lock still fails to allow sufficient concurrent access and ultimately impedes performance. Accordingly, a need exists for a less restrictive locking technique associated with concurrent access to b-trees that still provides adequate protection against errors.