A file system or database that stores large amounts of data may incorporate a data structure to organize that data for various usages, such as symbolic links, databases, file systems, and the like. One such data structure is a B-tree. The B-tree may be optimized for systems that read and write large blocks of data.
As used herein, “B-tree” also means B+ trees, B* trees, Foster B-trees, dancing trees, and other balanced tree data structures that maintain strict height balance and whose node sizes are greater than two and, in the B-tree's persistent form, vary between a set maximum and a minimum of at least half that maximum.
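The two properties named in this definition, strict height balance and bounded node size, can be illustrated with a minimal sketch. The `Node` class, the function names, the choice to measure node size in keys, and the exemption of the root from the minimum are all assumptions made for illustration; they are not drawn from any particular B-tree implementation.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    """Hypothetical in-memory tree node; size is measured in keys (assumption)."""
    keys: List[int]
    children: List["Node"] = field(default_factory=list)


def leaf_depths(node: Node, depth: int = 0) -> Set[int]:
    """Collect the set of depths at which leaves occur."""
    if not node.children:
        return {depth}
    depths: Set[int] = set()
    for child in node.children:
        depths |= leaf_depths(child, depth + 1)
    return depths


def satisfies_definition(root: Node, max_keys: int) -> bool:
    """Check strict height balance and node-size bounds.

    Assumptions: the minimum is half the maximum, rounded up, and the
    root is exempt from the minimum (a common convention).
    """
    min_keys = (max_keys + 1) // 2

    def sizes_ok(node: Node, is_root: bool) -> bool:
        if len(node.keys) > max_keys:
            return False
        if not is_root and len(node.keys) < min_keys:
            return False
        return all(sizes_ok(child, False) for child in node.children)

    # Strict height balance: every leaf sits at the same depth.
    return len(leaf_depths(root)) == 1 and sizes_ok(root, True)


balanced = Node([10], [Node([1, 2]), Node([11, 12, 13])])
underfilled = Node([10], [Node([1]), Node([11, 12])])
uneven = Node([10], [Node([1, 2]),
                     Node([20, 30], [Node([11, 12]), Node([21, 22]), Node([31, 32])])])
```

With `max_keys` set to 4, only `balanced` would pass: `underfilled` has a leaf below the minimum, and `uneven` has leaves at two different depths.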
According to various uses and implementations, a B-tree may grow larger than optimal. A large B-tree may present issues such as exceeding available storage space or making operations and searches on the B-tree burdensome and time-consuming. Conversely, operating on and searching a smaller B-tree may provide a faster and more convenient experience for a user of a file system that integrates a B-tree data structure. In addition, backing up, taking snapshots of, or relocating a B-tree can become difficult if the B-tree is too large.
Thus, in cases where a B-tree has exceeded a threshold size, the file system may partition the B-tree into multiple B-trees. This partitioning is usually done in a static fashion. In one such implementation, several servers or locations are provided in advance. Data starting with a value ‘a’ may go to server 1, data starting with a value ‘b’ may go to server 2, and so on. This implementation is hindered by its inability to deal with imbalanced data sets. For example, if the data or pointers to be stored are URL information, the server dedicated to the data set starting with “w” fills up more quickly than the other servers.
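The imbalance described above can be illustrated with a minimal sketch of static partitioning by first character. The function name `server_for`, the catch-all server 0 for keys that do not start with a letter, and the sample URLs are hypothetical and serve only to show the effect.

```python
from collections import Counter


def server_for(key: str) -> int:
    """Statically map a key to a server by its first character.

    'a' maps to server 1, 'b' to server 2, and so on. Keys that do not
    start with a letter a-z go to a catch-all server 0 (an assumption).
    """
    first = key[:1].lower()
    if "a" <= first <= "z":
        return ord(first) - ord("a") + 1
    return 0


# Hypothetical sample keys: most URLs begin with "www".
urls = [
    "www.example.com",
    "www.example.org",
    "www.example.net",
    "archive.example.com",
    "blog.example.com",
]

# Count how many keys land on each server.
load = Counter(server_for(url) for url in urls)
```

In this sample, server 23 (the letter ‘w’) receives three of the five keys while servers 1 and 2 receive one each. Because the mapping is fixed in advance, it cannot adapt when one range of keys grows faster than the others.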
In another implementation, each node of the B-tree may include internal pointers that facilitate reorganization. However, this implementation is invasive to the data structure and complicated to design and implement.