A B-tree is a commonly used on-disk data structure in file systems and storage systems. A B-tree stores key-value pairs and supports efficient create, read (lookup), update, delete, and range scan operations. The keys of the key-value pairs in a B-tree are usually fixed in size. However, the values of the key-value pairs in a B-tree are often variable in size when the B-tree is used in file systems, such as Virtual Distributed File System (VDFS) and B-tree file system (Btrfs). Unfortunately, introducing variable size values to a B-tree significantly increases the complexity of the leaf node disk layout.
At one extreme, a B-tree with variable size values can avoid free space management altogether by moving memory and compacting the leaf node on every B-tree update. Btrfs uses this approach: in a leaf node, a fixed size index, where the keys are located, is at the beginning of the node, while the values are at the end of the node. This approach is simple to implement, but it causes excessive memory movement because all values must be repacked before the node is written out to disk. The extra CPU cost spent on such memory movement significantly reduces B-tree update performance.
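The repacking cost described above can be illustrated with a minimal sketch. It is not the actual Btrfs on-disk format: the node size, the slot layout, and the names `pack_leaf`, `read_slots`, `lookup`, and `update_value` are assumptions chosen for illustration only. The sketch places a slot index at the front of the node, packs values back to back at the end, and rebuilds the whole node whenever one value changes.

```python
import struct

NODE_SIZE = 128                  # hypothetical node size in bytes
HEADER = struct.Struct("<H")     # number of slots in the node
SLOT = struct.Struct("<QHH")     # key, value offset, value length


def pack_leaf(items):
    """Write slots at the front of the node and values packed at the end."""
    node = bytearray(NODE_SIZE)
    HEADER.pack_into(node, 0, len(items))
    pos = HEADER.size            # next free byte in the index area
    tail = NODE_SIZE             # start of the packed value area
    for key, value in sorted(items.items()):
        tail -= len(value)
        if tail < pos + SLOT.size:
            raise ValueError("leaf node is full")
        node[tail:tail + len(value)] = value
        SLOT.pack_into(node, pos, key, tail, len(value))
        pos += SLOT.size
    return bytes(node)


def read_slots(node):
    """Return the (key, offset, length) triples stored in the index."""
    (count,) = HEADER.unpack_from(node, 0)
    return [SLOT.unpack_from(node, HEADER.size + i * SLOT.size)
            for i in range(count)]


def lookup(node, key):
    """Return the value stored under key, or None if the key is absent."""
    for k, off, length in read_slots(node):
        if k == key:
            return bytes(node[off:off + length])
    return None


def update_value(node, key, value):
    """Change one value by unpacking and repacking every value in the node."""
    items = {k: bytes(node[off:off + length])
             for k, off, length in read_slots(node)}
    items[key] = value
    return pack_leaf(items)


leaf = pack_leaf({1: b"aa", 2: b"bbbb"})
grown = update_value(leaf, 1, b"aaaaaa")
```

In this sketch, growing the value of key 1 also shifts the stored offset of key 2, even though the value of key 2 did not change. That shift is the memory movement the approach pays for on every update.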
At the other extreme, it is possible to use a bitmap to manage free space and reduce memory movement to a minimum. However, a bitmap takes up space, introduces complexity, and costs extra CPU time, which are significant reasons not to use bitmap allocation in a relatively small region of a B-tree node.
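For comparison, bitmap-based free space management can be sketched as follows. This is a hypothetical first-fit allocator, not taken from any particular file system: the class name `BitmapAllocator`, the chunk size, and the region size are illustrative assumptions. The value region is divided into fixed size chunks, and one bit per chunk records whether that chunk is in use.

```python
class BitmapAllocator:
    """First-fit allocator over a region split into fixed size chunks."""

    def __init__(self, region_size, chunk_size):
        self.chunk_size = chunk_size
        self.nchunks = region_size // chunk_size
        self.bits = 0            # bit i set means chunk i is in use

    def _chunks_for(self, nbytes):
        if nbytes <= 0:
            raise ValueError("allocation size must be positive")
        return -(-nbytes // self.chunk_size)   # ceiling division

    def alloc(self, nbytes):
        """Return the byte offset of a free run, or None if none fits."""
        need = self._chunks_for(nbytes)
        run = 0
        for i in range(self.nchunks):          # scan cost paid per allocation
            if (self.bits >> i) & 1:
                run = 0
            else:
                run += 1
            if run == need:
                start = i - need + 1
                self.bits |= ((1 << need) - 1) << start
                return start * self.chunk_size
        return None

    def free(self, offset, nbytes):
        """Mark the chunks backing [offset, offset + nbytes) as free."""
        need = self._chunks_for(nbytes)
        start = offset // self.chunk_size
        self.bits &= ~(((1 << need) - 1) << start)


region = BitmapAllocator(region_size=64, chunk_size=8)
first = region.alloc(10)     # rounds up to 2 chunks
second = region.alloc(8)     # 1 chunk
third = region.alloc(20)     # rounds up to 3 chunks
region.free(second, 8)
reused = region.alloc(5)     # fits in the chunk that was just freed
```

No existing value moves when a value is allocated or freed, but the sketch also shows the costs named above: the bitmap itself occupies space, every allocation scans the bits, and rounding up to whole chunks wastes part of the region.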
Throughout the description, similar reference numbers may be used to identify similar elements.