The disclosed embodiments relate to database optimization and, in particular, to methods, systems, and apparatuses for optimizing log-structured merge (LSM) tree-based databases using object solid state drive (SSD) devices.
An LSM tree is a data storage architecture where data is stored as key-value pairs across multiple storage structures. Using a two-level LSM tree as an example, a first tree structure (C0) stores key-value ranges in, for example, local memory while a second tree structure (C1) is stored on a persistent storage device. New records are only inserted into C0, which is fixed in size. If the insertion of a new record would cause C0 to exceed the fixed size of the structure, a contiguous range of key-value pairs is removed from C0 and inserted into C1. The movement of data from a lower level (e.g., C0) to a higher level (e.g., C1) is referred to as compaction.
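The two-level behavior described above can be illustrated with a minimal sketch. It is not drawn from any particular database: the class name `TwoLevelLSM`, the parameters `c0_capacity` and `batch`, and the use of plain dictionaries in place of tree structures and persistent storage are all assumptions made for illustration only.

```python
class TwoLevelLSM:
    """Illustrative two-level LSM sketch (hypothetical names, not a real API)."""

    def __init__(self, c0_capacity=4, batch=2):
        self.c0 = {}                    # stands in for the in-memory level C0
        self.c1 = {}                    # stands in for the persistent level C1
        self.c0_capacity = c0_capacity  # assumed fixed size of C0, in records
        self.batch = batch              # assumed number of records moved per compaction
        self.compactions = 0

    def put(self, key, value):
        # New records are only ever inserted into C0.
        if key not in self.c0 and len(self.c0) >= self.c0_capacity:
            self._compact()
        self.c0[key] = value

    def _compact(self):
        # Move a contiguous range of keys (here, the smallest ones) from C0 to C1.
        for key in sorted(self.c0)[: self.batch]:
            self.c1[key] = self.c0.pop(key)
        self.compactions += 1

    def get(self, key):
        # C0 holds the most recent data, so it is consulted first.
        if key in self.c0:
            return self.c0[key]
        return self.c1.get(key)
```

In this sketch, inserting a fifth distinct key into a tree created with `c0_capacity=4` triggers one compaction that moves the two smallest keys into `c1`. A practical implementation would keep both levels sorted and would merge ranges on the storage device rather than in a dictionary.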
LSM trees are well-suited for key-value databases such as ROCKSDB (created by Facebook, Inc. of Menlo Park, Calif.) or LEVELDB (created by Google, Inc. of Mountain View, Calif.). Since these databases use keys as the fundamental write/retrieval index, the logical data model of the database interface maps explicitly to the LSM tree structure. For this reason, LSM trees are commonly used in such key-value databases to provide improved compression rates, a reduction in the input/output (I/O) capacity required for persisting changes, and a simpler code base.
Despite these improvements, compaction operations of the LSM tree result in inefficiencies when used with storage devices. Specifically, the compaction operations are computationally expensive due to the number of reads required for several key-value ranges to enable sorted insertion. This computational cost is compounded by the underlying physical storage mechanism, which also performs its own routine maintenance. For example, solid-state Flash drives employ an asynchronous garbage collection routine that ensures that invalid pages are removed from the underlying Flash blocks. During compaction, since key ranges may not all be stored on a single block (or series of blocks), the erasure of a key range usually results in invalid pages being left on the underlying SSD. Thus, the compaction process additionally includes the overhead of the underlying SSD garbage collection process, which increases latency when accessing the SSD (due to the garbage collection process temporarily halting reads and writes).
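The interaction between key-range erasure and garbage collection can be illustrated with a simplified model. The sketch below assumes that each Flash block is a list of pages, that each page holds a single key, and that garbage collection must copy every remaining valid page out of a block before that block can be erased. The function names `invalidate_range` and `gc_relocations` are hypothetical and do not correspond to any SSD interface.

```python
def invalidate_range(blocks, lo, hi):
    """Return (valid, invalid) page counts per block after erasing keys in [lo, hi]."""
    stats = []
    for block in blocks:
        invalid = sum(1 for key in block if lo <= key <= hi)
        stats.append((len(block) - invalid, invalid))
    return stats


def gc_relocations(stats):
    """Count valid pages that must be copied out of blocks holding invalid pages."""
    return sum(valid for valid, invalid in stats if invalid > 0)


# Assumed layouts: the same twelve keys, either scattered across blocks or
# stored so that the erased range [1, 4] occupies exactly one block.
scattered = [[1, 7, 2, 9], [3, 8, 4, 10], [5, 6, 11, 12]]
aligned = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]

scattered_cost = gc_relocations(invalidate_range(scattered, 1, 4))
aligned_cost = gc_relocations(invalidate_range(aligned, 1, 4))
```

Under these assumptions, erasing the range in the scattered layout leaves two blocks that each mix valid and invalid pages, so four valid pages must be relocated, whereas the aligned layout leaves one fully invalid block and requires no relocation.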