Although an ideal computer system would take the same amount of time to process every query, the real world is seldom that perfect. When the number of queries is plotted against latency (that is, the time required to complete a query), the graph shows some queries answered in a relatively short amount of time, whereas other queries take a relatively long amount of time. These data points at the far end of the graph will likely exist regardless of the shape of the graph. Because the queries that take relatively long amounts of time lie at the far end of the graph, where the number of queries typically tails off toward zero, the time required to answer these high-latency queries is often termed “tail latency”.
There are any number of reasons why computer systems may experience tail latency. For example, if needed data is typically cached in a high speed cache but some data is stored in (relatively) slow longer term storage (such as a hard disk drive), queries that require the data stored in the longer term storage will frequently be slower than queries for data stored in the high speed cache. Another reason for tail latency may be writing data to longer term storage. Writing data may take longer than just reading data: for example, even if only one byte of the data is being changed, when writing data to a Solid State Drive (SSD) an entire block must be written. Background operations may also delay the completion of a query. For example, SSDs perform garbage collection operations to identify blocks that may be erased (and which might require some data to be programmed to other blocks). If a garbage collection operation is underway when a query arrives, the query may have to wait for the garbage collection operation to complete before the query may be satisfied. This delay due to garbage collection may affect the tail latency of queries.
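The effect of these causes on a latency distribution may be illustrated with a minimal simulation sketch. Everything in it is an assumption made for illustration only: the function name `simulate_latencies`, the service times (10 µs for a cache hit, 100 µs for a read from longer term storage, and a 5000 µs stall when garbage collection is underway), and the default probabilities are hypothetical and do not describe any particular device.

```python
import random

# Illustrative service times in microseconds (assumed values, not measurements).
CACHE_HIT_US = 10        # query answered from the high speed cache
STORAGE_READ_US = 100    # query answered from longer term storage
GC_STALL_US = 5000       # extra wait when garbage collection is underway


def simulate_latencies(num_queries, cache_hit_rate=0.95,
                       gc_probability=0.10, seed=0):
    """Return one simulated latency per query.

    A query either hits the cache or goes to storage; a query that goes
    to storage may additionally wait for a garbage collection operation.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    latencies = []
    for _ in range(num_queries):
        if rng.random() < cache_hit_rate:
            latencies.append(CACHE_HIT_US)
            continue
        latency = STORAGE_READ_US
        if rng.random() < gc_probability:
            latency += GC_STALL_US  # query waits for garbage collection
        latencies.append(latency)
    return latencies


if __name__ == "__main__":
    samples = sorted(simulate_latencies(10_000))
    print("median latency (us):", samples[len(samples) // 2])
    print("slowest latency (us):", samples[-1])
```

Under these assumed parameters most queries complete at the cache-hit latency, while the small fraction of queries that reach storage during garbage collection form the far end of the distribution.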
Like other statistics, tail latency may be expressed as a percentage of the overall set of queries. For example, the term “5% tail latency” may refer to the 5% of queries that have the largest overall latency, whereas “1% tail latency” may refer to the 1% of queries that have the largest overall latency.
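The percentage-based definition above may be sketched in a few lines of code. This is a minimal illustration rather than a definitive measurement method: the function names `tail_queries` and `tail_latency_cutoff` are hypothetical, and rounding the tail size up to at least one query is an assumption made here so that small samples still report a tail.

```python
import math


def tail_queries(latencies, tail_percent):
    """Return the slowest `tail_percent` percent of latencies, slowest first."""
    if not latencies:
        raise ValueError("latencies must not be empty")
    if not 0 < tail_percent <= 100:
        raise ValueError("tail_percent must be in (0, 100]")
    ordered = sorted(latencies, reverse=True)
    # Round up and keep at least one query (an assumption of this sketch).
    count = max(1, math.ceil(len(ordered) * tail_percent / 100))
    return ordered[:count]


def tail_latency_cutoff(latencies, tail_percent):
    """Return the smallest latency that still falls within the tail."""
    return tail_queries(latencies, tail_percent)[-1]


if __name__ == "__main__":
    # 100 hypothetical queries: 95 fast, 4 slow, 1 very slow.
    sample = [1.0] * 95 + [10.0] * 4 + [100.0]
    print(tail_queries(sample, 5))         # the 5 slowest queries
    print(tail_latency_cutoff(sample, 1))  # latency of the slowest 1%
```

With the hypothetical sample of 100 queries, the 5% tail consists of the five slowest queries and the 1% tail consists of the single slowest query.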
In modern computer database systems, the 1% tail latency of the system is a critical issue. The 1% tail latency may determine service quality in the worst case. Modern databases, such as BigTable, HBase, LevelDB, MongoDB, SQLite4, RocksDB, WiredTiger, and Cassandra, use log structured merge (LSM) trees to manage data. LSM trees may have poor 1% tail latency even though they show good performance in general. The response time from the database cache may be excellent, but the response time from an SSD may be poor due to the large size of data to be written to the SSD, and the response time from storage performing garbage collection may be the worst of all, regardless of TRIM support. Garbage collection is a major source of the 1% tail latency: SSDs cannot avoid performing garbage collection. In addition, when databases use LSM trees, a large database flush may sometimes occur and worsen the 1% tail latency, especially when this database flush operation runs concurrently with garbage collection.
A need remains for a way to improve the tail latency of SSDs.