Applications that rely on transactional semantics, such as databases, key-value stores, file systems, and the like, typically make use of transaction logging (also known as journaling) to ensure data consistency in the face of system crashes/failures. In a conventional transaction logging implementation, an application records all of its transactions in a single write-ahead/append-only log that is stored on nonvolatile storage (e.g., a magnetic hard disk or solid-state disk (SSD)). The “append-only” qualifier means that log entries are continually added to the end of the log as transactions occur. Thus, the log captures the entire history of transactions that have been processed by the application since the last log initialization or compaction. If the application's host system crashes or otherwise fails, the entries in the log are replayed, from first to last, to bring the storage or memory on which the application data resides into a transactionally consistent state (note that some applications, such as log-structured file systems, can use the log itself for storing their data/metadata and thus do not need to implement a replay mechanism).
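The append-and-replay behavior described above can be illustrated with a minimal sketch. The class name `AppendOnlyLog`, the one-JSON-record-per-line layout, and the key/value transaction shape are all assumptions made for illustration; they are not taken from any particular application or product.

```python
import json
import os
import tempfile


class AppendOnlyLog:
    """Hypothetical single write-ahead log; one JSON record per line."""

    def __init__(self, path):
        self.path = path

    def append(self, key, value):
        # New entries are only ever added to the end of the file.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"key": key, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the entry to storage

    def replay(self):
        # Re-apply every entry, first to last, to rebuild the state.
        state = {}
        if not os.path.exists(self.path):
            return state
        with open(self.path, "r", encoding="utf-8") as f:
            for line in f:
                entry = json.loads(line)
                state[entry["key"]] = entry["value"]
        return state


def demo():
    with tempfile.TemporaryDirectory() as d:
        log = AppendOnlyLog(os.path.join(d, "wal.log"))
        log.append("x", 1)
        log.append("y", 2)
        log.append("x", 3)
        # Simulate a crash: in-memory state is lost, only the log survives.
        return AppendOnlyLog(log.path).replay()
```

Calling `demo()` rebuilds the state solely from the log file. Note that `replay` reads every entry ever written, so its cost grows with the full transaction history rather than with the size of the live data.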
While the approach of using a single write-ahead/append-only log for transaction logging is functional (and is suited to the performance characteristics of conventional nonvolatile storage devices), it suffers from a number of drawbacks. First, as indicated above, recovery after a system crash or failure generally requires the entirety of the log to be replayed (due to batching of log entry and/or application data commits). This can make the recovery process time-consuming, particularly for applications that deal with very large data volumes. Second, since the log is append-only and continues to grow in size as new transactions are processed, the log must be compacted on a periodic basis so that it does not consume all of the available space on nonvolatile storage. Although there are various methods to perform this compaction, all of these methods consume CPU/memory resources and degrade throughput/latency while they run, resulting in unpredictable and non-uniform performance. Third, the fact that all transactions are recorded in a single sequential log means that one malformed or buggy transaction can potentially corrupt the log entries for other transactions, thereby damaging the entire transactional history of the system.
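To make the compaction cost concrete, the following sketch shows one simple compaction policy: keep only the most recent entry for each key. This is an assumed, illustrative policy (the function name `compact` and the `(key, value)` entry shape are hypothetical), and it is only one of the various possible methods; the point is that even this simple form must scan the whole log.

```python
def compact(entries):
    """Keep only the most recent value for each key (one simple policy)."""
    latest = {}
    for key, value in entries:  # full scan of the log: O(n) CPU work
        latest[key] = value     # memory grows with the number of live keys
    return list(latest.items())


def replay(entries):
    """Apply entries first to last to rebuild the state."""
    state = {}
    for key, value in entries:
        state[key] = value
    return state


# Five logged transactions touching two keys.
log = [("x", 1), ("y", 2), ("x", 3), ("y", 4), ("x", 5)]
compacted = compact(log)
```

Replaying `compacted` yields the same final state as replaying `log`, but with fewer entries. The scan itself competes with foreground transactions for CPU and memory, which is consistent with the non-uniform performance noted above.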