Availability guarantees form an important part of many service level agreements (SLAs) for production database systems: minimizing database downtime has economic as well as usability benefits. Database systems can crash for a variety of reasons, including software bugs, hardware faults and user errors. Many of these conditions are transient, and for these, restarting after logging the error is a reasonable approach to recovery. In these cases, database restart time has a significant and direct impact on database availability.
In-memory database management systems (DBMSs) recover by rebuilding dynamic random access memory (DRAM) based data structures from a consistent state persisted on durable media. The persisted state typically consists of a copy (“checkpoint”) of the database state at a particular instant in time, and a log of any subsequent updates. Recovery consists of reloading portions of the checkpointed state, applying subsequent updates and then undoing the effects of unfinished transactions. Database restart time includes not only the time to recover the consistent state as it existed before the crash, but also the time to reload any data required by the current workload. This second component can be substantial for online analytical processing (OLAP) workloads, which typically scan large portions of the database.
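The three recovery steps described above (reload the checkpoint, apply logged updates, undo unfinished transactions) can be illustrated with a minimal sketch. The record layout, the `LogRecord` and `recover` names, and the use of a plain dictionary as the database state are all assumptions made for illustration; they do not describe any particular DBMS. The sketch also assumes that the log stores a before-image and an after-image for every update, and that no two concurrently active transactions write the same key.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class LogRecord:
    """Hypothetical log record; the field names are illustrative only."""
    txn: int                      # transaction identifier
    kind: str                     # "update" or "commit"
    key: Optional[str] = None     # key touched by an update
    before: Optional[int] = None  # value before the update (None = key absent)
    after: Optional[int] = None   # value after the update


def recover(checkpoint: Dict[str, int], log: List[LogRecord]) -> Dict[str, int]:
    # Step 1: reload the checkpointed state into memory.
    state = dict(checkpoint)

    # Step 2: redo. Apply every logged update in log order and note
    # which transactions reached their commit record.
    committed = set()
    for rec in log:
        if rec.kind == "update":
            state[rec.key] = rec.after
        elif rec.kind == "commit":
            committed.add(rec.txn)

    # Step 3: undo. Walk the log backwards and restore the before-image
    # of every update made by a transaction that never committed.
    for rec in reversed(log):
        if rec.kind == "update" and rec.txn not in committed:
            if rec.before is None:
                state.pop(rec.key, None)  # key did not exist before the update
            else:
                state[rec.key] = rec.before

    return state
```

For example, given a checkpoint of `{"a": 1, "b": 2}` and a log in which transaction 1 updates `a` to 10 and commits while transaction 2 updates `b` to 20 and inserts `c` without committing, `recover` returns `{"a": 10, "b": 2}`. The work in all three steps grows with the size of the checkpoint and the log, which is why restart time becomes significant for large databases.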
What is needed is a form of non-volatile memory that supports high database performance, in terms of both throughput and response time, while also improving restart performance, an area in which traditional main memory database systems fall short.