Main memory database systems maintain a simple but potentially large in-memory data structure that holds the data of the database. This type of database is typically between about 1 Gigabyte and about 4 Gigabytes in size, and cannot be larger than the memory of the particular computer because the entire database resides in memory. The memory utilized is standard RAM in standard servers.
In order to be able to recover such a database after a crash, the database maintenance system needs to ensure that all the information contained in memory is also on disk as a backup. Accordingly, the database maintenance system logs updates to the database. In other words, every time a change happens to the database, the database maintenance system records the change in the log. When the database is restarted, the database maintenance system replays the log. As can be imagined, after a relatively short time, the log can grow to be extremely long. Because the log keeps growing, each restart of the database takes longer than the previous one.
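By way of illustration only, the logging and replay behavior described above may be sketched as follows. The names `LoggedStore` and `recover` are hypothetical, and a Python list stands in for the on-disk log; this is a minimal sketch under those assumptions, not the implementation of any particular database maintenance system.

```python
class LoggedStore:
    """Toy in-memory key-value store that logs every update (hypothetical)."""

    def __init__(self):
        self.data = {}  # the in-memory database
        self.log = []   # assumption: stands in for an append-only log on disk

    def put(self, key, value):
        self.log.append((key, value))  # record the change in the log first
        self.data[key] = value         # then apply it to the in-memory data


def recover(log):
    """Rebuild the in-memory state by replaying the entire log in order."""
    store = LoggedStore()
    for key, value in log:
        store.data[key] = value
    store.log = list(log)
    return store
```

In this sketch the work done by `recover` is proportional to the number of log entries, which is why the restart time grows as the log grows.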
To address this problem, the database maintenance software periodically takes snapshots of the database in order to reduce the recovery time when the database needs to be restarted. For example, after every 10,000 transactions or so, the database maintenance software takes a snapshot of the database. After each snapshot, the database maintenance software starts a new log. Without the snapshots, the database log would grow without bound and the startup time would be proportional to the size of the database log. Using this snapshot method, when the database maintenance system needs to perform a recovery, the system returns to the last snapshot and applies the log entries recorded since that snapshot. The database maintenance system may thereby quickly recover a database. Continuing with the example above, the database maintenance system has to restore a snapshot and replay at most 10,000 transactions.
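The snapshot method may likewise be sketched, for illustration only, as follows. The class name `SnapshottingStore` and the constant `SNAPSHOT_INTERVAL` are hypothetical, the interval is set to 3 rather than about 10,000 to keep the example small, and in-memory copies stand in for the snapshot and log that would be written to disk.

```python
import copy

SNAPSHOT_INTERVAL = 3  # hypothetical; the example in the text uses about 10,000


class SnapshottingStore:
    """Toy store that snapshots after a fixed number of logged updates."""

    def __init__(self):
        self.data = {}      # the in-memory database
        self.snapshot = {}  # assumption: stands in for the last snapshot on disk
        self.log = []       # updates logged since the last snapshot

    def put(self, key, value):
        self.log.append((key, value))
        self.data[key] = value
        if len(self.log) >= SNAPSHOT_INTERVAL:
            self.take_snapshot()

    def take_snapshot(self):
        self.snapshot = copy.deepcopy(self.data)  # write the whole database
        self.log = []                             # start a new log

    def recover(self):
        """Restore the last snapshot, then replay only the newer log entries."""
        self.data = copy.deepcopy(self.snapshot)
        for key, value in self.log:
            self.data[key] = value
        return len(self.log)  # number of entries replayed
```

Under these assumptions, `recover` never replays more than `SNAPSHOT_INTERVAL` entries, so the replay portion of the recovery time is bounded regardless of how many updates the database has received in total.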
Unfortunately, in order to take a snapshot, the database maintenance system must lock the database, write the entire database onto disk, and then unlock the database so that updates can resume. Meanwhile, real world implementations of databases are relatively large. Accordingly, as the database grows, it takes longer and longer to take a snapshot, and the time to take a snapshot becomes much higher than the minimum request latency of updates to the database. For example, a database over about one Gigabyte requires a lock time that is too long for practical implementations of the database. For this reason, the database maintenance system cannot lock the database to take a snapshot of the database.
To address these problems, a proposed solution is to use some form of copy-on-write or partial locks. Copy-on-write is an optimization strategy used in computer programming. The fundamental idea is that if multiple callers ask for resources which are initially indistinguishable, you can give them pointers to the same resource. This fiction can be maintained until a caller tries to modify its “copy” of the resource, at which point a true private copy is created to prevent the changes becoming visible to everyone else. All of this happens transparently to the callers. The primary advantage is that if a caller never makes any modifications, no private copy need ever be created. A database maintenance system uses the copy-on-write concept in maintenance of instant snapshots on database servers like Microsoft® SQL Server® 2005. Instant snapshots preserve a static view of a database by storing a pre-modification copy of data when underlying data is updated.
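For illustration only, the copy-on-write idea may be sketched as follows. The class `CowHandle` and its methods are hypothetical, and the sketch shows the general technique on a simple list rather than the mechanism of any particular database server.

```python
class CowHandle:
    """Handle that shares a list with other handles until its first write."""

    def __init__(self, shared):
        self._items = shared      # all handles initially point at the same list
        self._owns_copy = False   # becomes True once a private copy exists

    def read(self, index):
        return self._items[index]

    def write(self, index, value):
        if not self._owns_copy:
            # First modification: create the true private copy.
            self._items = list(self._items)
            self._owns_copy = True
        self._items[index] = value

    def shares_storage_with(self, other):
        return self._items is other._items
```

A brief usage example under the same assumptions:

```python
base = [1, 2, 3]
a = CowHandle(base)
b = CowHandle(base)
b.write(0, 99)   # b now holds a private copy; a and base are unchanged
```

A caller that only reads never triggers the copy, which is the primary advantage noted above.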
Unfortunately, copy-on-write complicates certain implementations of database maintenance software. Also, partial locks still introduce unwanted latencies.