A checkpoint is a point in virtual machine (VM) execution where the VM state is consistent. It occurs at instruction boundaries with no in-flight device input/output operations (IOs). The ability to save and restore the state of a running VM at a checkpoint, also referred to as checkpointing, is a key feature provided by virtualization. U.S. Pat. No. 6,795,966, incorporated by reference herein in its entirety, describes the checkpoint process for a VM in detail.
For large VMs, both the checkpoint save process and the checkpoint restore process can take a very long time. In order to reduce the downtime during checkpoint save and speed up checkpoint restore, a checkpoint process may be carried out in a manner that allows the VM state to be saved or restored “lazily.” Lazy checkpointing minimizes the time that the VM is not running by either writing out or reading in its state while the VM is executing instructions. During a lazy checkpoint save process, write traces are installed on each page of VM memory. When the VM writes to a traced page, the write trace is removed and the original contents of the page are written out to a checkpoint file before the page is modified with the write. During a lazy checkpoint restore process, the VM is allowed to start running even before its entire state has been loaded into memory from the checkpoint file. As the VM executes instructions that generate accesses to memory, the pages that are not yet in memory are faulted in from the checkpoint file.
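By way of illustration only, the lazy save and lazy restore behavior described above may be sketched as follows. This is a toy model under stated assumptions, not the implementation of any particular hypervisor: guest memory and the checkpoint file are modeled as Python dictionaries, and the names LazyCheckpointVM, LazyRestoredVM, begin_lazy_save, and background_save_step are hypothetical. The background step that saves untouched pages is likewise an assumption about how the remaining state might be written out while the VM runs.

```python
class LazyCheckpointVM:
    """Toy model of lazy checkpoint save (all names are hypothetical)."""

    def __init__(self, pages):
        self.memory = dict(enumerate(pages))  # page number -> page contents
        self.write_traced = set()             # pages with a write trace installed
        self.checkpoint = {}                  # stand-in for the checkpoint file

    def begin_lazy_save(self):
        # Install a write trace on every page; the VM keeps running afterwards.
        self.checkpoint = {}
        self.write_traced = set(self.memory)

    def write(self, page_no, data):
        # A write to a traced page fires the trace: remove it and save the
        # original contents before the page is modified.
        if page_no in self.write_traced:
            self.write_traced.discard(page_no)
            self.checkpoint[page_no] = self.memory[page_no]
        self.memory[page_no] = data

    def background_save_step(self):
        # Assumed background work: save one page the VM has not yet written.
        # Returns False once every page has been saved.
        if not self.write_traced:
            return False
        page_no = min(self.write_traced)
        self.write_traced.discard(page_no)
        self.checkpoint[page_no] = self.memory[page_no]
        return True


class LazyRestoredVM:
    """Toy model of lazy checkpoint restore (all names are hypothetical)."""

    def __init__(self, checkpoint):
        self.checkpoint = checkpoint  # stand-in for the checkpoint file
        self.memory = {}              # starts empty; the VM runs immediately
        self.faults = 0

    def read(self, page_no):
        # A page that is not yet resident is faulted in from the checkpoint.
        if page_no not in self.memory:
            self.faults += 1
            self.memory[page_no] = self.checkpoint[page_no]
        return self.memory[page_no]
```

In this sketch, a page that the VM overwrites during the save is captured with its pre-write contents, so the checkpoint file reflects the state at the checkpoint rather than any later state, and a restored VM loads only the pages it actually touches.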
Some checkpointing techniques either track accesses by the VM or use existing data structures, such as page tables, to discern likely temporal locality. Based on this information, the checkpointed state of the VM is reorganized to preserve this temporal locality. Mappings are also maintained in storage to allow the checkpointed state to be restored to the proper locations in memory. Other checkpointing techniques employ compression to minimize the number of disk blocks that need to be written out or read in during the checkpoint process.
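As a purely illustrative sketch of the two ideas above, the following example lays out checkpointed pages in the order in which they were accessed, keeps a mapping from each file slot back to its page number, and compresses each page. The function names save_reorganized and restore_reorganized, the dictionary-based page model, and the choice of zlib for compression are all assumptions made for the example and are not taken from any referenced technique.

```python
import zlib


def save_reorganized(pages, access_order):
    """Lay out pages by first access and compress each one.

    pages: dict of page number -> bytes.
    access_order: observed sequence of page numbers (may repeat or omit pages).
    Returns (slots, mapping), where slots is the list of compressed pages in
    file order and mapping is slot index -> original page number.
    """
    seen = set()
    layout = []
    for page_no in access_order:
        if page_no in pages and page_no not in seen:
            seen.add(page_no)
            layout.append(page_no)
    # Pages that were never accessed go last, in page-number order.
    layout.extend(p for p in sorted(pages) if p not in seen)

    slots = [zlib.compress(pages[p]) for p in layout]
    mapping = dict(enumerate(layout))
    return slots, mapping


def restore_reorganized(slots, mapping):
    """Decompress each slot and place it back at its original page number."""
    return {mapping[i]: zlib.decompress(blob) for i, blob in enumerate(slots)}
```

Placing recently or frequently accessed pages next to one another is intended to let a restore read them with fewer, more sequential disk accesses, while the mapping ensures that each page still returns to its proper location in memory.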