Failure of a computer, as well as application programs, executing on a computer can often result in the loss of significant amounts of data and intermediate calculations. The cause of failure can be either hardware or software related, but in either instance the consequences can be expensive, particularly when data manipulations are interrupted in mid-stream. In the case of large software applications, a failure might require an extensive effort to regenerate the status of the application's state prior to the failure.
Generally, checkpoint and restoration techniques periodically save the process state during normal execution, and thereafter restore the saved state to a backup process following a failure. In this manner, the amount of lost work is minimized to progress made by the application process since the restored checkpoint.
Traditionally, computers have stored the checkpoint data in either system memory coupled to the computer's processor, or on other input/output (I/O) storage devices such as magnetic tape or disk. I/O storage devices can be attached to a system through an I/O bus such as a PCI (originally named Peripheral Component Interconnect), or through a network such as Fiber Channel, Infiniband, ServerNet, or Ethernet. I/O storage devices are typically slow, with access times of more than one millisecond. They utilize special I/O protocols such as small computer systems interface (SCSI) protocol or transmission control protocol/internet protocol (TCP/IP), and typically operate as block exchange devices (e.g., data is read or written in fixed size blocks of data). A feature of these types of storage I/O devices is that they are persistent such that when they lose power or are re-started they retain the information stored on them previously. In addition, I/O storage devices can be accessed from multiple processors through shared I/O networks, even after some processors have failed.
As used herein, the term “persistent” refers to a computer memory storage device that can withstand a power reset without loss of the contents in memory. Persistent memory devices have been used to store data for starting or restarting software applications. In simple systems, persistent memory devices are static and not modified as the software executes. The initial state of the software environment is stored in persistent memory. In the event of a power failure to the computer or some other failure, the software restarts its execution from the initial state. One problem with this approach is that all intermediate calculations will have to be recomputed. This can be particularly onerous if large amounts of user data must be reloaded during this process. If some or all of the user data is no longer available, it may not be possible to reconstruct the pre-failure state.
System memory is generally connected to a processor through a system bus where such memory is relatively fast with guaranteed access times measured in tens of nanoseconds. Moreover, system memory can be directly accessed with byte-level granularity. System memory, however, is normally volatile such that its contents are lost if power is lost or if a system embodying such memory is restarted. Also, system memory is usually within the same fault domain as a processor such that if a processor fails, the attached memory also fails and may no longer be accessed. Metadata, which describes the layout of memory, is also lost when power is lost or when the system embodying such memory is restarted.
Prior art systems have used battery-backed dynamic random access memory (BBDRAM), solid-state disks, and network-attached volatile memory. Prior BBDRAM, for example, may have some performance advantages over true persistent memory. It is not, however, globally accessible. Moreover, BBDRAM that lies within the same fault domain as an attached CPU will be rendered inaccessible in the event of a CPU failure or operating system crash. Accordingly, BBDRAM is often used in situations where all system memory is persistent so that the system may be restarted quickly after a power failure or reboot. BBDRAM is still volatile during long power outages such that alternate means must be provided to store its contents before batteries drain. Importantly, this use of BBDRAM is very restrictive and not amenable for use in network-attached persistent memory applications, for example.
Battery-backed solid-state disks (BBSSD) have also been proposed for other implementations. These BBSSDs provide persistent memory, but functionally they emulate a disk drive. An important disadvantage of this approach is the additional latency associated with access to these devices through I/O adapters. This latency is inherent in the block-oriented and file-oriented storage models used by disks and, in turn, BBSSDs, which do not bypass the host computer's operating system. While it is possible to modify solid-state disks to eliminate some shortcomings, inherent latency cannot be eliminated because performance is limited by the I/O protocols and their associated device drivers. As with BBDRAM, additional technologies are required for providing the checkpoint state of an application program in a failed domain to a backup copy of the application program running in an operational domain.