The present invention relates to preserving the contents of non-volatile storage and, in particular, to preserving the contents of a non-volatile storage during a system failure.
Prior art computer systems typically include a volatile memory for the storage and manipulation of information by an operating system and various software applications, and a non-volatile memory for mass storage of data and computer programs. When software applications behave in unexpected ways, they can cause the operating system to fail in catastrophic ways, referred to colloquially as a “system crash.” When a system crashes, there is no guarantee that the information stored in volatile memory can be salvaged. Moreover, there is a significant chance that the computer system will be interrupted while writing information to non-volatile mass storage, damaging or corrupting the contents of the non-volatile memory.
Typically, the user remedies the system crash by resetting the system. In the resulting boot cycle the operating system typically loses the ability to reference the information contained in the volatile memory or actually initializes the volatile memory, changing or destroying its contents. Similarly, a reboot operation typically destroys the information the computer would need to verify or repair the contents of the non-volatile memory.
Prior art solutions addressing the loss of the contents of volatile memory have taken various approaches. One approach requires a user to direct applications manually to save the contents of volatile memory to a non-volatile memory when significant amounts of information have been processed in volatile memory. An incremental improvement over this approach takes the form of modifications to the software applications themselves, so that they save the contents of volatile memory to a non-volatile memory when certain criteria are met. For example, the word processing program Microsoft Word™ from Microsoft Corporation, Redmond, Wash. has an option that automatically saves the contents of documents upon the elapse of a time period selected by the user.
These prior art systems have several failings. First, a failure in the operating system may prevent the functioning of any application-level safeguards. Second, safeguards that rely on regular human intervention are subject to human failings, such as when humans forget to invoke them. Third, safeguards that attempt to substitute application-administered criteria for human judgment and invocation fail in that they cannot guarantee that critical information would be saved when a human user would have chosen to save it.
A second set of prior art solutions to this problem has focused on hardware modifications to preserve the contents of volatile memory during a crash. Some prior art systems are arranged such that every read or write request to an operating system is simultaneously routed to a non-volatile memory. Such a system guarantees a record of memory contents that can be reconstructed during a boot cycle, but suffers from slowness during normal operation, because each transaction is conducted twice, and slowness during a boot cycle, because the operating system must locate the non-volatile record of transactions and reload them. Other prior art systems attempt the same techniques and suffer from the same problems, but reduce the magnitude of the delays by greater selectivity in the transactions actually recorded, or recording transactions in a way that is more amenable to reconstruction. Other prior art systems relying on hardware modification use non-volatile memories, such as electrically erasable programmable read-only memories (EEPROMs), Flash ROM, or battery-backed random-access memory. These systems have several drawbacks, including higher prices than normal volatile memories and the requirement of additional hardware. For example, Flash ROM often requires a charge pump to achieve the higher voltages needed to write to the memory, and suffers a shorter life than normal volatile RAM because of this process. Battery-backed RAMs rely on batteries that are subject to catastrophic failure or charge depletion.
Prior art solutions addressing the integrity of the contents of non-volatile memory have taken several forms. One solution involves equipping the computer with an array of inexpensive mass storage devices (a RAID array, where RAID is an acronym for “redundant array of inexpensive disks”). The computer processes each write transaction to non-volatile storage in parallel, writing it to each device in the array. If the computer fails, then the non-volatile storage device with the most accurate set of contents available can be used as a master, copying all its contents to the other devices in the array (RAID level 1). Another solution only stores one copy of the transaction information across multiple mass storage devices, but also stores parity information concerning the transaction data (RAID level 5).
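By way of illustration only, the parity approach mentioned above can be sketched as follows. The sketch assumes byte-wise XOR parity over equal-length blocks, which is the common choice for RAID level 5; the function name `xor_blocks` and the sample data are hypothetical and are not taken from any particular prior art system.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, each imagined to live on a different storage device.
data_blocks = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"]

# The parity block is stored alongside the data (on yet another device).
parity = xor_blocks(data_blocks)

# Simulate the loss of the second device: XOR of the surviving data
# blocks and the parity block reproduces the missing block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
```

Because XOR is its own inverse, any single missing block can be recomputed from the remaining blocks and the parity, which is why only one copy of the data need be stored.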
A computer whose information is stored in a volatile memory resistant to loss or corruption resulting from system or application crashes would avoid the problems associated with the loss and recreation of data. A computer that used this persistent volatile memory to store write transactions directed to non-volatile storage would similarly avoid wholesale duplication. The elimination of time-consuming data reconstruction would help make possible a fault-tolerant computer that offered continuous availability. The present invention provides those benefits.
The present invention relates to methods and apparatus for providing improved maintenance of consistent, redundant mass storage images. One object of the invention is to store the contents of write transactions to non-volatile storage in a region of volatile memory whose contents are resistant to loss or corruption from system or application crashes and the ensuing reboot cycle, where they are used to repair and complete the contents of the non-volatile storage devices. Another object of the invention is to avoid wholesale copying of contents between non-volatile storage devices.
In one embodiment, one feature of the invention is the presence of non-volatile storage and persistent volatile memory, where the persistent volatile memory is used to store write transactions posted to non-volatile storage. Another feature of the invention is an intermediary program, such as a device driver, that sits between the operating system and non-volatile storage: it processes write requests from the operating system directed to non-volatile storage, stores their contents in persistent volatile memory, and then completes the write to non-volatile storage. Yet another feature of the invention is that the contents of the persistent memory region are resistant to initialization or modification during a boot cycle. Another feature of the invention is that the intermediary program processes write requests atomically, preventing computer applications from subsequently loading the results of incomplete or partial transactions from the persistent memory region.
In another embodiment, one feature of the invention is a computer program that receives write transactions directed to non-volatile storage by the operating system, stores the contents of the write transaction in persistent volatile memory, and then completes the write to non-volatile storage. Another feature of the invention is the marking of transactions in persistent volatile memory as “complete” or “in progress” for use during the reboot and recovery process.
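The write path described above may be sketched, purely for illustration, as follows. The sketch models the persistent volatile memory as an in-process list and the non-volatile storage as a dictionary keyed by block number; the names `PersistentLog`, `handle_write`, `IN_PROGRESS`, and `COMPLETE` are hypothetical, and a real intermediary program such as a device driver would operate on actual memory regions and storage devices.

```python
IN_PROGRESS = "in progress"
COMPLETE = "complete"


class PersistentLog:
    """Stand-in for the persistent volatile memory region (assumption)."""

    def __init__(self):
        self.entries = []

    def record(self, block_no, data):
        # A new transaction is always recorded as "in progress" first.
        entry = {"block": block_no, "data": data, "state": IN_PROGRESS}
        self.entries.append(entry)
        return entry


def handle_write(log, disk, block_no, data):
    """Process one write request in the order described in the text."""
    entry = log.record(block_no, data)  # 1. store in persistent memory
    disk[block_no] = data               # 2. complete the write to storage
    entry["state"] = COMPLETE           # 3. mark the transaction complete
    return entry


log = PersistentLog()
disk = {}  # stand-in for non-volatile storage (assumption)
handle_write(log, disk, 7, b"hello")
```

If a failure occurs between steps 1 and 3, the entry remains marked “in progress” in persistent memory, which is the condition the recovery process looks for.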
In yet another embodiment, the invention is a method providing improved recovery from system failures. One feature is that the method receives a write transaction from the operating system, stores the contents of the write transaction in persistent volatile memory, and then stores the contents of the write transaction in non-volatile storage. Another feature of the invention is the marking of transactions in persistent volatile memory as “complete” or “in progress” for use during the reboot and recovery process. Yet another feature of the invention is the selection of those write transactions in persistent volatile memory marked “in progress”, copying the contents of the uncompleted write transactions from the persistent volatile memory to the non-volatile storage, and then marking the uncompleted write transactions as completed after the successful completion of the copy to non-volatile storage.
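The recovery step of the method may be sketched, again for illustration only, as follows. The sketch assumes that each transaction surviving in persistent volatile memory is represented as a dictionary with `block`, `data`, and `state` fields and that non-volatile storage is a dictionary keyed by block number; these representations and the function name `recover` are hypothetical.

```python
def recover(log_entries, disk):
    """Replay uncompleted transactions after a reboot.

    Returns the number of transactions that were replayed.
    """
    replayed = 0
    for entry in log_entries:
        if entry["state"] == "in progress":
            # Copy the saved contents to non-volatile storage ...
            disk[entry["block"]] = entry["data"]
            # ... and mark the transaction complete only after the copy.
            entry["state"] = "complete"
            replayed += 1
    return replayed


# State as it might be found after a crash: block 2 was interrupted
# mid-write, so storage holds stale or partial contents for it.
log_entries = [
    {"block": 1, "data": b"A", "state": "complete"},
    {"block": 2, "data": b"B", "state": "in progress"},
]
disk = {1: b"A", 2: b"??"}

count = recover(log_entries, disk)
```

Only the transactions marked “in progress” are touched, so recovery work is proportional to the number of interrupted writes rather than to the size of the storage device.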