Non-volatile random access storage devices store data in pages that can be read or written as commanded by a host system's application. These storage devices provide non-volatile memory that persists across system failures such as a power failure in the host device.
Often an application needs to update multiple pages of data as part of a single compound operation. A failure during a write operation may leave such a compound operation only partially completed. When the application is restarted, it needs to recover a consistent state.
FIG. 1 illustrates a general approach to implementing a transactional storage device 200. An application 150 addressing the device is permitted to issue operations to write multiple pages and to read single pages. Each operation is considered a transaction. The application does not issue overlapping operations on the same page, while the transactional storage device ensures that every operation will either complete fully or, if interrupted, appear never to have been started.
A transactional storage device 200 may be implemented using a combination of data structures stored in volatile memory, data structures stored in non-volatile memory 100, and methods for updating the data structures by reading and writing individual pages on the storage device during normal operation, recovery, and initialization. The initialization method 110 formats the data structures on the ordinary storage device when the transactional storage device is first placed into service. The recovery method 120 rebuilds the data structures in volatile memory and possibly repairs some storage pages on the storage device before resuming normal operation after a failure or other stoppage.
As illustrated in FIG. 1, a storage page 125 includes metadata 124 in addition to the page data 122. The metadata is typically used to store an identification label and an error correction code for the data and metadata in the storage page. Common sizes for typical storage devices are 512 to 4096 bytes of page data and 8 to 128 bytes of metadata.
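By way of illustration only, a storage page of the kind described above might be sketched in Python as follows. The 4096-byte data size, the 8-byte label, the 4-byte CRC-32 field, and the names StoragePage, encode, and decode are assumptions made for this sketch and are not taken from any particular device. A CRC-32 only detects errors; it stands in here for the error correction code that an actual device would use.

```python
import struct
import zlib
from dataclasses import dataclass

PAGE_DATA_SIZE = 4096     # assumed; the text cites 512 to 4096 bytes
METADATA_FORMAT = "<QI"   # hypothetical layout: 8-byte label, 4-byte CRC-32
METADATA_SIZE = struct.calcsize(METADATA_FORMAT)  # 12 bytes, no padding


@dataclass
class StoragePage:
    """Page data plus metadata (identification label and check code)."""
    data: bytes
    label: int

    def _checksum(self) -> int:
        # The check code covers both the data and the label metadata.
        return zlib.crc32(self.data + struct.pack("<Q", self.label))

    def encode(self) -> bytes:
        """Serialize to the raw bytes that would be written to the device."""
        if len(self.data) != PAGE_DATA_SIZE:
            raise ValueError("page data has the wrong size")
        metadata = struct.pack(METADATA_FORMAT, self.label, self._checksum())
        return self.data + metadata

    @classmethod
    def decode(cls, raw: bytes) -> "StoragePage":
        """Parse raw bytes and reject a page whose check code does not match."""
        if len(raw) != PAGE_DATA_SIZE + METADATA_SIZE:
            raise ValueError("raw page has the wrong size")
        data = raw[:PAGE_DATA_SIZE]
        label, stored = struct.unpack(METADATA_FORMAT, raw[PAGE_DATA_SIZE:])
        page = cls(data, label)
        if page._checksum() != stored:
            raise ValueError("checksum mismatch")
        return page
```

In this sketch a page that fails its check on decode is simply rejected; a device with a true error correction code could instead attempt to repair it.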
Transactional write operations may be implemented by means of a remap table 130 and a log of intentions and commits. When writing new data to a page, the old data is never overwritten because a failure might cause both the old data and the new data of the page to be lost. Instead, the new data is written to a free storage page 125 with metadata 124 indicating the page number and a version number. The version number serves to identify which version of the page is most recent. A remap table 130 in volatile memory keeps track of the latest storage page and version number for each page. To handle transactional write operations of multiple pages, the new data for each page is written to the storage device as an intention record. Once all the writes of intention records have completed successfully, a commit record is written to the storage device. Typically the intention records and commit records are organized into a log.
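The scheme described above may be illustrated with a minimal Python sketch. An in-memory list stands in for the log on the storage device, and the remap table is an ordinary dictionary. The class and method names (TransactionalStore, write_pages, read_page, recover), the transaction identifier, and the fail_before_commit flag used to simulate an interruption are all hypothetical and introduced only for illustration; the recovery logic shown is one plausible reading of the approach, not a definitive implementation.

```python
from dataclasses import dataclass


@dataclass
class Intention:
    txn_id: int      # hypothetical identifier tying a record to its commit
    page_no: int
    version: int
    data: bytes


@dataclass
class Commit:
    txn_id: int


class TransactionalStore:
    def __init__(self):
        self.log = []      # stands in for the log on non-volatile storage
        self.remap = {}    # volatile: page_no -> (log index, version)
        self.next_txn = 0

    def write_pages(self, pages, fail_before_commit=False):
        """Write several pages as one transaction; old data is never overwritten."""
        txn_id = self.next_txn
        self.next_txn += 1
        staged = {}
        for page_no, data in pages.items():
            version = self.remap.get(page_no, (None, 0))[1] + 1
            self.log.append(Intention(txn_id, page_no, version, data))
            staged[page_no] = (len(self.log) - 1, version)
        if fail_before_commit:
            raise RuntimeError("simulated failure before commit")
        # The commit record is written only after every intention record.
        self.log.append(Commit(txn_id))
        self.remap.update(staged)

    def read_page(self, page_no):
        index, _version = self.remap[page_no]
        return self.log[index].data

    def recover(self):
        """Rebuild the remap table from the log, ignoring uncommitted intentions."""
        committed = {r.txn_id for r in self.log if isinstance(r, Commit)}
        self.remap = {}
        for index, rec in enumerate(self.log):
            if isinstance(rec, Intention) and rec.txn_id in committed:
                current = self.remap.get(rec.page_no)
                if current is None or rec.version > current[1]:
                    self.remap[rec.page_no] = (index, rec.version)
        self.next_txn = max((r.txn_id for r in self.log), default=-1) + 1
```

For example, if a first call to write_pages commits pages 1 and 2 and a second call is interrupted before its commit record is written, read_page continues to return the first transaction's data both before and after recover is run, so the interrupted operation appears never to have been started.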
It is to be understood that this background section is intended to provide useful background for understanding the technology disclosed here and, as such, may include ideas, concepts, or recognitions that were not part of what was known or appreciated by those skilled in the pertinent art prior to the corresponding invention dates of the subject matter disclosed herein.