In the data storage field, many storage controllers contain a mechanism to back up data from a volatile memory to a non-volatile storage when a primary power source fails. The primary power source is typically a main utility power source. Subsequently, when the primary power source returns, the data is restored from the non-volatile storage to the volatile memory. Herein, the term “backup” indicates that availability of data of the volatile memory is maintained by storing the data into the non-volatile storage. The term “restoring data” means that the backed up data of the non-volatile storage is stored into the volatile memory. Typically, the volatile memory retains information or data as long as the power source is applied, but when the power source is off or interrupted the stored data is lost from volatile memory. In contrast, the non-volatile storage does not require a maintained power source to retain data persistently. It is known to selectively back up important data and/or applications that should not be lost upon power failure. These data are also referred to as “persistent data” in the present description.
Many computer systems (e.g., servers or workstations) make use of volatile memory such as Dynamic Random Access Memory (DRAM) to hold operating instructions and data, and include a large amount of non-volatile storage in hard disk drives. Non-volatile storage can also be provided by flash memory. Standard servers are often used as the platform for storage controllers, such as the IBM® System Storage SAN Volume Controller (SVC) product, the IBM Storwize V7000 storage system and the IBM® System Storage DS8000® family of products from IBM Corporation.
In those data storage products, controller software is required to store data into non-volatile storage when the primary power source of the storage product fails. The controller software often requires several gigabytes of such non-volatile storage to preserve the data persistently when the power source has failed; for example, to implement a write cache or to store metadata such as copy services bitmaps.
Non-Volatile Dual-Inline Memory Modules (NV-DIMMs) are emerging as a new technology for storage systems, which address the need to preserve persistent data when the primary power source fails. These Non-Volatile DIMMs are generally intended for use as main memory when installed in computer systems such as servers and workstations, and they behave like DRAM in normal operation. In DIMM technology, random access memory modules are mounted in line on a circuit board, with electrical contacts provided on each side of the modules (e.g., 240 pins). In Non-Volatile DIMMs, the random access memory modules such as DRAM modules are combined with at least one flash memory module, a control chip and a backup power input. For illustrative purposes only, an exemplary structure of a Non-Volatile DIMM is shown in FIG. 1.
Several types of DIMMs are available. Most servers use unbuffered DIMMs (UDIMMs) or Registered DIMMs (RDIMMs), but Fully Buffered Double Data Rate Synchronous DRAM Dual In-line Memory Modules (DDR SDRAM FB-DIMMs) are also available. The standard specifying the FB-DIMM is evolving and can be consulted as its latest versions are released at the following address: “JEDEC Solid State Technology Association industry standards body; www.jedec.org”. In high-end servers, an Advanced Memory Buffer (AMB) can be added to allow more DIMMs to be attached to a CPU. The AMB communicates with a memory controller over a high-speed serial interface to provide high throughput.
To aid understanding of how the invention can be applied to solve a problem with DIMMs, FIGS. 1 and 4 of this specification illustrate an exemplary type of Non-Volatile DIMM having a plurality of DRAM modules. However, it will be appreciated that the invention is not limited to the particular type of DIMM shown in these figures. A known Non-Volatile DIMM structure is described with reference to FIG. 1. A known method of operation for backing up data when power fails and for restoring data when power is re-established is described with reference to FIG. 4.
FIG. 1 shows a Non-Volatile DIMM module 1 including non-volatile storage and volatile memory. As illustrated in FIG. 1, a known Non-Volatile DIMM comprises at least one DRAM chip 5, at least one Flash memory chip 2 and a hardware control module 3 (all located on the DIMM's printed circuit board) and a backup power source 4. FIG. 1 illustrates a plurality of DRAM modules 5, each of which comprises a part of the DIMM's DRAM 15. The at least one Flash memory module 2 comprises a Flash memory 12. The DRAM 15 provided by each of the plurality of DRAM modules 5 is part of the volatile memory and the at least one Flash memory 12 constitutes the non-volatile storage. The DRAM modules 5 are each connected to an internal bus 9 via a respective Bus switch 10 and the bus is connected to the control module 3. Signals can be exchanged between the respective DRAM modules 5 and the control module 3 via the bus 9. The at least one Flash memory module 2 is connected to a hardware Flash controller module 6. The control module 3 of the Non-Volatile DIMM is connected to the Flash controller 6 via a standard internal interface; for example, a Serial Advanced Technology Attachment (SATA) interface 8.
The control module 3 is an initiator that sends read and write commands to the Flash controller 6, and the Flash controller 6 is a target that responds to those commands. During normal operation, the Non-Volatile DIMM 1 is powered by a primary power source (not shown in FIG. 1). In the event that the primary power source for the Non-Volatile DIMM 1 fails (step S41), a power control unit 7 identifies the power failure and configures the bus switches 10 to disconnect the DRAMs from the DIMM pins 13 and connect them to the bus 9. The backup power source 4 supplies sufficient temporary power to the Non-Volatile DIMM 1, under control of the power control unit 7, for transferring data from the DRAM 15 into the Flash memory 12 (step S42). The control module 3 automatically takes control of the DRAM 15 in order to dump its entire content to the Flash memory 12. The control module 3 controls the reading of data from the DRAM 15 and the writing of data to the Flash memory 12 in the Flash memory module 2 (step S43). This automatic data transfer from volatile memory to non-volatile storage occurs when the primary power source fails and before energy from the backup power source 4 is depleted. This non-selective backup process is often called a “Fire Hose Dump”, referring to its transfer of the full contents of the DRAM 15. When the primary power source has failed (step S41), the power provided by the backup power source 4 originates from an external super-capacitor or from a secondary battery connected thereto. A connection between the secondary battery and the backup power source 4 can, for example, be provided by means of a flying lead. In the event that the primary power source is restored (step S44), the control module 3 automatically restores all of the data stored in the Flash memory 12 to the DRAM 15 (step S45), and the backup power source 4 subsequently recharges (step S46).
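For illustration only, the sequence of steps S41 to S46 can be sketched as a small software model. This is a simplified sketch, not the hardware behaviour of any actual product: the class name `NvDimmModel`, its fields and its method names are hypothetical, and the reconnection of the DRAM to the DIMM pins after the restore is an assumption that the description above does not state.

```python
from dataclasses import dataclass, field


@dataclass
class NvDimmModel:
    """Toy model of the module of FIG. 1; every name here is hypothetical."""

    dram: bytearray                      # volatile memory (DRAM 15)
    flash: bytes = b""                   # non-volatile storage (Flash memory 12)
    dram_on_internal_bus: bool = False   # position of the bus switches 10
    log: list = field(default_factory=list)

    def on_primary_power_fail(self) -> None:
        self.log.append("S41")               # power failure identified
        self.dram_on_internal_bus = True     # DRAM cut off from DIMM pins 13
        self.log.append("S42")               # backup source powers the module
        self.flash = bytes(self.dram)        # non-selective dump of all DRAM
        self.log.append("S43")
        # Once backup energy is spent, the volatile contents are lost.
        self.dram = bytearray(len(self.dram))

    def on_primary_power_restore(self) -> None:
        self.log.append("S44")               # primary power is back
        self.dram = bytearray(self.flash)    # restore the whole image
        self.log.append("S45")
        self.dram_on_internal_bus = False    # assumption: DRAM back on the pins
        self.log.append("S46")               # backup source recharges


dimm = NvDimmModel(dram=bytearray(b"write-cache"))
dimm.on_primary_power_fail()
lost = bytes(dimm.dram)                      # DRAM content while unpowered
dimm.on_primary_power_restore()
```

In this model the whole of the DRAM is copied in both directions, which mirrors the non-selective nature of the “Fire Hose Dump”.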
Current Non-Volatile DIMMs make use of the Flash controller module 6 to manage wear levelling and bad blocks in the Flash memory. Wear levelling is a process for extending the lifespan of non-volatile storage such as Flash memory or solid state disks. As each block can tolerate only a finite number of write/erase cycles before becoming unreliable, wear levelling arranges for the write/erase cycles to be distributed evenly among all of the blocks in the storage. Hence, wear levelling is meant to prevent some blocks of storage from being used a significantly greater number of times than others.
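The principle of wear levelling can be illustrated with a minimal sketch in which each write is directed to the usable block that has been erased the fewest times. The class `WearLeveller` and the parameter `max_cycles` are hypothetical, and retiring a block at a fixed cycle limit is a simplifying assumption; the policy actually used by a Flash controller is not described here.

```python
class WearLeveller:
    """Hypothetical allocator: write to the least-erased usable block."""

    def __init__(self, num_blocks: int, max_cycles: int) -> None:
        self.erase_counts = [0] * num_blocks
        self.max_cycles = max_cycles     # assumed endurance limit per block
        self.bad_blocks = set()

    def pick_block(self) -> int:
        candidates = [
            block
            for block in range(len(self.erase_counts))
            if block not in self.bad_blocks
        ]
        if not candidates:
            raise RuntimeError("no usable blocks left")
        # Choosing the minimum erase count spreads the cycles evenly.
        block = min(candidates, key=lambda b: self.erase_counts[b])
        self.erase_counts[block] += 1
        if self.erase_counts[block] >= self.max_cycles:
            self.bad_blocks.add(block)   # retire a worn-out block
        return block


leveller = WearLeveller(num_blocks=4, max_cycles=100)
chosen = [leveller.pick_block() for _ in range(8)]
```

With four blocks and eight writes, each block in this sketch is erased exactly twice, rather than one block absorbing all eight cycles.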
A Non-Volatile DIMM 1 connects to a computer system motherboard (the printed circuit board of a server or a workstation) by means of connection pins that are not shown in any figure. Through these pins, data, address and control signals are exchanged with the motherboard. Arrows 13 in FIG. 1 indicate the bidirectional nature of the data input/output pins that are used to transfer data signals between the DIMM and the computer's motherboard during reading and writing operations.
Non-Volatile DIMMs offer several advantages. The backup power source 4 is connected to provide required power for backup purposes without requiring all components of the computer system (e.g., the server motherboard) to be powered during the “Fire Hose Dump”. This significantly reduces the energy and peak current that the backup power source 4 must provide. Moreover, Non-Volatile DIMMs can be directly installed in place of regular DIMMs on the motherboard of a standard computer system (e.g., server) as they have the same form factor and electrical interface. Hence, the Non-Volatile DIMM provides non-volatile memory without making any other change to the system hardware. This contrasts with other means of adding non-volatile memory to a computer system. In general, adding non-volatile storage often involves adding custom hardware to the computer system. In addition, DIMMs provide high-speed, high-density memory for servers, workstations, networking equipment, desktop computers and the like.
Typical Non-Volatile DIMM products are able to restore (step S45) all of the data from non-volatile storage to volatile memory when power is re-established. However, there are some problems with known products. Firstly, the contents of the Non-Volatile DIMM can be over-written during the running of some standard start-up test routines referred to as Power-On Self Test (POST), which take place immediately after the server is powered on and are used to check whether the computer's RAM, storage drives, peripheral devices and other hardware components are working properly. Secondly, a standard operating system such as Linux sees the DRAM of the Non-Volatile DIMM 1 as ordinary memory that it can allocate to any process (just like any other memory), and it could therefore overlay some of the data contained therein with other information, such as program code, during a reboot. Since standard POST diagnostic test routines are handled by the system's BIOS (Basic Input/Output System), one possible solution to the above-mentioned problems is to modify the operating system and BIOS of the computer system to prevent over-writing of the storage of the Non-Volatile DIMM whilst the computer system reboots. One problem that could remain is that the step of restoring the contents of Flash memory to DRAM may over-write important data or program code stored in the DRAM 15, if the operating system has used any of the original memory locations before the restore operation. This may cause a system crash.
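The two over-writing hazards described above can be made concrete with a short sketch. It is purely illustrative: the function `restore_all`, the byte patterns and the 16-byte memory size are hypothetical, and the POST memory test is modelled simply as a pattern fill.

```python
def restore_all(flash_image: bytes) -> bytearray:
    """Non-selective restore: return a DRAM image copied from flash."""
    return bytearray(flash_image)


flash_image = b"PERSISTENT-DATA!"            # 16 bytes saved at power failure

# Hazard 1: the data is restored, then a start-up memory test writes
# its own pattern over the restored DRAM.
dram = restore_all(flash_image)
dram[:] = b"\xAA" * len(dram)                # hypothetical POST test pattern
data_survived_post = bytes(dram) == flash_image

# Hazard 2: the operating system has already placed program code in the
# DRAM, then the restore copies the whole flash image over it.
dram = bytearray(len(flash_image))
dram[0:4] = b"CODE"                          # memory allocated after reboot
dram[:] = restore_all(flash_image)           # restore ignores current usage
code_survived_restore = bytes(dram[0:4]) == b"CODE"
```

In both cases the comparison evaluates to false: in the first the persistent data is lost, and in the second the program code placed by the operating system is lost.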