FIG. 1 is a schematic diagram showing the main components of a solid state drive (SSD) in accordance with the prior art. The SSD includes a non-volatile memory in the form of a multi-level cell (MLC) flash memory array 105, a flash controller 110, dynamic random access memory (DRAM) 120, and a host interface 130. The host interface 130 connects a host computer (not shown in FIG. 1) to the flash controller 110. The flash controller 110 interfaces to the flash memory array 105 and to a smaller amount of DRAM 120. The DRAM 120 may be integrated on the same chip as the flash controller 110 or exist as a separate memory device or devices.
The DRAM 120 may be used to buffer user data for both read and write commands from the host to the flash memory controller 110. It may also be used to store system data such as L2P (Logical to Physical) address tables, an operational log (where the sequence of events processed by the controller can be saved for later inspection), and statistics concerning the read and write activity and SMART (Self-Monitoring, Analysis and Reporting Technology) data logging. This system data is commonly referred to collectively as ‘metadata’.
For read commands, a portion of the DRAM 120 may be assigned to act as a read cache, where frequently accessed user data may be stored in the cache after reading from the memory array and then subsequent reads for the same user data can be serviced more quickly from the cache. The data in the cache is only a secondary copy of the data held in non-volatile flash memory. Consequently, in the event of a power failure no action need be taken to save or protect the data in the read cache, as the primary copy is always safe in the non-volatile flash memory array.
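The read-cache behavior described above can be illustrated with a minimal sketch. The least-recently-used eviction policy, the class name ReadCache, and the use of a dictionary to stand in for the flash memory array are all assumptions made for illustration; the text does not specify how cache entries are chosen or evicted.

```python
from collections import OrderedDict

class ReadCache:
    """Toy DRAM read cache in front of a (simulated) flash array.

    LRU eviction is an assumption; the source does not name a policy.
    """

    def __init__(self, capacity, flash):
        self.capacity = capacity      # maximum number of cached entries
        self.flash = flash            # dict standing in for the flash array
        self.cache = OrderedDict()    # secondary copies only
        self.flash_reads = 0          # counts slow reads from flash

    def read(self, lba):
        if lba in self.cache:
            # Cache hit: serve from DRAM and mark as most recently used.
            self.cache.move_to_end(lba)
            return self.cache[lba]
        # Cache miss: read the primary copy from flash, then cache it.
        self.flash_reads += 1
        data = self.flash[lba]
        self.cache[lba] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used
        return data

flash = {0: b"a", 1: b"b", 2: b"c"}
rc = ReadCache(capacity=2, flash=flash)
for lba in (0, 0, 1, 0, 2, 1):
    rc.read(lba)
```

Because the cache holds only secondary copies, discarding the whole cache dictionary at power loss in this sketch loses nothing: every entry can be re-read from the simulated flash array.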
For write commands, the DRAM 120 may be used as a buffer to act as a staging point for data being sent between the flash memory controller 110 and the flash memory array 105. Typically, writing directly to flash memory is much slower than writing to DRAM. Data written to NAND flash memory must be written in units of flash pages regardless of the host write size. Additionally, the data written to the flash memory must be written in complete flash blocks, also known as flash erase blocks, where a block includes many individual pages.
Ideally, all the flash pages in one flash block should be written at the same time, or at least within a short period and certainly with no long intervening period between the writing of one incomplete block and the completion of the writing of the remaining pages in that block. The reason is that the data in incomplete flash blocks may suffer corruption and errors due to migration of charge across the physical boundary between memory cells in pages that have been written (programmed) and adjacent memory cells in those pages that have yet to be written (i.e., un-programmed and still in the erased state).
Therefore, writing first to a DRAM 120 serving as a buffer confers several advantages, including being able to respond much more quickly to the host to confirm a status that the data has been written, while the actual writing of the flash memory may take place in the background. The writing of data to flash memory can also be made more efficient by consolidating smaller writes into single page writes or even complete erase block writes composed of many pages in order to avoid problems due to charge migration across the programmed/un-programmed page boundary.
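As a rough illustration of the buffering and consolidation just described, the following sketch acknowledges each host write as soon as it is in the buffer and programs only complete pages. The names WriteBuffer, host_write and flushed_pages, and the 4096-byte page size, are hypothetical; a real controller would also group pages into complete erase blocks, which is omitted here.

```python
class WriteBuffer:
    """Toy DRAM staging buffer that emits only complete flash pages."""

    def __init__(self, page_size):
        self.page_size = page_size    # assumed flash page size in bytes
        self.pending = bytearray()    # acknowledged data not yet in flash
        self.flushed_pages = []       # stands in for the flash array

    def host_write(self, data: bytes) -> str:
        # Buffer the data and acknowledge at once; in this toy model the
        # flush runs inline, whereas a real drive would do it in background.
        self.pending.extend(data)
        self._flush_full_pages()
        return "ACK"

    def _flush_full_pages(self):
        # Consolidate small host writes into whole-page programs.
        while len(self.pending) >= self.page_size:
            page = bytes(self.pending[:self.page_size])
            del self.pending[:self.page_size]
            self.flushed_pages.append(page)

buf = WriteBuffer(page_size=4096)
for _ in range(5):
    buf.host_write(b"x" * 1024)   # five 1 KiB host writes
```

After the five writes in this example, one full page has been programmed and 1024 bytes remain in the buffer. That remainder has been acknowledged to the host but exists only in volatile memory, which is exactly the exposure discussed in the paragraphs that follow.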
The effect of a sudden power failure on flash memory has been documented in the literature. For example, the paper entitled “Understanding the Impact of Power Loss on Flash Memory”, by Tseng, Grupp and Swanson, DAC '11 Proceedings of the 48th Design Automation Conference (2011), describes how writing activity to the flash memory that is stopped in mid-flow may leave the data partially-written in an indeterminate state. In particular, the paper describes how a power failure can “corrupt data already present in the flash device” and “negatively impact the integrity of future data written to the device.”
One way to address some of the problems caused by a power failure is to include a temporary backup power supply to support a graceful shutdown. The power of an SSD may be provided in different ways, but typically comes from a host. When the host power shuts down, the SSD thus also powers down. It is common to provide a temporary backup power supply 140 for an SSD. A power fail detection circuit 150 may be included to detect a power supply failure. The temporary backup power supply 140 may, for example, be a small battery or a super-capacitor. The temporary backup supply may be a separate component or may be packaged with the SSD in a single unit.
However, even when a temporary backup supply is available, the backup supply is typically designed to have only a limited capability sufficient to support a graceful shutdown. Thus, after a power failure is detected, steps still have to be taken to try to gracefully complete any essential pending operations and to save any essential user data and system metadata stored in volatile DRAM 120 to the non-volatile flash memory array 105. Two of these essential activities are covered in “The Art of SSD Power Fail Protection,” white paper by WD, a Western Digital company (2013), which mentions the importance of saving to non-volatile storage both any ‘in-flight’ data associated with write-back caching and the logical to physical mapping table. The contents of “The Art of SSD Power Fail Protection” are hereby incorporated by reference.
In the event of a power failure, therefore, the first priority is to save any in-flight data that has been acknowledged to the host as having been written. Due to the use of the DRAM 120 as a buffer, there can be situations where a power failure occurs after data is acknowledged to the host, but before all of the associated data in the DRAM buffer 120 has been written to the flash memory array 105. Priority is thus given to saving data that has been acknowledged back to the host but has only been buffered in the volatile DRAM and not yet written to the flash memory. Data which has not yet been acknowledged need not necessarily be saved, as the host will interpret the associated write command as having failed and will take appropriate action. The industry's “best practice” for MLC flash memory is that after a power failure, the write data in the write data buffer 122 is written in complete upper/lower (most significant and least significant bit) page pairs and in complete erase block units.
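The selection rule in the paragraph above (save what has been acknowledged but not yet programmed, and let unacknowledged writes fail) can be sketched as a simple filter. The record type BufferedWrite and its fields are hypothetical bookkeeping introduced only for this illustration; the padding to complete page pairs and erase blocks is not modeled.

```python
from dataclasses import dataclass

@dataclass
class BufferedWrite:
    lba: int            # logical block address (hypothetical field)
    length: int         # bytes held in the DRAM write buffer
    acknowledged: bool  # host has been told the write completed
    in_flash: bool      # data already programmed to the flash array

def writes_to_save(buffer):
    """Return the entries that must reach flash during the hold-up time.

    Unacknowledged writes are skipped: the host will treat them as failed.
    """
    return [w for w in buffer if w.acknowledged and not w.in_flash]

buffer = [
    BufferedWrite(lba=0, length=4096, acknowledged=True, in_flash=True),
    BufferedWrite(lba=8, length=4096, acknowledged=True, in_flash=False),
    BufferedWrite(lba=16, length=8192, acknowledged=False, in_flash=False),
]
must_save = writes_to_save(buffer)
bytes_to_save = sum(w.length for w in must_save)
```

In this example only the write at LBA 8 has to be saved, so 4096 bytes of user data must be programmed on backup power, in addition to the system data.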
The system data 124, or metadata, which includes the logical to physical mapping table, also needs to be saved to have a graceful recovery. In accordance with industry best practice, the system data 124 is stored in MLC mode in complete upper/lower page pairs and in complete erase block units.
The backup power supply 140 comes at a price in terms of hardware size and cost. The backup power supply needs to be scaled to have the required “hold up time,” which is the time the backup power supply will hold up the operating voltages of the SSD for it to function. Once the voltages fall below a critical value, the drive will shut down.
In order to ensure that both the write-cached data and the metadata are saved, the time for both operations to complete must be determined and this defines the minimum hold up time. The minimum hold up time, in turn, will determine the minimum amount of battery capacity or the super-capacitor size of the temporary backup power supply 140.
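The sizing relationship above can be made concrete with a back-of-the-envelope calculation. All numeric values below (data sizes, flash write bandwidth, drive power, and voltages) are hypothetical and chosen only to make the arithmetic easy to follow. The capacitor estimate uses the standard relation that a capacitor discharging from V_start to V_cutoff releases 0.5 * C * (V_start^2 - V_cutoff^2) joules, and it ignores converter losses and design margin.

```python
def min_hold_up_time(write_bytes, metadata_bytes, flash_bw_bytes_per_s):
    """Seconds needed to program the buffered write data and the metadata."""
    return (write_bytes + metadata_bytes) / flash_bw_bytes_per_s

def min_capacitance(power_w, hold_up_s, v_start, v_cutoff):
    """Farads needed to supply power_w for hold_up_s between two voltages."""
    energy_j = power_w * hold_up_s
    return 2 * energy_j / (v_start**2 - v_cutoff**2)

MIB = 2**20
# Assumed: 8 MiB write-cached data, 32 MiB metadata, 400 MiB/s to flash.
t = min_hold_up_time(8 * MIB, 32 * MIB, 400 * MIB)
# Assumed: 5 W draw, capacitor charged to 5 V, drive shuts down at 3 V.
c = min_capacitance(power_w=5.0, hold_up_s=t, v_start=5.0, v_cutoff=3.0)
```

With these assumed figures the hold up time is 0.1 s and the ideal capacitance is 0.0625 F. Halving the amount of data to be written halves both values, which is the motivation for reducing the data that must be saved after a power failure.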
The extra cost of providing batteries or super-capacitors as a backup power supply 140 in an SSD is not inconsiderable. Additionally, there is a desire in many applications to reduce the size of the SSD, including associated components packaged with the SSD. Therefore, there is a need to minimize the time required to write the essential data to non-volatile memory after a power supply failure of an SSD. Additionally, there is a need to minimize the amount of data that has to be written after a power supply failure.