The present invention generally relates to memory devices for use with computers and other processing apparatuses. More particularly, this invention relates to the use of solid-state drives in combination with redundant arrays of independent drives (RAID) configurations.
Mass storage devices such as advanced technology attachment (ATA) drives and small computer system interface (SCSI) drives are rapidly adopting non-volatile memory technology, such as flash memory or another emerging solid-state memory technology including phase change memory (PCM), resistive random access memory (RRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), organic memories, or nanotechnology-based storage media such as carbon nanofiber/nanotube-based substrates. Currently the most common solid-state technology uses NAND flash memory components as inexpensive storage memory, often in a form commonly referred to as a solid-state drive (SSD).
Briefly, flash memory components store information in an array of floating-gate transistors, referred to as cells. The cell of a NAND flash memory component has a top gate (TG) and a floating gate (FG), the latter being sandwiched between the top gate and the channel of the cell. The floating gate is separated from the channel by a layer of tunnel oxide. Data are stored in (written to) a NAND flash cell in the form of a charge on the floating gate which, in turn, defines the channel properties of the NAND flash cell by either augmenting or opposing a charge on the top gate. This charge on the floating gate is achieved by applying a programming voltage to the top gate. Data are erased from a NAND flash cell by applying an erase voltage to the device substrate, which then pulls electrons from the floating gate. The charging (programming) of the floating gate is unidirectional, that is, programming can only inject electrons into the floating gate, but not release them.
NAND flash cells are organized in what are commonly referred to as pages, which in turn are organized in what are referred to as memory blocks (or sectors). Each block is a predetermined section of the NAND flash memory component. A NAND flash memory component allows data to be stored and retrieved on a page-by-page basis and erased on a block-by-block basis. For example, erasing cells is described above as involving the application of an erase voltage to the device substrate, which does not allow isolation of individual cells or even pages, but must be done on a per block basis. As a result, the minimum erasable size is an entire block, and erasing must be done every time a cell is being re-written.
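The program and erase granularity described above can be illustrated with a minimal sketch. It is a toy model rather than a description of any actual flash component: the class name `NandBlock`, its methods, and the page sizes are hypothetical, and it assumes the common convention that an erased cell reads as “1” and programming can only change a bit from “1” to “0.”

```python
class NandBlock:
    """Toy model of one NAND block (hypothetical names, illustration only)."""

    def __init__(self, pages=4, page_size=8):
        # Assumed convention: every byte of an erased page reads as 0xFF.
        self.pages = [bytearray([0xFF] * page_size) for _ in range(pages)]

    def can_program(self, page, data):
        # Programming is unidirectional: bits may go 1 -> 0, never 0 -> 1.
        current = self.pages[page]
        if len(data) != len(current):
            return False
        return all((old & new) == new for old, new in zip(current, data))

    def program(self, page, data):
        # Writes are issued page by page.
        if not self.can_program(page, data):
            raise ValueError("page must be erased (with its whole block) first")
        self.pages[page][:] = data

    def erase(self):
        # Erase cannot isolate a cell or a page; it resets the entire block.
        for page in self.pages:
            page[:] = bytes([0xFF]) * len(page)
```

In this model a page can be rewritten in place only if the new data clears bits without setting any, which is why a general update requires erasing the whole block.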
In stand-alone drives, the above-noted “pre-erase requirement” of the NAND data structure can cause performance degradation. However, with the use of house-keeping functions, such as coalescing and pro-actively erasing blocks containing old or obsolete data (garbage collection) and subsequent reclaiming of the blocks through TRIM functionality, a reasonable status quo can be maintained over most of the life span of a drive. In this context, it is important to note that as many blocks as possible have to be in the “erased state” in order to allow fast write access.
The “pre-erase requirement” of the NAND data structure poses an impediment to the use of NAND flash memory components in redundant arrays of independent drives (or devices), commonly referred to as RAID. A typical implementation of RAID technology employs a RAID controller for combining an array of disk drives into a logical unit where all drives in the array are interdependent. Most implementations of RAID technology employ data striping, which is a known technique for segmenting logically sequential data when storing data to different physical storage devices. The most prevalent forms of true RAID (not counting RAID Level 0 or Level 1) are RAID Level 5 and RAID Level 6. RAID Level 5 typically uses a simple parity scheme based on XOR calculations to generate the checksum of corresponding bit values across the array. In contrast to, for example, RAID Level 4, which uses the same principle and stores the parity data on a dedicated drive, RAID Level 5 uses distributed parity, meaning that the parity values are stored in blocks across all drives belonging to the array using a rotating scheme. As an example, a Level 5 RAID configuration is represented in FIG. 1 as using three data blocks and one parity block for each set of stored data, resulting in four drives (devices).
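The striping and distributed parity just described can be sketched as follows. This is an illustration under stated assumptions, not the layout of FIG. 1 or of any particular RAID controller: the function names are hypothetical, and the rotation rule (parity starting on the last drive and moving one drive to the left with each stripe) is only one of several possible rotating schemes.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_drive(stripe, num_drives):
    # Assumed rotation: last drive for stripe 0, then one drive to the left.
    return (num_drives - 1 - stripe) % num_drives


def write_stripe(stripe, data_blocks, num_drives):
    """Return the blocks of one stripe in drive order, parity included."""
    if len(data_blocks) != num_drives - 1:
        raise ValueError("a stripe holds num_drives - 1 data blocks")
    layout = list(data_blocks)
    layout.insert(parity_drive(stripe, num_drives), xor_blocks(data_blocks))
    return layout


def rebuild(layout, lost_drive):
    """Recover the block of a failed drive by XOR-ing the surviving blocks."""
    survivors = [blk for i, blk in enumerate(layout) if i != lost_drive]
    return xor_blocks(survivors)
```

With four drives, each stripe carries three data blocks and one parity block, and any single missing block, data or parity, is the XOR of the remaining three.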
As known in the art, parity calculations using the XOR operator are widely used to provide fault tolerance in a given set of data. These calculations can be carried out at the system level with a central processing unit (CPU) of a host computer, or by a dedicated microprocessor. As represented in FIG. 2(a), the result of performing the XOR calculation on two different bit values (0 and 1, or 1 and 0) is 1, whereas the result is 0 for two identical bit values (1's or 0's). By extension, any even number of identical bit values (1 or 0) will result in a parity value of 0. In the case of hard disk drives or volatile memory systems (such as SDRAM), this particular feature has no bearing on functionality. However, in the context of NAND-based solid-state drives, and in particular because of their unidirectional programming mode of operation, the XOR result can pose a severe problem. As represented in FIG. 2(b), if a RAID Level 5 configuration contains an even number of drives, then the parity calculation is carried out across an odd number of blocks belonging to a stripe. In contrast, FIG. 2(a) evidences that the parity calculation is carried out across an even number of blocks belonging to a stripe if the RAID Level 5 configuration contains an odd number of drives. For a RAID Level 5 configuration containing NAND flash-based solid-state drives, if the drives erase or else write “1's” to all bits of an even number of data blocks in a stripe, the corresponding parity block is programmed as all “0's.” Because of their unidirectional programming mode, NAND flash-based drives do not allow any further update of the block without selectively erasing the particular block on the drive having the parity data for a given stripe. The same problem occurs in all cases where partial pages are being written, in that typically the part of the page that is “not written to” is programmed to all “1's” or FF byte values.
Consequently, the parity block will have all corresponding entries programmed to “0.” If the unused part of the page is updated on any block, the parity block must also be updated. However, this is not possible unless the entire data set is moved to a fresh block, starting from “FF” values.
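The effect described above can be reproduced with a few lines of arithmetic. The following sketch is illustrative only: the helper names are hypothetical, blocks are shortened to four bytes, and it assumes that an erased or unwritten byte reads as FF and that programming can only change bits from “1” to “0.”

```python
ERASED = 0xFF  # assumed value of an erased or unwritten byte


def parity(blocks):
    """Bytewise XOR parity across the data blocks of one stripe."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def programmable_in_place(old, new):
    # An in-place update is possible only if no bit has to go from 0 to 1.
    return all((o & n) == n for o, n in zip(old, new))


erased = bytes([ERASED] * 4)
written = bytes([0xA5] * 4)  # arbitrary example data

# Even number of data blocks (odd number of drives): parity of erased data is 00.
even_parity = parity([erased, erased])
# Odd number of data blocks (even number of drives): parity of erased data is FF.
odd_parity = parity([erased, erased, erased])

# Writing one data block changes the parity; check whether the old parity
# block could absorb the new value without being erased first.
even_updated = parity([written, erased])
odd_updated = parity([written, erased, erased])
even_needs_erase = not programmable_in_place(even_parity, even_updated)
odd_needs_erase = not programmable_in_place(odd_parity, odd_updated)
```

Under these assumptions the parity block of the even-count stripe is fully programmed before any user data has been written, so its first real update already requires a fresh block, whereas the odd-count stripe leaves the parity block in the erased state.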
The situation described above can cause an excessive number of unnecessary program/erase cycles for blocks used for parity values. Aside from slowing down the write speed, the result can be excessive wear on these drives. Particularly in the case of data updates, the stripe block allocation across the different devices in the array may not change. Therefore, the drive holding the respective parity data will be rewritten with new parity data to new blocks, leaving all previously used blocks programmed to “00,” which constitutes the worst case scenario for wear, programming and erase time.
In view of the problem outlined above, RAID Level 5 and also RAID Level 6 (dual distributed parity) are effectively crippled in terms of implementation with NAND-based solid-state drives. Therefore, new strategies are needed to adapt these and other RAID configurations using parity calculations for use with NAND flash-based solid-state drives, as well as any other solid-state storage media with similar behavioral characteristics.