Solid-state storage devices (SSDs) store digital data persistently in an integrated circuit. Data stores that use integrated-circuit (IC) based storage media have comparatively lower access times and latencies than data storage devices that rely on magnetic properties of the storage medium to save data. Despite the performance benefits provided by SSDs, their relatively higher cost per byte of storage capacity has led to their deployment in cache memory portions of a data store. A cache memory is a smaller, faster memory that stores copies of data from frequently used main memory locations. As long as most memory accesses are to cached memory locations, the average latency of memory accesses is closer to the cache latency than to the latency of main memory.
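The relationship between hit ratio and average latency can be illustrated with a simple weighted average. The following is a minimal sketch, assuming a single-level cache in which a hit costs only the cache latency and a miss costs only the main memory latency; the function name and the example figures are hypothetical and are not taken from any particular device.

```python
def average_latency(hit_ratio: float, cache_latency: float, main_latency: float) -> float:
    """Expected latency per access for a simplified single-level cache model.

    Assumption: a hit costs cache_latency, a miss costs main_latency.
    Real systems may also pay a cache lookup cost on a miss.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_latency + (1.0 - hit_ratio) * main_latency


# Illustrative numbers only: latencies in arbitrary time units.
for ratio in (0.5, 0.9, 0.99):
    print(ratio, average_latency(ratio, cache_latency=10.0, main_latency=100.0))
```

As the hit ratio approaches 1, the result approaches the cache latency, which is the behavior described above.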
Whether deployed in a cache, in a primary data store or a secondary data store, IC based storage media require novel approaches for managing input/output operations and providing data protection while minimizing overhead and communicating data.
A conventional storage array or disk array is a logical data store that includes multiple disk drives or similar persistent storage units. A storage array can use parallel data input/output operations to store and retrieve data more efficiently than when a single disk drive or storage unit alone is used to implement a data store. A storage array also can provide redundancy to promote reliability, as in the case of a Redundant Array of Inexpensive Disks (RAID) system. In general, RAID systems simultaneously use two or more hard disk drives, referred to herein as physical disk drives (PDDs), to achieve greater levels of performance, reliability and/or larger data volume sizes. The acronym “RAID” is generally used to describe data storage schemes that divide and replicate data among multiple PDDs. In RAID systems, one or more PDDs are set up as a RAID virtual disk drive (VDD). In a RAID VDD, data might be distributed across multiple PDDs, but the VDD is seen by the user and by the operating system of the computer as a single data store. The VDD is “virtual” in that storage space in the VDD maps to the physical storage space in the PDDs that make up the VDD. Accordingly, a meta-data mapping table is used to translate an incoming VDD identifier and address location into a PDD identifier and address location.
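The translation performed by such a mapping table can be sketched as follows. This is a minimal illustration, assuming the table is organized as a list of contiguous extents per VDD; the names Extent, MappingTable and translate are hypothetical, and an actual controller may organize its meta-data quite differently.

```python
from bisect import bisect_right
from typing import NamedTuple


class Extent(NamedTuple):
    """Hypothetical table entry: a contiguous run of VDD blocks on one PDD."""
    vdd_start: int  # first VDD block address covered by this extent
    length: int     # number of blocks in the extent
    pdd_id: int     # PDD that holds the blocks
    pdd_start: int  # address of the first block on that PDD


class MappingTable:
    def __init__(self, extents_by_vdd):
        # Sort each VDD's extents by starting address to allow binary search.
        self._extents = {vdd: sorted(ext) for vdd, ext in extents_by_vdd.items()}
        self._starts = {vdd: [e.vdd_start for e in ext]
                        for vdd, ext in self._extents.items()}

    def translate(self, vdd_id: int, vdd_block: int):
        """Return (pdd_id, pdd_block) for a VDD identifier and block address."""
        index = bisect_right(self._starts[vdd_id], vdd_block) - 1
        if index < 0:
            raise KeyError("address precedes the first mapped extent")
        extent = self._extents[vdd_id][index]
        offset = vdd_block - extent.vdd_start
        if offset >= extent.length:
            raise KeyError("address falls in an unmapped gap")
        return extent.pdd_id, extent.pdd_start + offset


# VDD 0 is built from 100 blocks on PDD 0 followed by 100 blocks on PDD 1.
table = MappingTable({0: [Extent(0, 100, 0, 0), Extent(100, 100, 1, 0)]})
print(table.translate(0, 150))
```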
Although a variety of different RAID system designs exist, all pursue one or both of two design goals, namely: (1) to increase data reliability and/or (2) to increase input/output (I/O) performance. RAID has seven basic levels corresponding to different system designs. The seven basic RAID levels, typically referred to as RAID levels 0-6, are as follows. RAID level 0 uses striping to achieve increased I/O performance. The term “striped” means that logically sequential data, such as a single data file, is fragmented when it is written and the fragments are distributed over multiple PDDs. Striping improves performance and provides additional storage capacity. The fragments are written to their respective PDDs simultaneously on the same sector. This allows smaller sections of the entire chunk of data to be read off the PDDs in parallel, providing improved I/O bandwidth. The larger the number of PDDs in the RAID system, the higher the bandwidth of the system, but also the greater the risk of data loss. Parity is not used in RAID level 0 systems, which means that RAID level 0 systems have no fault tolerance. Consequently, when any PDD fails, the entire data storage system fails.
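Striping without parity can be sketched as follows. This is an illustration only, assuming fragments (strips) of a fixed size are assigned to the PDDs in rotating order, which is a common layout but is not mandated by the description above; the function names stripe and unstripe and the parameter strip_size are hypothetical.

```python
def stripe(data: bytes, num_pdds: int, strip_size: int) -> list[bytes]:
    """Split data into strips and assign them to PDDs in rotating order."""
    pdds = [bytearray() for _ in range(num_pdds)]
    for start in range(0, len(data), strip_size):
        strip_number = start // strip_size
        pdds[strip_number % num_pdds] += data[start:start + strip_size]
    return [bytes(p) for p in pdds]


def unstripe(pdds: list[bytes], strip_size: int) -> bytes:
    """Reassemble the original data by reading strips in the same order."""
    total = sum(len(p) for p in pdds)
    offsets = [0] * len(pdds)
    out = bytearray()
    strip_number = 0
    while len(out) < total:
        pdd = strip_number % len(pdds)
        strip = pdds[pdd][offsets[pdd]:offsets[pdd] + strip_size]
        if not strip:
            raise ValueError("PDD contents are not a valid striped layout")
        out += strip
        offsets[pdd] += strip_size
        strip_number += 1
    return bytes(out)


disks = stripe(b"ABCDEFGHIJ", num_pdds=3, strip_size=2)
print(disks)
print(unstripe(disks, strip_size=2))
```

Because no redundant information is stored, losing the contents of any one entry in the list makes the original data unrecoverable, which mirrors the lack of fault tolerance noted above.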
In RAID level 1 systems, mirroring without parity is used. Mirroring corresponds to the replication of stored data onto separate PDDs in real time to ensure that the data is continuously available. RAID level 1 systems provide fault tolerance from disk errors because all but one of the PDDs can fail without causing the system to fail. RAID level 1 systems have increased read performance when used with multi-threaded operating systems, but also have a small reduction in write performance as each data write operation must be replicated across the redundant or mirrored drives.
In RAID level 2 systems, redundancy is used and PDDs are synchronized and striped in very small stripes, often in single bytes/words. Redundancy is achieved through the use of Hamming codes, which are calculated across bits on PDDs and stored on multiple parity disks. If a PDD fails, the parity bits can be used to reconstruct the data. Therefore, RAID level 2 systems provide fault tolerance. In essence, failure of a single PDD does not result in failure of the system.
RAID level 3 systems use byte-level striping in combination with interleaved parity bits and a dedicated parity disk. RAID level 3 systems require the use of at least three PDDs. The use of byte-level striping and redundancy results in improved performance and provides the system with fault tolerance. However, use of the dedicated parity disk creates a bottleneck for writing data due to the fact that every write requires updating of the parity data. RAID level 3 systems can continue to operate without parity and no performance penalty is suffered in the event that the parity disk fails.
RAID level 4 is essentially identical to RAID level 3 except that RAID level 4 systems employ block-level striping instead of byte-level or word-level striping. Because each stripe is relatively large, in some situations a single file can be stored in a block. Each PDD operates independently and many different I/O requests can be handled in parallel. Error detection is achieved by using block-level parity bit interleaving. The interleaved parity bits are stored in a separate single parity disk.
RAID level 5 uses striping in combination with distributed parity. In order to implement distributed parity, all but one of the PDDs must be present for the system to operate. Failure of any one of the PDDs necessitates replacement of the PDD. However, failure of a single one of the PDDs does not cause the system to fail. Upon failure of one of the PDDs, subsequent read requests can be satisfied by determining the entirety of the previously stored information from the distributed parity such that the PDD failure is masked from the end user. If a second PDD fails, the storage system will suffer a loss of data. Thus, before a first failed PDD is replaced and the appropriate data is completely reconstructed, the storage system is vulnerable to potential data loss.
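The masking of a single PDD failure can be illustrated with a short sketch. It assumes that the parity strip of each stripe is the bytewise XOR of the data strips, which is the usual choice for single-parity schemes but is not stated explicitly above; the function names are hypothetical, and the rotation rule in parity_pdd is only one of several possible layouts.

```python
from functools import reduce


def xor_strips(strips: list[bytes]) -> bytes:
    """Bytewise XOR of equally sized strips."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


def parity_pdd(stripe_index: int, num_pdds: int) -> int:
    """One possible rotation: the parity strip moves to a different PDD each stripe."""
    return (num_pdds - 1 - stripe_index) % num_pdds


def reconstruct(surviving_strips: list[bytes]) -> bytes:
    """Rebuild the strip of a failed PDD from all surviving strips of the stripe."""
    return xor_strips(surviving_strips)


# One stripe across four PDDs: three data strips plus one parity strip.
d0, d1, d2 = b"\x11\x22", b"\x33\x44", b"\x55\x66"
parity = xor_strips([d0, d1, d2])

# The PDD holding d1 fails; its contents are recovered from the survivors.
recovered = reconstruct([d0, d2, parity])
print(recovered == d1)
```

With only one parity strip per stripe, a second missing strip leaves the XOR equation with two unknowns, which is why a second failure before the rebuild completes results in data loss.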
However, the additional fault tolerance carries a performance penalty: each data write operation requires at least two read operations and two write operations. The data strip to be changed and the parity strip must both be read, and then the new data and the updated parity must be written to the data strip and the parity strip, respectively.
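The two reads and two writes can be made concrete with a small simulation. This is a sketch under the assumption of XOR parity, in which the updated parity equals the old parity XOR the old data XOR the new data; the class CountingPDD and the function small_write are hypothetical names introduced only to count I/O operations.

```python
class CountingPDD:
    """Hypothetical in-memory PDD that counts its read and write operations."""

    def __init__(self, num_strips: int, strip_size: int):
        self.strips = [bytes(strip_size) for _ in range(num_strips)]
        self.reads = 0
        self.writes = 0

    def read(self, index: int) -> bytes:
        self.reads += 1
        return self.strips[index]

    def write(self, index: int, data: bytes) -> None:
        self.writes += 1
        self.strips[index] = data


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(data_pdd: CountingPDD, parity_pdd: CountingPDD,
                index: int, new_data: bytes) -> None:
    """Update one data strip and its parity strip (read-modify-write)."""
    old_data = data_pdd.read(index)      # read 1
    old_parity = parity_pdd.read(index)  # read 2
    new_parity = xor(xor(old_parity, old_data), new_data)
    data_pdd.write(index, new_data)      # write 1
    parity_pdd.write(index, new_parity)  # write 2


data0, data1, parity = (CountingPDD(1, 4) for _ in range(3))
small_write(data0, parity, 0, b"\x0f" * 4)
print(data0.reads + parity.reads, data0.writes + parity.writes)
```

Note that the other data strips in the stripe are not touched, so the cost is the same regardless of how many PDDs make up the stripe.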
U.S. Pat. No. 4,092,732 describes a method of allocating data sequentially and holding or buffering data until the data to be written matches the data storage capacity of a full stripe of the data store. Other conventional storage systems apply full stripe write I/O operations in arrays of PDDs and in arrays of solid state storage elements. While these methods minimize the total number of I/O operations to the PDDs and the solid state storage elements in the respective arrays, the complexity and costs associated with the storage and management of the temporary data make this approach suboptimal.
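The general idea of buffering until a full stripe is available can be sketched as follows. This is a generic illustration and not the method of the cited patent or of any particular product; it assumes XOR parity, and the class name FullStripeBuffer and its parameters are hypothetical.

```python
from functools import reduce


class FullStripeBuffer:
    """Hold incoming data until a full stripe can be written with its parity."""

    def __init__(self, data_pdds: int, strip_size: int):
        self.data_pdds = data_pdds
        self.strip_size = strip_size
        self.pending = bytearray()  # temporary data that must be stored and managed

    def write(self, data: bytes) -> list:
        """Buffer data; return a list of (data_strips, parity_strip) ready to write."""
        self.pending += data
        stripe_bytes = self.data_pdds * self.strip_size
        flushed = []
        while len(self.pending) >= stripe_bytes:
            chunk = bytes(self.pending[:stripe_bytes])
            del self.pending[:stripe_bytes]
            strips = [chunk[i:i + self.strip_size]
                      for i in range(0, stripe_bytes, self.strip_size)]
            # Parity is computed from the buffered data alone, so no reads are needed.
            parity = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*strips))
            flushed.append((strips, parity))
        return flushed


buf = FullStripeBuffer(data_pdds=2, strip_size=2)
print(buf.write(b"\x01\x02"))      # less than a full stripe: nothing is written yet
print(buf.write(b"\x03\x04\x05"))  # one full stripe is emitted, one byte stays buffered
```

The pending buffer in this sketch stands in for the temporary data whose storage and management give rise to the complexity and cost noted above.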
RAID level 6 uses striping in combination with dual distributed parity. RAID level 6 systems require the use of at least four PDDs, with two of the PDDs being used for storing the distributed parity bits. The system can continue to operate even if two PDDs fail. Dual parity becomes increasingly important in systems in which each VDD is made up of a large number of PDDs. Unlike RAID-based storage systems that use single parity, which are vulnerable to data loss until the appropriate data on a failed PDD is rebuilt, in RAID level 6 systems, the use of dual parity allows a VDD having a failed PDD to be rebuilt without risking loss of data in the event that a second PDD fails before completion of the rebuild of the data on the first failed PDD.
The foregoing summary of RAID levels and variants is not exhaustive. Many variations of the seven basic RAID levels described above exist. For example, the attributes of RAID levels 0 and 1 may be combined to obtain a RAID level known as RAID level 0+1 or RAID 01. RAID 01 is a mirror of stripes. In other systems, the order of RAID level 1 and RAID level 0 is reversed, forming a stripe of mirrors known as RAID level 1+0 or RAID 10. Many other variations include other nested combinations of one or more of the seven basic RAID levels.