Conventional approaches used in RAID (redundant array of inexpensive drives) storage systems are primarily based on either an XOR function (parity calculations) or a mirror function to obtain redundancy and provide fault-tolerance. In RAID 1 and RAID 10 technologies, the drives are mirrored to obtain redundancy. Every time a new write occurs on the media, the entire data needs to be replicated and written onto both a data drive and a corresponding mirrored drive.
Referring to FIG. 1, a RAID 10 approach is shown. The drive DISK0 is shown mirrored to the drive DISK1. The drive DISK2 is shown mirrored to the drive DISK3. RAID 1 and RAID 10 approaches involve mirroring the complete contents of one drive to another drive. If there are two drives configured as RAID 1, where each drive has a capacity C GB, then the total capacity of the RAID group would be C GB (i.e., not the total capacity of both drives of 2C GB). Hence, the overall storage capacity of a RAID 1 or RAID 10 is 50% of the total capacity of all of the drives in the RAID 1 or RAID 10 configuration.
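The capacity arithmetic above can be illustrated with a minimal sketch. The function name and the 500 GB drive size are hypothetical and chosen only for illustration. The sketch assumes all drives in the group have equal capacity.

```python
def mirrored_usable_capacity_gb(drive_capacity_gb: float, drive_count: int) -> float:
    """Usable capacity of a RAID 1 / RAID 10 group built from equal-size drives."""
    if drive_count < 2 or drive_count % 2 != 0:
        raise ValueError("mirroring needs an even number of drives (at least two)")
    # Every drive is paired with a mirror, so only half of the drives hold unique data.
    return drive_capacity_gb * drive_count / 2


# Two drives of C = 500 GB configured as RAID 1: usable capacity is C, not 2C.
two_drive_raid1 = mirrored_usable_capacity_gb(500, 2)

# Four drives of 500 GB configured as RAID 10, as in FIG. 1.
four_drive_raid10 = mirrored_usable_capacity_gb(500, 4)
```

In both cases the usable capacity is half of the raw capacity of the drives in the group.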
Referring to FIG. 2, a RAID 4 and a RAID 5 approach are shown. A number of drives DISK0, DISK1, DISK2 and DISK3 are shown. In RAID 4 and RAID 5, the data blocks are striped across a number of the drives DISK0-DISK3 of the RAID group. In the RAID 4 configuration shown, the drives DISK0, DISK1 and DISK2 store data. The parity block is stored in a dedicated drive (i.e., shown as the drive DISK3). In a RAID 5, the parity is distributed across all the drives DISK0-DISK3 in the RAID group. In the RAID 5 configuration shown, the drive DISK3 is shown holding data (compared with a RAID 4 where the drive DISK3 only holds parity). A D parity (i.e., a parity of the data block D) is stored on the disk DISK0. A C parity is stored on the disk DISK1. A B parity is stored on the disk DISK2. An A parity is stored on the disk DISK3.
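The difference between dedicated and distributed parity placement can be sketched as follows. The function names are hypothetical. The rotation rule is an assumption inferred from the description (the A parity on the last drive DISK3, moving one drive to the left for each later stripe until the D parity lands on DISK0). Real RAID 5 implementations may use other rotation schemes.

```python
def raid4_parity_drive(stripe: int, drive_count: int) -> int:
    """RAID 4: parity for every stripe sits on the last, dedicated drive."""
    return drive_count - 1


def raid5_parity_drive(stripe: int, drive_count: int) -> int:
    """RAID 5 (assumed rotation): stripe 0 parity on the last drive,
    then one drive to the left for each following stripe."""
    return (drive_count - 1 - stripe) % drive_count


stripe_names = "ABCD"
raid4_layout = {name: f"DISK{raid4_parity_drive(i, 4)}" for i, name in enumerate(stripe_names)}
raid5_layout = {name: f"DISK{raid5_parity_drive(i, 4)}" for i, name in enumerate(stripe_names)}
```

Under this assumed rule, every drive in the RAID 5 group holds the parity of exactly one of the four stripes, whereas in RAID 4 all four parities map to DISK3.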
RAID 4 and RAID 5 approaches use parity generation based on an XOR function. With RAID 4 and RAID 5, every stripe of data is used to generate parity. The parity generated is then stored in a dedicated drive (RAID 4) or distributed across all the drives of the RAID group (RAID 5). RAID 4 and RAID 5 can tolerate only one drive failure at a time without losing data.
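The XOR parity calculation and the single-failure recovery it allows can be sketched as follows. The helper name xor_blocks and the four-byte sample blocks are hypothetical. The sketch assumes all blocks in a stripe have the same length.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length blocks."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks in a stripe must have the same length")
    # zip(*blocks) groups the bytes at the same offset across all blocks.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# One stripe spread over three data drives.
stripe = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(stripe)

# The drive holding stripe[1] fails: XOR the surviving blocks with the parity.
rebuilt = xor_blocks([stripe[0], stripe[2], parity])
```

Here rebuilt equals the lost block stripe[1]. If two blocks of the same stripe are lost, the one remaining XOR equation has two unknowns, which is why only a single drive failure can be tolerated.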
Referring to FIG. 3, a dedicated compressed data drive approach is shown. A number of drives DRIVE 1, DRIVE 2 and DRIVE 3 are shown. A drive DRIVE C is also shown. DRIVE 1, DRIVE 2 and DRIVE 3 store uncompressed data D0-D8. The drive DRIVE C is a dedicated drive to store the compressed version of the data D0-D8 as the data C-D0 through C-D8.
The dedicated compressed data drive method has performance drawbacks. A single dedicated drive DRIVE C stores the compressed data for every RAID group. Every new write is split into multiple stripes. For each stripe, compressed data C-D0 to C-D8 is generated and stored in the dedicated drive DRIVE C. If there are n drives in the RAID group, then the compressed data of n−1 stripes are stored in the dedicated drive DRIVE C. Processing the n−1 stripes introduces delays in completing the writes of the compressed data and creates a bottleneck. A system configured to write both the data stripe and the compressed data synchronously will encounter delays because of the queued up writes on the compressed drive DRIVE C. The dedicated compressed data drive method also has fault-tolerance drawbacks. A failure of the dedicated compressed drive and another drive in the RAID group will result in data loss (logical drive failure).
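The write bottleneck on the dedicated drive can be illustrated by counting write operations per drive. This is a simplified sketch: the function name is hypothetical, drive_count is assumed to include DRIVE C, and the count covers operations only (it ignores that compressed blocks are smaller than the originals).

```python
from collections import Counter


def writes_per_drive(drive_count: int, full_stripe_writes: int) -> Counter:
    """Count block writes per drive in the dedicated compressed drive layout.

    drive_count includes the dedicated drive, so there are
    drive_count - 1 data drives.
    """
    data_drives = drive_count - 1
    load = Counter()
    for _ in range(full_stripe_writes):
        for d in range(1, data_drives + 1):
            load[f"DRIVE {d}"] += 1  # the uncompressed stripe
            load["DRIVE C"] += 1     # its compressed copy
    return load


# Four drives as in FIG. 3 (DRIVE 1-3 plus DRIVE C), writing D0-D8 as three rows.
load = writes_per_drive(drive_count=4, full_stripe_writes=3)
```

With n = 4, DRIVE C receives n−1 = 3 times as many writes as any single data drive, so writes queue on DRIVE C while the data drives sit idle.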
Referring to FIG. 4, a distributed compressed data approach is shown. A number of drives DRIVE 1, DRIVE 2, DRIVE 3, DRIVE 4 are shown. Compressed data is distributed across the drives similar to the way parity is distributed in a RAID 5 approach. Each of the drives DRIVE 1-4 contains a compressed version of a stripe of data from each of the other three drives. The distributed compressed data approach provides better performance than the dedicated compressed data method. However, a failure of more than one drive in the distributed compressed data method will result in data loss.
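The distributed approach and its fault-tolerance limit can be sketched as follows. The placement rule copy_holder is a hypothetical choice, not taken from FIG. 4; it is merely one rule that places a compressed stripe from each of the other three drives on every drive. zlib stands in for the unspecified compression method, and drives are numbered 0-3 for convenience.

```python
import zlib

DRIVE_COUNT = 4
ROWS = DRIVE_COUNT - 1  # one row per "other" drive


def copy_holder(drive: int, row: int) -> int:
    """Assumed placement: drive that stores the compressed copy of block (drive, row)."""
    return (drive + 1 + row % (DRIVE_COUNT - 1)) % DRIVE_COUNT


def build_array():
    """Return the uncompressed blocks and their distributed compressed copies."""
    data = {(d, r): f"block d{d} r{r} ".encode() * 8
            for d in range(DRIVE_COUNT) for r in range(ROWS)}
    # Key: (holder drive, source drive, row) -> compressed bytes.
    compressed = {(copy_holder(d, r), d, r): zlib.compress(block)
                  for (d, r), block in data.items()}
    return data, compressed


def recover(failed, compressed):
    """Rebuild the blocks of the failed drives; report blocks that cannot be rebuilt."""
    recovered, lost = {}, []
    for d in sorted(failed):
        for r in range(ROWS):
            holder = copy_holder(d, r)
            if holder in failed:
                lost.append((d, r))  # the only compressed copy is also gone
            else:
                recovered[(d, r)] = zlib.decompress(compressed[(holder, d, r)])
    return recovered, lost


data, compressed = build_array()
one_recovered, one_lost = recover({1}, compressed)
two_recovered, two_lost = recover({1, 2}, compressed)
```

With one failed drive, every block is rebuilt from a compressed copy on a surviving drive and one_lost is empty. With two failed drives, two_lost is not empty, because some blocks had their only compressed copy on the other failed drive.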