Data storage systems are arrangements of hardware and software that typically include multiple storage processors coupled to arrays of non-volatile data storage devices, such as magnetic disk drives, electronic flash drives, and/or optical drives. The storage processors service host I/O operations received from host machines. The received I/O operations specify one or more storage objects (e.g. logical disks or “LUNs”) that are to be written, read, created, or deleted. The storage processors run software that manages the received I/O operations and performs various data processing tasks to organize and secure the host data that is received from the host machines and stored on the non-volatile data storage devices.
Some existing data storage systems have provided RAID (Redundant Array of Independent Disks) technology. RAID is a data storage virtualization/protection technology that combines multiple physical drives into a single logical unit to provide data redundancy and/or performance improvement. Data may be distributed across the drives in one of several ways, referred to as RAID levels, depending on the required levels of redundancy and performance. Some RAID levels employ data striping (“striping”) to improve performance. In general, striping involves segmenting received host data into logically sequential blocks (e.g. sequential blocks in an address space of a logical storage object), and then storing data written to consecutive blocks in the logical sequence of blocks onto different drives. A series of consecutive logically sequential data blocks that are stored across different drives may be referred to as a RAID “stripe”. By spreading data segments across multiple drives that can be accessed concurrently, total data throughput can be increased.
Some RAID levels employ a “parity” error protection scheme to provide fault tolerance. When parity protection is used, one or more additional parity blocks are maintained in each stripe. For example, a parity block for a stripe may be maintained that is the result of performing a bitwise exclusive “OR” (XOR) operation across the data blocks of the stripe. When the storage for a data block in the stripe fails, e.g. due to a drive failure, the lost data block can be recovered by performing an XOR operation across the remaining data blocks and the parity block.
One example of a RAID configuration that uses block level striping with distributed parity error protection is 4D+1P (“four data plus one parity”) RAID-5. In 4D+1P RAID-5, each stripe consists of 4 data blocks and a block of parity information. In a traditional 4D+1P RAID-5 disk group, at least five storage disks are used to store the data and parity information, so that each one of the four data blocks and the parity information for each stripe can be stored on a different disk. Further in traditional RAID, a spare disk is also kept available to handle disk failures. In the event that one of the disks fails, the data stored on the failed disk can be rebuilt onto the spare disk by performing XOR operations on the remaining data blocks and the parity information on a per-stripe basis. 4D+1P RAID-5 is generally considered to be effective in preventing data loss in the case of single disk failures. However, data may be lost when two or more disks fail concurrently.
Other RAID configurations may provide data protection even in the event that multiple disks fail concurrently. For example, 4D+2P RAID-6 provides striping with double distributed parity information that is provided on a per-stripe basis. The double parity information maintained by 4D+2P RAID-6 enables data protection for up to a maximum of two concurrently failing drives.