Data storage systems are arrangements of hardware and software that include storage processors coupled to arrays of non-volatile storage devices, such as magnetic disk drives, electronic flash drives, and/or optical drives. The storage processors service storage requests arriving from host machines (“hosts”), which specify files or other data elements to be written, read, created, deleted, and so forth. Software running on the storage processors manages incoming storage requests and performs various data processing tasks to organize and secure the data elements stored on the non-volatile storage devices.
Data storage systems commonly provide disk drives in arrangements called “RAID groups” (RAID is an acronym for Redundant Array of Independent Disks). Common RAID configurations mirror data between disk drives and/or provide parity segments to establish redundancy. Mirroring and parity enable a RAID group to suffer a failure of a disk drive without experiencing data loss. For example, a system using RAID mirroring may access data of a failed disk drive from a surviving mirror drive. A system using parity may compute missing data based on a combination of data from surviving drives, e.g., by performing exclusive-OR operations on corresponding elements of the surviving drives.
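By way of illustration only, the parity-based recovery described above may be sketched as follows. The block contents, the number of drives, and the helper name `xor_blocks` are assumptions chosen for the example and do not reflect any particular RAID implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise exclusive-OR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks, one per drive, plus one parity block.
data = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"]
parity = xor_blocks(data)

# Simulate the loss of the second drive and recompute its block from the
# surviving data blocks and the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

Because each byte XORed with itself is zero, combining the parity block with the surviving data blocks leaves only the missing block.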
When a disk drive in a RAID group fails, RAID protocols may initiate a repair operation. For example, a system may swap out the failed drive with a spare and begin copying data to the spare drive. In cases where parity is used, the system may compute data for the spare drive from remaining drives and store the computed data in the spare drive. These actions have the effect of restoring redundancy to the RAID group, such that it can suffer another disk drive failure without experiencing data loss.
Unfortunately, conventional approaches for repairing RAID groups after disk drive failures can be burdensome. For example, upon failure of a drive, the rebuild requires reading data from the remaining drives and writing the reconstructed data to a spare drive, which may have limited write performance. The speed of the rebuild is therefore bottlenecked by the maximum write throughput of the spare drive. Such bottlenecks can result in long rebuild times, during which the fault tolerance of the group is degraded, increasing the risk of permanent data loss if an additional drive should fail before the rebuild completes. Further, the rebuild time increases as the capacity of the individual physical drives in the RAID group grows, adding to the significance of the problem.
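The relationship between drive capacity and rebuild time can be approximated with simple arithmetic. The following is a rough sketch only: the capacities and the sustained write rate of 200 MB/s are assumed figures, and the function `rebuild_hours` is hypothetical. It gives a lower bound, since it ignores the time spent reading from surviving drives and any competing host I/O.

```python
def rebuild_hours(capacity_tb, write_mb_per_s):
    """Lower bound on rebuild time when a single spare absorbs all writes."""
    capacity_mb = capacity_tb * 1_000_000  # decimal terabytes to megabytes
    return capacity_mb / write_mb_per_s / 3600  # seconds to hours


# Assumed sustained write rate of the spare drive, in MB/s.
SPARE_WRITE_RATE = 200

for capacity_tb in (4, 8, 16):
    hours = rebuild_hours(capacity_tb, SPARE_WRITE_RATE)
    print(f"{capacity_tb} TB drive: at least {hours:.1f} hours")
```

Under these assumptions the bound scales linearly with drive capacity, so doubling the capacity doubles the minimum window during which the RAID group operates with degraded fault tolerance.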
To address some of these problems, many data storage systems have begun to adopt mapped RAID techniques. In mapped RAID, the data storage systems distribute data across RAID extents, which are made up of disk extents such that the disk extents of each RAID extent are provided by different physical storage drives. Mapped RAID can therefore allocate spare disk extents distributed across a large pool of drives in the data storage system rather than reserving one or more entire physical drives as spares. Consequently, rebuilding data from a failed drive involves writing data to spare disk extents distributed across multiple drives. Because rebuilding mapped RAID data involves writing to multiple drives in parallel (rather than to a single drive), the speed of the rebuild process is no longer limited by the minimum time required to write all the rebuilt data to a single drive.
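One possible way to picture the arrangement described above is sketched below. The pool size, the RAID extent width, the allocation policy (prefer the drives with the most free disk extents), and the function names `build_raid_extents` and `rebuild_targets` are all assumptions made for illustration; actual mapped RAID implementations may allocate extents differently.

```python
def build_raid_extents(num_drives, extents_per_drive, width, num_raid_extents):
    """Form RAID extents, each from disk extents on `width` distinct drives."""
    free = {d: list(range(extents_per_drive)) for d in range(num_drives)}
    raid_extents = []
    for _ in range(num_raid_extents):
        # Prefer the drives with the most free disk extents to keep usage even.
        chosen = sorted(free, key=lambda d: (-len(free[d]), d))[:width]
        if len(chosen) < width or any(not free[d] for d in chosen):
            raise ValueError("not enough free disk extents")
        raid_extents.append([(d, free[d].pop()) for d in chosen])
    return raid_extents, free


def rebuild_targets(raid_extents, free, failed):
    """Pick one spare disk extent for each RAID extent touching `failed`."""
    targets = []
    for extent in raid_extents:
        drives = {d for d, _ in extent}
        if failed not in drives:
            continue
        # The spare must not sit on the failed drive or on a drive that
        # already contributes a disk extent to this RAID extent.
        candidates = [d for d in free
                      if d != failed and d not in drives and free[d]]
        spare = max(candidates, key=lambda d: (len(free[d]), -d))
        targets.append((spare, free[spare].pop()))
    return targets


# Hypothetical pool: 8 drives, 10 disk extents each, 4+1 RAID extents.
raid_extents, free = build_raid_extents(8, 10, 5, 12)
targets = rebuild_targets(raid_extents, free, failed=0)
target_drives = {d for d, _ in targets}
```

In this sketch the unallocated disk extents left on every drive serve as spares, so the rebuilt data lands on several drives (`target_drives`) rather than on one dedicated spare.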
It should, however, be appreciated that mapped RAID also has some problems. For example, as discussed above, mapped RAID provides an extent-based, fully mapped RAID solution in which all the extents are dynamically mapped and maintained across an arbitrary number of devices. Thus, the RAID stripes in this model are spread out across all the drives in the system. If there is a large number of drives in the system, the data is spread out over a large failure domain. If a single drive fails and a rebuild is needed, the failure domain is all the drives in the system. There is, therefore, a need to address at least this problem.
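The size of the failure domain can be illustrated by comparing a fixed-group layout with a fully mapped one. This is a simplified sketch under stated assumptions: the nine-drive pool, the width of three, and the use of every possible drive combination as a stand-in for a fully spread mapped layout are hypothetical, as is the function `failure_domain`.

```python
from itertools import combinations


def failure_domain(raid_extents, failed):
    """Return the drives sharing at least one RAID extent with `failed`."""
    domain = set()
    for drives in raid_extents:
        if failed in drives:
            domain.update(drives)
    domain.discard(failed)
    return domain


num_drives, width = 9, 3

# Fixed groups: drives (0,1,2), (3,4,5), (6,7,8) never share stripes.
fixed_groups = [tuple(range(s, s + width)) for s in range(0, num_drives, width)]

# Stand-in for a fully mapped layout: every combination of `width` drives
# hosts a RAID extent (an assumption that exaggerates the spread).
fully_mapped = list(combinations(range(num_drives), width))

fixed_domain = failure_domain(fixed_groups, failed=0)
mapped_domain = failure_domain(fully_mapped, failed=0)
```

With fixed groups, a failure of drive 0 involves only the other members of its group, whereas in the fully mapped stand-in every other drive in the pool shares a RAID extent with the failed drive.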