1. Field of the Invention
The present invention relates to the field of mass storage devices. More particularly, the present invention relates to disk arrays that can tolerate multiple dependent disk failures or arbitrary double disk failures without losing any stored data.
2. Description of the Related Art
Disks are often organized into arrays for performance and manageability purposes. To prevent a failure of any disk within an array from causing data to be lost, the data is stored in a redundant fashion across the disks of an array so that a subset of the disks is sufficient for deriving all of the data that has been stored in the array. To date, most systems are designed to tolerate a single disk failure. The rationale for designing for a single disk failure is that disk failures should be relatively rare so that when a disk fails, there is enough time to recover from the failure before another failure occurs.
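By way of illustration only, the single-failure redundancy described above can be sketched with bytewise XOR parity, as used in RAID 5-type schemes. The block contents, the three-data-disk layout, and the helper name `xor_blocks` are assumptions chosen for the example and are not taken from any particular system.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical contents of one block on each of three data disks.
data = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"]

# The parity block is stored on a fourth disk.
parity = xor_blocks(data)

# Simulate the failure of data disk 1 and rebuild its block
# from the surviving data blocks plus the parity block.
lost = 1
survivors = [blk for i, blk in enumerate(data) if i != lost]
rebuilt = xor_blocks(survivors + [parity])
```

With a single parity block, the rebuilt block equals the lost block only when exactly one disk is missing. If a second disk fails before the rebuild completes, the remaining blocks no longer determine the lost data.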
Field data suggests, however, that disk failures may be dependent. That is, a second disk failure within a storage system or a disk array is more likely to occur soon after the first failure. Such dependency could result simply from the fact that the disks within an array tend to come from the same batch of disks, are subjected to the same physical and electrical conditions, handle the same workload and commands from the same controller, etc. Additionally, the act of a disk failing within an array could trigger changes in the system that stress the remaining disks. Even the act of replacing the failed disk could increase the chances of something else going wrong in the array. For instance, the wrong disk could be replaced.
There are several trends in the industry that make single-failure fault-tolerance less and less sufficient. Firstly, more and more disks are being grouped into an array. Accordingly, the chances of having multiple failures within an array are increasing. Secondly, disk capacity is increasing faster than data rate. Consequently, the time to rebuild a disk is generally increasing, thereby lengthening the window during which the array could be vulnerable to a subsequent disk failure. Thirdly, disk vendors are continuing to aggressively increase areal density. Historically, this has caused a reduction in disk reliability, a trend that can be expected to continue in the future. Fourthly, the cost associated with a multiple-disk failure is increasing. Techniques like virtualization, which can spread a host Logical Unit Number (LUN) across many disk arrays, increase the adverse impact of a multiple-disk failure because many more host LUNs could be impacted.
Conventional techniques for recovering from multiple disk failures in a disk array can be broadly classified into double-parity, double-mirroring and RAID 51-type schemes. Double-parity-type schemes extend RAID 5-type schemes (which use single parity) to use double parity. One disadvantage of a double-parity-type scheme is an inflexibility in the number of disks that are supported, for example, a requirement that the number of disks be prime. See, for example, L. Xu et al., “X-Code: MDS array codes with optimal encoding,” IEEE Transactions on Information Theory, 45, 1, pp. 272–276, 1999. Another disadvantage of double-parity-type schemes is that a highly complex update procedure may be required in which each update of a block may require several other blocks to be updated. See, for example, M. Blaum et al., “The EVENODD code and its generalization: An efficient scheme for tolerating multiple disk failures in RAID architectures,” High Performance Mass Storage and Parallel I/O: Technologies and Applications (H. Jin et al. eds.), Ch. 14, pp. 187–208, New York, N.Y.: IEEE Computer Society Press and Wiley, 2001. Yet another disadvantage of double-parity-type schemes is that parity encoding and decoding complexity may be high. See, for example, P. M. Chen et al., “RAID: High-performance, reliable secondary storage,” ACM Computing Surveys, 26, 2, pp. 145–185, June 1994. Each write request incurs at least three disk read operations and three disk write operations. Double-parity-type schemes can tolerate at most two disk failures.
In a double-mirroring-type scheme, data is mirrored twice so that there are three copies of the data. Each write request incurs three disk write operations to update each copy. Double-mirror schemes use three times the storage of an unprotected array.
A RAID 51-type scheme uses RAID 5 to protect data against a single disk failure and mirrors the RAID 5 array to protect against up to three arbitrary disk failures. Each write request incurs two disk read operations and four disk write operations.
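For comparison only, the per-write costs stated above for the three conventional schemes can be tabulated in a short sketch. The class name `Scheme` and its field names are hypothetical. The figure of zero reads for double mirroring is an assumption (only its three writes are stated above), and the double-parity figures are lower bounds.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scheme:
    name: str
    reads_per_write: int      # disk reads incurred by one write request
    writes_per_write: int     # disk writes incurred by one write request
    failures_tolerated: int   # arbitrary disk failures survived

    @property
    def ios_per_write(self) -> int:
        return self.reads_per_write + self.writes_per_write

SCHEMES = [
    # "At least" three reads and three writes: a lower bound.
    Scheme("double parity", 3, 3, 2),
    # Zero reads is an assumption; three copies survive two failures.
    Scheme("double mirroring", 0, 3, 2),
    Scheme("RAID 51", 2, 4, 3),
]

for s in SCHEMES:
    print(f"{s.name:17} {s.ios_per_write} I/Os per write, "
          f"tolerates {s.failures_tolerated} failures")
```

Under these assumptions, double mirroring has the lowest I/O count per write but uses three times the storage of an unprotected array, while the double-parity and RAID 51-type schemes each incur six or more disk operations per write request.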
U.S. Pat. No. 5,258,984 to Menon et al., entitled “Method and means for distributed sparing in DASD Arrays,” discloses the even distribution of spare space among all the disks in a disk array for improved performance.
What is needed is an efficient technique for storing data on an array of disks such that the data is still available even when any two disks of the array fail, or when more than two dependent disks fail.