The present invention relates to a method and apparatus for a storage controller for an array of disks. In particular, the present invention relates to a method and apparatus for a RAID6 controller.
An array of disks can be configured from independent disk drives into a single logical unit for the purposes of data redundancy and/or performance. RAID5 is an example of a disk array configuration that can survive a single drive failure because one independent parity value is calculated for each element of the user's data that is stored. RAID6 is an example of a disk array configuration that can survive a double drive failure because two independent parity values are calculated for each element of the user's data that is stored. Erasure coding is an example of a disk array configuration that can survive a configurable number of drive failures, provided enough redundant data has been stored in the disk array.
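As an illustration of how a single parity value allows one lost strip to be rebuilt, the following minimal sketch uses byte-wise XOR parity, which is the usual choice for single-parity arrays. The function names and the sample data are hypothetical, and the sketch does not cover the second, independent parity value that RAID6 requires.

```python
from functools import reduce

def xor_parity(strips):
    """Return the byte-wise XOR of a list of equally sized strips."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))

def rebuild_missing(surviving_strips, parity):
    """Rebuild the one missing data strip from the survivors and the parity strip."""
    return xor_parity(list(surviving_strips) + [parity])

# Hypothetical stride with three data strips of two bytes each.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_parity(data)

# Simulate losing the middle strip and rebuilding it.
recovered = rebuild_missing([data[0], data[2]], parity)
```

Because XOR is its own inverse, combining the surviving strips with the parity strip yields exactly the missing strip.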
Losing a single parity value means that the array's protection is degraded, but losing a second at the same time means the array is critically degraded and cannot cope with any additional failure. In a standard RAID6 array this status applies to the whole array: every stride uses every drive, so every stride is impacted equally. It does not matter whether a drive contributing to a stride holds parity or data values in that stride, as both must be used in recovery.
A strip is a segment of data on a disk (see strip 11 in FIG. 1B). A strip is part of a stride; a stride is a sequence of strips across a respective sequence of disks (see stride 1 in FIG. 1B). In the example of FIG. 1B a stride comprises three data strips and two parity strips, but a stride can contain two, three, four or more data strips. RAID6 requires two parity strips.
Distributed RAID6 takes the RAID6 concept and distributes it over a larger set of drives than the number of drives used in a single parity calculation (this is sometimes termed ‘wide-striping’). This means that every stride (the term used for the set of data values related by common parity) occupies only a subset of the drives in the distributed array. As a result, on the failure of one or two drives, not all strides are impacted equally. In distributed RAID6, in the event of a double failure, some strides will be impacted by both failures, some by only one of the failures, and other strides contain neither of the failing drives and are not impacted at all.
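The uneven impact of a double failure on a distributed array can be sketched as follows. The layout used here (ten drives, strides five strips wide, each stride starting one drive later than the previous one) is a hypothetical, deliberately simple placement chosen only for illustration; it is not the layout of any particular product, and the function name is likewise an assumption.

```python
from collections import Counter

def classify_strides(strides, failed_drives):
    """Count strides by how many of their drives have failed (0, 1 or 2)."""
    failed = set(failed_drives)
    counts = Counter()
    for drives in strides:
        counts[len(failed & set(drives))] += 1
    return counts

# Hypothetical layout: 10 drives, stride width 5, start drive rotated by one.
NUM_DRIVES = 10
WIDTH = 5
strides = [
    [(start + i) % NUM_DRIVES for i in range(WIDTH)]
    for start in range(NUM_DRIVES)
]

# Simulate the simultaneous failure of drives 0 and 1.
impact = classify_strides(strides, failed_drives=[0, 1])
critical = impact[2]    # strides that have lost both redundant values
degraded = impact[1]    # strides that have lost one
unaffected = impact[0]  # strides containing neither failed drive
```

Only the strides counted in `critical` are exposed to data loss on a further failure, so those are the ones whose rebuild matters most.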
In approaches used to date, until the critical strides (impacted by both failures) can have at least one parity or data value rebuilt, another failure will result in loss of data. Anything that delays the completion of rebuild will delay the exit of this critically degraded state.
Because strides must be spread across drives, load balancing requires that the sets of drives used by the strides be distributed so that no two drives are paired in a parity set, for recovery purposes, significantly more often than any other two drives. An array is said to be in a critical state when a small percentage of strides are critical, and critically degraded when no spares are available.
If an array is critically degraded, the longer this state continues, the higher the probability of suffering a subsequent failure while critically degraded. If a drive fails, the probability of suffering two subsequent failures is the square of the probability of suffering one more. Additionally, if the system is going to suffer two more failures, it is quite possible that these will not occur at the same time, and the system may be able to react to the first subsequent failure before the second occurs.
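The two relationships in the preceding paragraph can be sketched numerically. The sketch assumes that failures are independent and arrive at a constant rate (an exponential model); both are modelling assumptions made for illustration, and the rate value and function names are hypothetical.

```python
import math

def p_failure_within(hours, rate_per_hour):
    """Probability of at least one further failure within `hours`.

    Assumes a constant failure rate (exponential model), which is an
    illustrative assumption rather than a measured property.
    """
    return 1.0 - math.exp(-rate_per_hour * hours)

def p_two_more(p_one_more):
    """Probability of two further failures, taken as the square of the
    probability of one more, which holds only for independent failures."""
    return p_one_more ** 2

RATE = 1e-4  # hypothetical failure rate per hour for the array

p_short = p_failure_within(10, RATE)    # short time spent critically degraded
p_long = p_failure_within(100, RATE)    # longer time spent critically degraded
p_double = p_two_more(p_long)
```

Under these assumptions `p_long` exceeds `p_short`, and `p_double` is much smaller than `p_long`, which matches the reasoning that shortening the critically degraded window reduces exposure.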
In a traditional RAID array, losing 4 drives (out of 11, for example) is only likely to happen when there are cross-drive issues such as storage networking, power or drive enclosure problems. However, if a distributed array has several hundred drives, the chance of this event occurring due to individual drive failures becomes much higher. This is particularly so because users of such systems are much more likely to ‘batch up’ drive replacement procedures, and perhaps not act on a single, or even a double, failure, relying instead on (distributed) spares to keep down the operational costs of human maintenance.
Distributed arrays have a greater need than traditional arrays to tolerate more drive failures before requiring user intervention, as it will be more ‘normal’ to run arrays degraded and, with large sets of drives, more normal to observe multiple simultaneous failures.