In typical RAID computer storage systems, the storage controllers of the storage system present to the host system a set of logical volumes comprised of one or more of the physical disk devices. The storage system provides full data path access to the storage by employing redundant storage controllers. In the event of a single storage controller failure, the redundant controller will take over access control to the volumes that had been under the control of the now failed controller.
Presently, two methods are used to update parity during a write operation. The first method creates new parity from the old parity, old data, and new data. The second method creates new parity from new data and other data, that is, the data in the stripe that is not being changed by the write. For performance reasons, the method employed for any single write operation depends upon the number of data drives that must be updated with new data. The first method is faster than the second if there are relatively few data drives that must be updated.
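The two methods can be illustrated with a minimal sketch, assuming the byte-wise XOR parity used by RAID 3 and RAID 5. The function names and the read-counting cost model are hypothetical and serve only to show why the first method is favored when few drives are updated.

```python
from functools import reduce
from operator import xor


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of one or more equal-length blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def parity_from_old_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """First method: new parity = old parity XOR old data XOR new data."""
    return xor_blocks(old_parity, old_data, new_data)


def parity_from_other_data(new_data: list, other_data: list) -> bytes:
    """Second method: new parity = XOR of the new data and the unchanged data."""
    return xor_blocks(*new_data, *other_data)


def reads_needed(total_data_drives: int, drives_updated: int) -> tuple:
    """Hypothetical cost model that counts only the reads each method issues."""
    first = drives_updated + 1                    # old data per updated drive, plus old parity
    second = total_data_drives - drives_updated   # every unchanged data block
    return first, second


# A stripe across four data drives, with a parity block consistent with the data.
stripe = [bytes([0x11] * 4), bytes([0x22] * 4), bytes([0x44] * 4), bytes([0x88] * 4)]
parity = xor_blocks(*stripe)

# Update drive 1 only and compute the new parity both ways.
new_block = bytes([0xA5] * 4)
p_first = parity_from_old_parity(parity, stripe[1], new_block)
p_second = parity_from_other_data([new_block], [stripe[0], stripe[2], stripe[3]])
```

When the old parity is valid, both methods yield the same new parity. Under the assumed cost model, updating one drive of eight requires two reads with the first method and seven with the second.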
Because the first method creates new parity using old parity, parity will be valid after the take-over operation only if it was valid prior to the operation. Conversely, if old parity was invalid prior to the take-over operation, the new parity derived from it will also be invalid. The second method does not share this problem since it does not use old parity when calculating new parity.
It is desirable to reduce the opportunity for data/parity mismatches on parity protected RAID devices following a storage controller failure. Under certain storage volume configurations, there is no redundant information stored between the storage controllers that can be used to identify disk writes that may have been interrupted due to a controller failure. These interrupted disk writes, if not properly handled, lead to data/parity inconsistencies within the parity stripe to which the writes were directed. Because the interrupted writes are re-tried by the host, the volume data will still be accurate. However, if at some point data in the affected parity stripe is required to be reconstructed from the inaccurate parity, the reconstructed data will be incorrect.
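The failure sequence described above can be traced in a minimal sketch, assuming XOR parity and a three-drive stripe of one-byte blocks. The values and variable names are illustrative only. The data block reaches the disk, the controller fails before the parity block is written, and the host retries the write.

```python
from functools import reduce
from operator import xor


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of one or more equal-length blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# Three data drives and a parity block that is consistent with them.
data = [b"\x01", b"\x02", b"\x04"]
parity = xor_blocks(*data)
original_drive1 = data[1]

# Interrupted write: new data lands on drive 0, but the controller fails
# before the matching parity update is written. Parity is now stale.
new = b"\x0f"
data[0] = new

# The host retries the write. With the first method the "old data" read back
# from drive 0 is already the new data, so the stale parity is carried forward.
old = data[0]
retry_parity_first = xor_blocks(parity, old, new)

# With the second method the parity is rebuilt from data alone.
retry_parity_second = xor_blocks(new, data[1], data[2])

# Later, drive 1 fails and its contents are reconstructed from the survivors.
rebuilt_first = xor_blocks(data[0], data[2], retry_parity_first)
rebuilt_second = xor_blocks(data[0], data[2], retry_parity_second)
```

In this trace the volume data remains accurate after the retry, yet `rebuilt_first` differs from the original contents of drive 1, while `rebuilt_second` matches them.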
This data/parity inconsistency due to interrupted writes is a well-documented, inherent attribute of RAID 3 and RAID 5 devices. In short, new parity that has been generated based on invalid parity will still be invalid. In the past this problem has been solved using two common approaches.
In the first approach, following a controller failure, the surviving controller is used to scan affected volumes to determine if there are any data/parity inconsistencies. If any inconsistencies are detected, they may then be corrected. In the second approach, redundant information is shared between the storage controllers such that, after failure of a controller, the surviving controller can immediately and accurately recover the interrupted writes. Both approaches have been found to be lacking.
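The first approach can be sketched as follows, assuming XOR parity and a hypothetical in-memory model in which each stripe is a dictionary holding its data blocks and parity block. Repair recomputes parity from the data, which relies on the volume data being accurate because the host retries interrupted writes.

```python
from functools import reduce
from operator import xor


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of one or more equal-length blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def scan_and_repair(stripes: list) -> list:
    """Check every stripe of a volume and rewrite any parity that does not match.

    Returns the indices of the stripes that were repaired.
    """
    repaired = []
    for index, stripe in enumerate(stripes):
        expected = xor_blocks(*stripe["data"])
        if stripe["parity"] != expected:
            stripe["parity"] = expected   # data is trusted, parity is regenerated
            repaired.append(index)
    return repaired


# A small volume of three stripes; stripe 1 carries stale parity.
volume = [
    {"data": [b"\x01", b"\x02", b"\x04"], "parity": b"\x07"},
    {"data": [b"\x0f", b"\x02", b"\x04"], "parity": b"\x07"},
    {"data": [b"\x10", b"\x20", b"\x40"], "parity": b"\x70"},
]
repaired = scan_and_repair(volume)
```

The loop visits every stripe regardless of how few were actually affected, so its running time grows with the size of the volume rather than with the number of interrupted writes.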
Using the first approach, the volume scan should ideally be completed before other I/O's are allowed to the affected volumes. However, scanning the entire volume for data/parity inconsistencies may be extremely time consuming, so preventing new I/O's until after the scan would be undesirable. On the other hand, allowing I/O's prior to completion of the scan creates an opportunity for drive errors to occur that would require data to be reconstructed from inaccurate parity for parity stripes that have not yet been scanned and repaired.
The second approach requires the use of either a shared inter-controller repository or direct inter-controller communication to allow both controllers access to the necessary data to recover from interrupted writes. Either facility introduces latency associated with every write into the main I/O path, resulting in undesirable I/O performance.
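One possible form of the second approach is sketched below, assuming XOR parity and a hypothetical shared log of in-flight stripe numbers that both controllers can read. The class and function names are illustrative and do not describe any particular controller. The log update that precedes every disk write corresponds to the added latency noted above.

```python
from functools import reduce
from operator import xor


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of one or more equal-length blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


class SharedIntentLog:
    """Hypothetical repository visible to both controllers."""

    def __init__(self):
        self.in_flight = set()

    def begin(self, stripe_no: int) -> None:
        self.in_flight.add(stripe_no)

    def end(self, stripe_no: int) -> None:
        self.in_flight.discard(stripe_no)


def write_block(log, stripes, stripe_no, drive, new_data, fail_before_parity=False):
    """Write one data block and its parity, recording the intent first."""
    log.begin(stripe_no)              # extra step in the main I/O path
    stripe = stripes[stripe_no]
    stripe["data"][drive] = new_data
    if fail_before_parity:
        return                        # simulated controller failure; log entry remains
    stripe["parity"] = xor_blocks(*stripe["data"])
    log.end(stripe_no)


def take_over(log, stripes) -> list:
    """Surviving controller repairs only the stripes still marked in flight."""
    recovered = sorted(log.in_flight)
    for stripe_no in recovered:
        stripes[stripe_no]["parity"] = xor_blocks(*stripes[stripe_no]["data"])
    log.in_flight.clear()
    return recovered


log = SharedIntentLog()
volume = [{"data": [b"\x01", b"\x02"], "parity": b"\x03"} for _ in range(3)]
write_block(log, volume, 0, 0, b"\x05")                           # completes normally
write_block(log, volume, 2, 1, b"\x09", fail_before_parity=True)  # interrupted
recovered = take_over(log, volume)
```

In this sketch recovery touches only the interrupted stripe, in contrast to a full volume scan, but every write pays for two log updates whether or not a failure ever occurs.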