1. Field of the Invention
The present invention generally relates to a system and method for determining reconstruction formulas for reconstruction of lost data in a storage system, and more particularly, to a system and method for determining reconstruction formulas for partial strip reconstruction including a combination of a direct reconstruction method and a sequential reconstruction method.
2. Description of the Related Art
Generally, erasure codes (e.g., RAID schemes) are fundamental tools for providing data reliability in storage systems in the presence of unreliable disks. Conventionally, RAID4 and RAID5 systems protect against the loss of one disk or against unaligned sector loss (not more than one sector per horizontal slice). Erasure codes that tolerate two disk failures have begun to be deployed. However, greater fault tolerance will be needed as more systems move to non-Small Computer System Interface (non-SCSI) drives, such as Advanced Technology Attachment (ATA) drives.
Erasure codes such as RAID4 and RAID5 rely on a single level of redundancy (e.g., see P. Massiglia, The RAID Book, St Peter, Minn.: The RAID Advisory Board, Inc., 1997, which is incorporated herein by reference in its entirety) and therefore can protect against only a single disk failure.
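The single level of redundancy described above can be illustrated with a minimal sketch. It assumes simple XOR parity across three data blocks; the helper name xor_blocks and the sample byte values are hypothetical and are not taken from the cited reference.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))

# Three data blocks and one parity block (a single level of redundancy).
data = [b"\x01\x02", b"\x10\x20", b"\xaa\x55"]
parity = xor_blocks(data)

# Suppose the disk holding data[1] is lost. XOR of the survivors and the
# parity block yields the lost block.
recovered = xor_blocks([data[0], data[2], parity])
```

Because there is only one parity block per group, losing a second block in the same group leaves one equation with two unknowns, which is why a scheme of this kind cannot recover from two failures.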
Other published algorithms employed by conventional systems and methods address only the “two disk lost” failure scenario. That is, each specific 2-fault tolerant erasure code generally is published with a specific algorithm for recovery in the “two disk lost” case. More general erasure codes that tolerate T failed disks are published with descriptions of how to recover the entire data on any T lost disks. In particular, the Reed-Solomon scheme, which uses linear algebra over finite fields to solve the “T disk lost” case, generally is employed. However, this scheme is very complicated and typically requires either additional special purpose hardware or complicated and expensive software.
Conventional systems that tolerate 2 or more failed disks present reconstruction algorithms for recovering from such failures. Typically, these reconstruction algorithms provide means for reconstructing all of the lost data on both or all of the failed disks. The published literature generally does not provide algorithms for recovering partial disk data. The full reconstruction algorithms are best suited to the rebuild scenario, in which all the lost data is recovered and placed on spare or replacement disks. However, there are scenarios in which only a portion of the lost data needs to be recovered. Such a scenario occurs, for example, if 2 or more disks are lost and, prior to rebuild completion, the host issues a read request for a portion of the lost data. The existing literature does not directly address this case, but assumes that it will be covered by the full reconstruction algorithms. Such reconstruction algorithms, particularly for the 2-fault tolerant erasure codes, but also for some erasure codes of higher fault tolerance, generally involve a sequential or recursive algorithm. That is, they perform a sequence of steps, first recovering one element (sector, block, chunk) of lost data using available data and parity elements, then using that element of recovered data (and possibly other available data and parity elements) to recover another element of lost data, and so on, until all elements of lost data are recovered. Such recursive algorithms are typically visualized by some geometric or patterned relationship between the data layout and the parity elements (e.g., parity may be computed along sloped lines through the data layout). An example is given below with reference to FIGS. 5, 6 and 7.
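The sequential pattern described above can be sketched as follows. The layout is a hypothetical toy arrangement chosen only for illustration (three data disks a, b and c with three rows each, horizontal parity p, and parity q computed along truncated sloped lines); it is not any published erasure code and is not the example of the referenced figures. All function names are assumptions.

```python
ROWS = 3
DISKS = "abc"  # three data disks; disks "a" and "b" are assumed lost below

def sloped_line(j):
    """Data elements on sloped line j: row (j - column), when that row exists."""
    return [f"{d}{j - col}" for col, d in enumerate(DISKS) if 0 <= j - col < ROWS]

def build_equations():
    """Each equation lists a parity element followed by the data it covers."""
    eqs = [[f"p{i}"] + [f"{d}{i}" for d in DISKS] for i in range(ROWS)]
    eqs += [[f"q{j}"] + sloped_line(j) for j in range(ROWS)]
    return eqs

def encode(data):
    """Return the stripe: the data elements plus the p and q parity elements."""
    stripe = dict(data)
    for parity, *members in build_equations():
        value = 0
        for name in members:
            value ^= data[name]
        stripe[parity] = value
    return stripe

def sequential_reconstruct(known, lost, eqs):
    """Repeatedly solve any equation that has exactly one lost element."""
    known, remaining, order = dict(known), set(lost), []
    progress = True
    while remaining and progress:
        progress = False
        for eq in eqs:
            unknown = [e for e in eq if e in remaining]
            if len(unknown) == 1:
                target = unknown[0]
                value = 0
                for e in eq:
                    if e != target:
                        value ^= known[e]  # may use earlier recovered elements
                known[target] = value
                remaining.discard(target)
                order.append(target)
                progress = True
    return known, order

data = {f"{d}{i}": 7 * i + 3 * col + 1 for col, d in enumerate(DISKS) for i in range(ROWS)}
stripe = encode(data)
lost = [name for name in stripe if name[0] in "ab"]
surviving = {k: v for k, v in stripe.items() if k not in lost}
recovered, order = sequential_reconstruct(surviving, lost, build_equations())
```

In this sketch the elements are recovered in the order a0, b0, a1, b1, a2, b2, alternating between a sloped line and a horizontal line, and each step after the first depends on the element recovered immediately before it.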
Because the typical reconstruction algorithms are sequential in nature and are designed to recover all the lost data, they may not be the most cost-effective approaches to reconstruction of partial strips. For example, when the required partial disk data elements appear in the middle or at the end of the recurrence, these methods require reconstructing all the elements that precede the desired lost elements in the recurrence, and therefore consume extra resources reconstructing unnecessary data elements. Such resources include, but are not limited to, CPU usage, disk IO costs, memory bandwidth, and XOR computations for those erasure codes based on XOR (which may be handled not by a general purpose CPU but by a special purpose XOR hardware engine).
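The extra work can be made concrete with a small sketch. It assumes a hypothetical recurrence of six lost elements in which each element is recovered from the one before it; the element names and both helper functions are illustrative only.

```python
# Hypothetical recurrence: each lost element is recovered using the previous one.
chain = ["a0", "b0", "a1", "b1", "a2", "b2"]

def elements_rebuilt_for(wanted, chain):
    """Elements a purely sequential method rebuilds to reach every wanted one."""
    last = max(chain.index(w) for w in wanted)
    return chain[: last + 1]

def unnecessary_elements(wanted, chain):
    """Elements that are rebuilt although the host did not request them."""
    return [e for e in elements_rebuilt_for(wanted, chain) if e not in wanted]

only_last = unnecessary_elements({"b2"}, chain)   # request at the end of the chain
only_first = unnecessary_elements({"a0"}, chain)  # request at the start of the chain
```

Under these assumptions, a request for the first element costs nothing extra, while a request for the last element forces reconstruction of the five elements ahead of it.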
On the other hand, the method provided in U.S. patent application Ser. No. 10/978,389, filed on Nov. 2, 2004, to Hafner et al., entitled “SYSTEM AND METHOD FOR RECOVERY OF DATA FOR A LOST SECTOR IN A STORAGE SYSTEM” provides an efficient means for reconstructing individual lost elements. The method of this patent application is referred to as a direct method in that it does not rely on any sequential data reconstruction, but provides an efficient and cost-effective algorithm for reconstructing an element directly from a minimal number of data and parity elements. Such a method can also be utilized for the partial reconstruction problem by applying it to each of the lost data elements of the partial disk. However, such an application does not necessarily take into account that, after some data elements are reconstructed, these newly reconstructed data elements may offset the direct cost of reconstructing other data elements, nor does it take into account any “geometry” or patterns in the data/parity relations that are found in the design of particular erasure codes. Such patterns, which form the basis for the sequential reconstruction methods, may provide efficient means for reconstruction that are not available to the more generic method of the referenced patent application.
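The cost offset mentioned above can be illustrated with a minimal sketch. The parity relations are a hypothetical toy layout (disks a and b lost, disk c and parities p and q surviving) and are not taken from the referenced patent application; the formulas were derived by hand for this toy layout and are not claimed to be produced by, or minimal in the sense of, the direct method. Cost is counted simply as the number of XOR operations.

```python
import random
from functools import reduce
from operator import xor

rng = random.Random(0)
v = {f"{d}{i}": rng.randrange(256) for d in "abc" for i in range(2)}

# Hypothetical toy parity relations (assumptions for illustration only):
for i in range(2):
    v[f"p{i}"] = v[f"a{i}"] ^ v[f"b{i}"] ^ v[f"c{i}"]  # horizontal parity
v["q0"] = v["a0"]            # sloped parity line through a0 only
v["q1"] = v["a1"] ^ v["b0"]  # sloped parity line through a1 and b0

def evaluate(formula, values):
    """XOR together the named elements."""
    return reduce(xor, (values[name] for name in formula), 0)

def xor_cost(formula):
    """Number of XOR operations needed to combine the named elements."""
    return len(formula) - 1

# Direct formulas: each lost element expressed only in surviving elements.
direct = {
    "a1": {"q0", "p0", "c0", "q1"},
    "b1": {"q0", "p0", "c0", "q1", "p1", "c1"},
}
# Alternative formula for b1 that reuses a1 once a1 has been reconstructed.
reuse_b1 = {"a1", "p1", "c1"}

direct_only_cost = sum(xor_cost(f) for f in direct.values())
hybrid_cost = xor_cost(direct["a1"]) + xor_cost(reuse_b1)
```

In this toy case, reconstructing a1 and b1 by two independent direct formulas costs 8 XOR operations, whereas reconstructing a1 directly and then obtaining b1 from a1 along the horizontal parity line costs 5.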