A storage system is a processing system adapted to store and retrieve data on behalf of one or more client processing systems (“clients”) in response to external input/output (I/O) requests received from clients. A storage system can provide clients with file-level access to data stored in a set of mass storage devices, such as magnetic or optical storage disks, flash devices, or tapes. Alternatively, a storage system can provide clients with block-level access to stored data, either instead of file-level access or in addition to it.
Data can be stored on “volumes” comprising physical storage devices defining an overall logical arrangement of storage space. The devices within a volume are typically organized as one or more groups of Redundant Array of Independent (or Inexpensive) Disks (RAID). RAID implementations enhance the reliability and integrity of data storage through the redundant writing of data stripes across a number of storage devices in the RAID group.
In a storage system, data can be lost or corrupted, for example, due to media errors, data corruption, or shelf failures. A media error on a storage device occurs when data cannot be read from a particular block or a number of blocks. Typically, storage systems rely on various redundancy schemes to protect against failures. One such known technique provides for mirroring of data at a destination storage system by preferably transferring changes to the data along with metadata. For example, SnapMirror®, a product provided by NetApp, Inc., Sunnyvale, Calif., can be used to establish and maintain a mirror relationship between a source storage system and a destination storage system and to provide data updates to the destination storage system.
Another known mechanism that is employed in a storage system to protect data against failures is RAID technology, which includes various data protection techniques, such as RAID-1, RAID-4, RAID-5, or NetApp's RAID-DP™. The fault tolerance limit of each technique defines the maximum number of errors that can be successfully recovered. As a result, the availability and resiliency of the storage system are very closely related to the RAID protection level utilized. In RAID-1, the contents of a storage device are mirrored at another storage device. Since only half of the available space can be used for data, the RAID-1 protection scheme is typically very expensive to employ.
In RAID-4, RAID-5, and RAID-DP™, a data protection value (e.g., redundant parity) is calculated and is stored at various locations on storage devices. Parity may be computed as an exclusive-OR (XOR) operation of data blocks in a stripe spread across multiple storage devices in an array. In a single parity scheme, e.g., RAID-4 or RAID-5, an error in any one block in the stripe can be corrected using a single parity block (also called “row parity”). In RAID-DP™, errors resulting from a two-storage device failure can be corrected using two parity blocks, a row parity and a diagonal parity.
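The row parity computation described above can be illustrated with a minimal sketch. It assumes equally sized blocks held in memory as byte strings; the function names (xor_blocks, row_parity, reconstruct) and the sample stripe are hypothetical and are not taken from any particular RAID implementation. Diagonal parity, as used in RAID-DP™, is not shown.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    # zip(*blocks) walks the blocks column by column (one byte from each block).
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def row_parity(data_blocks):
    """Compute the row parity block for the data blocks of one stripe."""
    return xor_blocks(data_blocks)


def reconstruct(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors and the parity."""
    # XOR-ing the parity with every surviving block cancels them out,
    # leaving only the contents of the missing block.
    return xor_blocks(list(surviving_blocks) + [parity])


# Hypothetical stripe of three data blocks, each on a different storage device.
stripe = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = row_parity(stripe)

# Simulate losing the second block and recovering it from the rest.
recovered = reconstruct([stripe[0], stripe[2]], parity)
```

Because XOR is its own inverse, the same routine serves both to compute the parity and to recover one lost block. If two blocks of the stripe are lost, a single parity block no longer carries enough information to recover either of them.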
Occasionally, a RAID array may experience a situation in which the number of errors exceeds the ability of the RAID protection level to correct them, thereby causing an unrecoverable error condition. The following combinations of errors for RAID-4 may cause an unrecoverable error condition: one failed storage device and one media error on another storage device; one failed storage device and one checksum error on another storage device; two media errors in the same stripe; one media error and one checksum error in the same stripe; one media error and one missing block error. For a dual parity array having RAID-DP™, at least three errors are required to cause an unrecoverable error condition. As used herein, a “media error” occurs when a read operation is not successful due to problems with the media on which the data reside. A “checksum error” occurs when verification of the data integrity signature of a data block fails. A “missing block error” takes place when the block range of the storage device that RAID attempts to read does not exist.
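The relationship between the error combinations listed above and the fault tolerance limit can be sketched as follows. This is a simplified illustration only: it assumes that every failed device and every block-level error (media, checksum, or missing block) costs exactly one block of the stripe, and that the stripe is unrecoverable once the number of affected devices exceeds the number of parity blocks. The table FAULT_TOLERANCE and the function is_unrecoverable are hypothetical names introduced for this example.

```python
# Number of lost blocks per stripe that each scheme can correct
# (single parity for RAID-4 and RAID-5, row plus diagonal parity for RAID-DP).
FAULT_TOLERANCE = {"RAID-4": 1, "RAID-5": 1, "RAID-DP": 2}


def is_unrecoverable(raid_level, failed_devices, block_errors):
    """Decide whether one stripe has more errors than the RAID level can correct.

    failed_devices: iterable of identifiers of devices that have failed entirely.
    block_errors: iterable of (device, kind) pairs for errors within this stripe,
                  where kind is "media", "checksum", or "missing".
    """
    # Each device contributes at most one block to a stripe, so count
    # distinct devices rather than individual error reports.
    affected = set(failed_devices) | {device for device, _kind in block_errors}
    return len(affected) > FAULT_TOLERANCE[raid_level]


# One failed device plus one media error on another device, under RAID-4.
raid4_case = is_unrecoverable("RAID-4", ["disk0"], [("disk2", "media")])

# The same two errors under RAID-DP remain correctable.
raid_dp_case = is_unrecoverable("RAID-DP", ["disk0"], [("disk2", "media")])
```

Under these assumptions, each RAID-4 combination listed above involves two affected devices in a stripe that tolerates only one, while a RAID-DP™ stripe becomes unrecoverable only when a third device is affected.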
When an unrecoverable error condition occurs, it may lead to data corruption in the storage system. Currently, when an unrecoverable error is encountered by the RAID system, a data block is flagged to indicate that it has an unrecoverable error. If a storage device fails and the data are reconstructed to a replacement storage device, the reconstructed data will be bad if the data block had encountered an unrecoverable error. As a result, a data block with an error is provided to a client device. Other existing techniques create a log of locations of the data blocks with unrecoverable errors. Such a log is created after the unrecoverable errors are detected. On every client request, the log is checked to determine if a data block indicated in the client request has an unrecoverable error. If the log is not cached in a memory device, checking it entails reading the log from the storage device. Such a process consumes bandwidth of the storage device and delays processing of the client request. Moreover, the log is not reliable and can be lost if, for example, the storage device where the log resides fails. Such a mechanism therefore does not provide a sufficient guarantee that the storage system will be able to identify data blocks that sustained an unrecoverable error. As a result, a data block with an error will be provided to a client.
Regardless of which conventional technique is used to keep track of unrecoverable errors, when an unrecoverable error is encountered at a storage system, RAID panics the storage system and marks the corresponding aggregate inconsistent, thereby triggering a file system consistency check operation prior to serving data access requests. The term “aggregate” is used to refer to a pool of physical storage, which combines one or more physical mass storage devices, or parts thereof, into a single storage object. A file system consistency check involves scanning the entire file system to determine if all metadata, e.g., file sizes, blocks allocated per file, etc., are consistent. During this process, if more unrecoverable errors are detected, they are added to the log. The file system consistency check may create a “lost and found” data structure indicating missing data. Running a file system consistency check has a number of shortcomings. For example, the file system consistency check does not recover the original client data block that sustained an error. Furthermore, running the file system consistency check disrupts client access to the storage system.
Accordingly, what is needed is a mechanism for improving resilience and availability of a RAID array in a storage system when RAID encounters unrecoverable errors.