1. Field of the Invention
This invention relates to a RAID array data storage system having a storage format that includes device metadata on each storage device in the array and RAID-protected RAIDset metadata distributed across the storage devices in the RAID array. More particularly, the invention relates to promoting a device-level error, as represented by device metadata, to a RAIDset-level error, as represented by RAIDset metadata, in order to restore redundancy.
2. Description of Related Art
In data processing systems, there has been and continues to be an ongoing effort to increase the reliability of user data stored on the data storage subsystems used by the data processing system. For some time, Digital Equipment Corporation has provided on each of the SCSI disk drives in its storage subsystems a flag bit for each block of data recorded on the disk drive. This flag bit is named the forced error bit or FE bit. Each user data block on the drive has a corresponding FE bit stored on the disk drive. If the FE bit is set to one, it indicates that the user data in the block associated with the FE bit is not trustworthy. In other words, the data can be read but, for whatever reason, the data is corrupt and cannot be trusted. U.S. Pat. No. 4,434,487 illustrates techniques for generating the FE bit and using the FE bit.
Another technique for adding to the reliability of stored user data is the distribution of user data across multiple storage devices in a RAID array of storage devices. The purpose of a RAID array is to provide redundancy so that user data may be regenerated when individual blocks of data are bad or lost. For example, in a RAID array having five storage devices or members, user data is recorded in four blocks, and each of these four blocks is recorded on a separate storage device, i.e., a disk drive. In addition, a fifth drive or member is added to the RAID array in order to store a parity block for the other four blocks. The four user data blocks and their parity block are said to form a sliver in the RAID array. A complete description of the RAID disk array technology may be found in The RAID Book, a Source Book for Disk Array Technology, Fourth Edition, edited by Paul Massiglia and published by the RAID Advisory Board, St. Peter, Minn., Sep. 1, 1994, copyright 1994 RAID Advisory Board, Incorporated.
The parity block in a sliver of blocks is created by exclusive ORing the user data blocks in the sliver. The nth bit of the parity block is the exclusive OR (XOR) of the nth bit of each data block in the sliver. If any one of the user data blocks or the parity block is bad, the bad block may be reconstructed by bitwise XORing the remaining blocks in the sliver. When the parity block contains the bitwise XOR of the data blocks in the sliver, the sliver is said to be consistent. Consistency in a RAID array is typically tracked by storing in the controller an indication of which slivers in the RAID array have consistent data blocks.
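By way of illustration only, the parity relationship described above may be sketched in Python as follows. The function names (xor_blocks, is_consistent, reconstruct), the two-byte block size, and the sample byte values are assumptions chosen for this sketch and do not appear in the cited references; an actual controller would operate on full disk blocks.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the bitwise XOR of a list of equal-length byte blocks."""
    # zip(*blocks) groups the nth byte of every block together, so each
    # output byte is the XOR of the corresponding bytes of all blocks.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def is_consistent(data_blocks, parity):
    """A sliver is consistent when parity equals the XOR of its data blocks."""
    return xor_blocks(data_blocks) == parity

def reconstruct(sliver, bad_index):
    """Rebuild the single bad block by XORing the remaining blocks."""
    survivors = [blk for i, blk in enumerate(sliver) if i != bad_index]
    return xor_blocks(survivors)

# Hypothetical sliver: four user data blocks plus one parity block,
# one block per member of a five-member RAID array.
data = [
    bytes([0x0F, 0xA0]),
    bytes([0x33, 0x01]),
    bytes([0x55, 0xFF]),
    bytes([0xF0, 0x10]),
]
parity = xor_blocks(data)
sliver = data + [parity]

# Losing any one block (data or parity) is recoverable from the other four.
rebuilt = reconstruct(sliver, 2)
```

In this sketch, reconstruct recovers exactly one missing block per sliver. If two blocks of the same sliver are bad, XORing the three survivors yields only the XOR of the two missing blocks, not either block individually, which is the limitation of single-parity redundancy.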
To date, it has not been possible to restore redundancy when two or more data blocks in a sliver of blocks on the RAID array have bad or lost data.