RAID (Redundant Array of Independent/Inexpensive Disks) is an organization of data on a plurality of disks and, as is well known, has several levels, each of which has different characteristics that affect performance and availability. RAID level 4 (RAID-4) and RAID level 5 (RAID-5) are organizations of an array of n+1 disks that provide enhanced performance through the use of striping and enhanced data availability through the association of a parity block with every n data blocks. The data and parity information is distributed over the n+1 disks. In the RAID-4 organization, all parity data is on a single disk and in the RAID-5 organization, parity data is distributed over all of the disks in the array. The ensemble of n+1 disks appears to the user as a single, more highly available virtual disk.
RAID storage systems can be implemented in hardware or software. In the hardware implementation the RAID algorithms are built into a controller that connects to the computer I/O bus. In the software implementation the RAID algorithms are incorporated into software that runs on the main processor in conjunction with the operating system. In addition, the software implementation can be effected through software running on a well known RAID controller. Both the hardware and software implementations of RAID are well known to those of ordinary skill in the field.
Since RAID-4 and RAID-5 are organizations of data in which the data and parity information is distributed over the n+1 disks in the RAID array, if a single disk fails or if a data block is unreadable, all of the unavailable data can be recovered. A block is the smallest unit of data that can be read from or written to a disk. Each disk in the RAID array is referred to as a member of the array. Furthermore, while disks are referred to throughout, any equivalent storage media could be used, as would be apparent to one of ordinary skill in the field. RAID-4 is a level of organization of data for a RAID array in which data blocks are organized into chunks that are interleaved among the disks and protected by parity, and all of the parity is written on a single disk. RAID-5 is a level of organization of data for a RAID array in which data blocks are organized into chunks that are interleaved among the disks and protected by parity, and the parity information is distributed over all of the disks in the array. A chunk is a group of consecutively numbered blocks that are placed consecutively on a single disk before placing blocks on a different disk. Thus, a chunk is the unit of data interleaving for a RAID array.
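The chunk interleaving described above can be illustrated with a minimal sketch. The function name `locate_block`, the choice of the last disk as the RAID-4 parity disk, and the particular RAID-5 parity rotation are assumptions made for illustration only; actual arrays may place data and parity differently.

```python
def locate_block(lbn, n, chunk_size, level=4):
    """Map a logical block number to (data_disk, parity_disk, disk_block).

    Hypothetical layout for an array of n+1 disks numbered 0..n.
    """
    chunk = lbn // chunk_size    # chunk that holds this block
    offset = lbn % chunk_size    # position of the block inside its chunk
    stripe = chunk // n          # row of n data chunks plus one parity chunk
    index = chunk % n            # position among the data chunks of the row

    if level == 4:
        # Assumption: all parity lives on the last disk.
        parity_disk = n
    elif level == 5:
        # Assumption: parity moves one disk to the left on each stripe.
        parity_disk = n - (stripe % (n + 1))
    else:
        raise ValueError("only RAID-4 and RAID-5 are sketched here")

    # Data chunks fill the disks in order, skipping the parity disk.
    data_disk = index if index < parity_disk else index + 1
    disk_block = stripe * chunk_size + offset
    return data_disk, parity_disk, disk_block
```

With n = 3 data disks and chunks of 2 blocks, `locate_block(6, 3, 2, level=4)` gives `(0, 3, 2)`, while `locate_block(6, 3, 2, level=5)` gives `(0, 2, 2)` under the assumed rotation: the data lands in the same place, but the parity for that stripe is on a different member.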
Each bit of the parity block is the Exclusive-OR of the corresponding bits in each of the n corresponding data blocks. In the event of the failure of a single disk in the array, the data from a given data block on the failed disk is recovered by computing the Exclusive-OR of the contents of the corresponding parity block and the n-1 data blocks on the surviving disks that contributed to that parity block. The same procedure is followed if a single block or group of blocks is unavailable or unreadable. A block or set of blocks is repaired by writing the regenerated data. The regeneration and repair of data for a data block or set of data blocks on a disk in a RAID array is referred to as reconstruction.
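The parity and regeneration computations described above can be sketched as follows. The function names `xor_blocks`, `compute_parity` and `regenerate` are hypothetical, and blocks are modeled as equal-length `bytes` objects purely for illustration.

```python
def xor_blocks(blocks):
    """Return the bytewise Exclusive-OR of equal-length blocks."""
    blocks = list(blocks)
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    result = bytearray(size)
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def compute_parity(data_blocks):
    """Parity block for the n data blocks of one stripe."""
    return xor_blocks(data_blocks)


def regenerate(parity_block, surviving_data_blocks):
    """Recover one lost data block from the parity and the n-1 survivors."""
    return xor_blocks([parity_block, *surviving_data_blocks])


# Example with n = 3 data blocks of two bytes each.
data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = compute_parity(data)                        # b"\x69\x69"
recovered = regenerate(parity, [data[0], data[2]])   # equals data[1]
```

Because Exclusive-OR is its own inverse, the same routine serves both to compute parity and to regenerate a missing block.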
There are several circumstances where the data on one of the disks in the array must be regenerated from the remaining disks. The first circumstance is when a disk has failed and it is replaced with a substitute disk which initially contains no data. The second circumstance is when a data block or group of blocks on one of the disks in the array is unreadable and the data and parity information on the remaining disks in the array is used to regenerate the unavailable data and repair the data block or group of data blocks by writing the regenerated data. The third circumstance occurs when computing the contents of parity blocks during certain write operations that operate by reading some of the disks in a RAID-5 array.
In the above situations, if one or more of the data or parity blocks needed in the regeneration are themselves not readable because of an electrical, magnetic or mechanical anomaly affecting that portion of the data, then there is no correctly regenerated data available to write to the unavailable block. Some RAID-4 or RAID-5 organizations may ignore this problem and simply write a meaningless pattern of bits to the unavailable data block as the regenerated data. If appropriate, the meaningless data just regenerated may be sent to the user with an error signal, but a subsequent read of the data block will not detect that the data is corrupt. The same situation results if a write operation is unsuccessful and the data written to the particular block is inaccurate or meaningless. The written data is accordingly corrupt and the corruption is undetectable. Undetected data corruption occurs if the contents of a block containing meaningless data, regardless of how the data has been corrupted, are ever returned to the user application as the result of a subsequent read request, or are used as an input to any computation internal to the functioning of the array whose results are subsequently returned to the user as the result of a read request. Accordingly, it is desirable to identify whether a data block has been repaired by writing meaningless regenerated data, or otherwise contains meaningless data, and to prevent the meaningless data from being read by the user or subsequently being used in computations internal to the functioning of the array.
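The failure case described above can be made concrete with a small sketch. It only illustrates the problem: it is not the mechanism of any particular RAID implementation. The names `RegenerationError` and `regenerate_or_fail`, and the use of `None` to stand for an unreadable block, are assumptions for illustration.

```python
class RegenerationError(Exception):
    """Raised when a block cannot be correctly regenerated."""


def regenerate_or_fail(stripe, missing):
    """Regenerate stripe[missing] from the other blocks of the stripe.

    stripe  : list of the n data blocks and the parity block, as bytes;
              None marks a block that cannot be read (assumed convention).
    missing : index of the block to regenerate.
    """
    needed = [block for i, block in enumerate(stripe) if i != missing]
    if any(block is None for block in needed):
        # A second unreadable block means the Exclusive-OR would yield a
        # meaningless pattern; refuse rather than return it as valid data.
        raise RegenerationError("a block needed for regeneration is unreadable")
    result = bytearray(len(needed[0]))
    for block in needed:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# Two data blocks followed by their parity block.
stripe = [b"\x01\x02", b"\x04\x08", b"\x05\x0a"]
```

With one block unavailable, `regenerate_or_fail([stripe[0], None, stripe[2]], 1)` returns the original contents of the missing block. With two blocks unavailable, as in `regenerate_or_fail([None, None, stripe[2]], 1)`, the sketch raises `RegenerationError` instead of producing data that a later read could not distinguish from valid contents.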