In recent high-performance computer systems, there is a strong demand for considerably increased storage device performance. One possible solution is a disk array built from a large number of drives, each having a relatively small storage capacity.
In the report "A Case for Redundant Arrays of Inexpensive Disks (RAID)" by D. Patterson, G. Gibson and R. H. Katz, the performance and reliability of disk arrays (levels 3 and 5) are described. In the level 3 disk array, data is subdivided and the subdivided data are processed in parallel. In the level 5 disk array, data is distributed and the distributed data are handled independently.
First, a description will be made of the level 3 disk array, in which data is subdivided and the subdivided data are processed in parallel. The disk array is built from a large number of drives, each having a relatively small capacity. One piece of write data transferred from the CPU is subdivided into a plurality of subdivided data, from which parity data is then formed. The subdivided data are stored into a plurality of drives in parallel. Conversely, when the data is read out, the subdivided data are read from the respective drives in parallel, combined, and then transferred to the CPU. It should also be noted that a group of plural data and its error correction data will be called a "parity group". In this specification, this term is also used where the error correction data is not parity data. The parity data is used to recover the data stored in a drive where a fault has occurred, based upon the data and parity data stored in the remaining drives of the parity group. In a disk array built from a large number of drives, the probability of a fault increases with the number of components, so such parity data is provided to improve the reliability of the disk array.
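By way of illustration only, the subdivision, parity formation and recovery described above may be sketched as follows. The sketch assumes byte-wise exclusive-OR (XOR) parity and zero padding of the last piece; the function names `xor_blocks`, `split_with_parity` and `recover` are hypothetical and are not taken from the cited report.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def split_with_parity(data, n_data_drives):
    """Subdivide one piece of write data and append one parity piece.

    Assumption: the data is zero-padded so that every piece has equal length.
    The returned list is one parity group: n_data_drives pieces plus parity.
    """
    piece_len = -(-len(data) // n_data_drives)  # ceiling division
    padded = data.ljust(piece_len * n_data_drives, b"\x00")
    pieces = [padded[i * piece_len:(i + 1) * piece_len]
              for i in range(n_data_drives)]
    return pieces + [xor_blocks(pieces)]


def recover(parity_group, failed_index):
    """Rebuild the piece of the failed drive from all remaining pieces."""
    survivors = [p for i, p in enumerate(parity_group) if i != failed_index]
    return xor_blocks(survivors)


# One piece of write data spread over four data drives and one parity drive.
group = split_with_parity(b"write data from the CPU", 4)
rebuilt = recover(group, 2)  # pretend the third drive has failed
```

In this sketch, `rebuilt` equals the piece originally stored on the failed drive, because XOR-ing all members of a parity group except one yields the missing member.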
Next, the level 5 disk array, in which data is distributed and the distributed data are handled independently, will be explained. In this disk array, individual data are not subdivided but are handled separately; parity data is produced from a plurality of data, and these data are then stored in a distributed manner across drives each having a relatively small capacity. As previously explained, this parity data is used to recover the data stored in a drive when a fault occurs in that drive.
Recently, in the data storage devices of large-scale general-purpose computer systems, a drive that is busy serving other read/write commands cannot be used for a new request, and therefore requests frequently have to wait. In this disk array, since the data are distributed across plural drives, an increased number of read/write requests is processed in a distributed manner by the plural drives, so that such waiting is suppressed.
In the data storage devices of these disk arrays, the storage position (address) of each piece of data is fixed at a predetermined address, and when the CPU performs either a data read operation or a data write operation, it accesses that fixed address.
An important requirement in a RAID 5 system is to ensure that the data is readable at all times. Readability is particularly important when one drive is down and data has to be read from all of the other drives to perform RAID data reconstruction. Any problem reading data from the other drives can cause the reconstruction to fail. In addition, the greater the number of hard disk drives (HDDs), the greater the probability of hitting an unreadable portion.
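The dependence on the number of HDDs may be illustrated, purely as a simplified model, by assuming that each surviving drive independently has the same probability of containing an unreadable portion in the region needed for reconstruction. Both the independence assumption and the function name `rebuild_read_failure_probability` are introduced here for illustration only.

```python
def rebuild_read_failure_probability(n_drives, p_unreadable):
    """Probability that at least one surviving drive cannot be read.

    Assumptions (illustrative only): one drive of n_drives has failed, and
    each of the remaining n_drives - 1 drives independently hits an
    unreadable portion with probability p_unreadable.
    """
    surviving = n_drives - 1
    return 1.0 - (1.0 - p_unreadable) ** surviving


# The probability grows as drives are added to the array.
probabilities = [rebuild_read_failure_probability(n, 0.01) for n in (4, 8, 16)]
```

Under these assumptions the values in `probabilities` increase strictly with the number of drives, which is the effect noted above.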
Another problem is that hardware or microcode problems periodically occur in which data on one of the drives is corrupted but the corruption is not detected, even though the data and parity are inconsistent. Accordingly, it is important in this instance to determine which HDD contains the corrupted data.
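The difficulty may be illustrated by the following sketch, which assumes byte-wise XOR parity and uses the hypothetical names `xor_blocks` and `is_consistent`. A parity check reveals that a parity group is inconsistent, but the check result alone is the same whichever member of the group was corrupted.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def is_consistent(parity_group):
    """A parity group is consistent when the XOR of all members is zero."""
    return not any(xor_blocks(parity_group))


# Three data blocks plus their parity block form a consistent parity group.
data = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
group = data + [xor_blocks(data)]

# Case 1: one bit is silently flipped in the second data block.
bad_data = [group[0], b"\x10\x21", group[2], group[3]]

# Case 2: the same bit is silently flipped in the parity block instead.
bad_parity = group[:3] + [bytes([group[3][0], group[3][1] ^ 0x01])]

# Both cases fail the check and leave an identical XOR result, so the
# parity check by itself does not identify the corrupted drive.
same_result = xor_blocks(bad_data) == xor_blocks(bad_parity)
```

Here `is_consistent(group)` is true, both corrupted groups fail the check, and `same_result` is true, so additional information is required to determine which HDD holds the corrupted data.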
Accordingly, what is needed is a system and method for addressing the above-mentioned problems in reconstructing corrupted data in a RAID system. The present invention addresses this need.