1. Field of the Invention
The present invention relates to an external data storage subsystem, and, more particularly, to a method and apparatus for storing parity and rebuilding the data contents of failed disks in a disk array.
2. Description of the Related Art
A conventional computer hard disk cannot keep pace with the operating speeds of the microprocessor and memory because of the hard disk's inherent mechanical characteristics. Microprocessors and memories have improved significantly with the development of semiconductor technology. The data access time of the hard disk, however, is reduced by less than about 10% per year, whereas the performance of the RISC (Reduced Instruction Set Computer) microprocessor is enhanced by more than 50% per year. Hence, the performance of a computer system is degraded by the low performance of the I/O subsystem based on the hard disk.
Patterson et al., "A Case for Redundant Arrays of Inexpensive Disks (RAID)", Chicago ACM SIGMOD Conf. Report, pp. 109-116, published in 1988, disclosed six levels of disk arrays, from level 0 to level 5, classified according to the structure and characteristics of the disk array. A RAID is composed of a plurality of disks that provide large capacity, allow parallel processing to secure high performance, and employ redundancy to rebuild the data contents of failed disks.
The number of disks constituting a disk array may be increased in order to store multimedia data together with conventional text data. However, this increases the rate of disk failure. For example, if the mean time until a single disk fails is defined as MTTF (Mean Time To Failure), a disk array composed of n disks has a mean time to failure of MTTF/n. In order to secure high reliability as the number of disks constituting a disk array increases, RAID level 6 is required, for example, to rebuild the data contents of two failed disks. RAID level 6 is disclosed in "RAID: High-Performance, Reliable Secondary Storage", reported by Patterson et al., in ACM Computing Surveys, vol. 26, pp. 109-116, published in June, 1994. In addition to disk failure, data errors may be caused by system failure, uncorrectable bit errors, environmental factors, etc. The level 6 defined by Patterson et al. employs a Reed-Solomon code, the complexity of which requires high maintenance cost as well as additional hardware.
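The MTTF/n relation stated above can be illustrated with a minimal sketch. The function name and the numeric values are hypothetical and chosen only for illustration. The sketch assumes that the disks fail independently and at the same constant rate, which is the usual assumption behind this relation.

```python
def array_mttf(disk_mttf_hours: float, n_disks: int) -> float:
    """Mean time to the first disk failure in an array of n_disks disks.

    Assumes independent disks with the same constant failure rate,
    so the array MTTF is the single-disk MTTF divided by n.
    """
    if n_disks < 1:
        raise ValueError("an array needs at least one disk")
    return disk_mttf_hours / n_disks


# Hypothetical example: 100 disks, each with an MTTF of 500,000 hours.
print(array_mttf(500_000.0, 100))
```

Under these assumptions, enlarging the array from 10 to 100 disks shortens the expected time to the first failure tenfold, which is why redundancy able to tolerate more than one failure becomes attractive for large arrays.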
Methods for rebuilding the data contents of two failed disks include a two-dimensional parity technique, the EVENODD technique, and a redundancy matrix technique. The two-dimensional parity technique is disclosed in an article entitled "Coding Techniques for Handling Failures in Large Disk Arrays" by Patterson et al., Computer Science Tech. Report CSD88-477, Univ. of California, Berkeley, published in December, 1988, where the structure of RAID level 4 is expanded to arrange parity disks both transversely and longitudinally. This technique suffers from two drawbacks: the parity disks become a bottleneck because the parity is not distributed, and the parity disk overhead is increased.
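The two-dimensional parity technique described above can be sketched as follows. This is a simplified illustration and not the implementation of the cited report. It assumes one block per data disk, data disks arranged in a rows x cols grid, one XOR parity disk per row and one per column, and failures limited to data disks. The names xor_blocks, encode, and rebuild are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def encode(grid):
    """Return (row_parity, col_parity) for a rows x cols grid of data blocks."""
    row_parity = [xor_blocks(row) for row in grid]
    col_parity = [xor_blocks(col) for col in zip(*grid)]
    return row_parity, col_parity


def rebuild(grid, row_parity, col_parity, failed):
    """Rebuild failed data disks, given as a collection of (row, col) positions."""
    grid = [list(row) for row in grid]
    pending = set(failed)
    for r, c in pending:
        grid[r][c] = None  # contents of a failed disk are unavailable
    while pending:
        progressed = False
        for r, c in sorted(pending):
            row_others = [grid[r][j] for j in range(len(grid[r])) if j != c]
            col_others = [grid[i][c] for i in range(len(grid)) if i != r]
            if None not in row_others:
                # Row is otherwise intact: use the row parity disk.
                grid[r][c] = xor_blocks(row_others + [row_parity[r]])
            elif None not in col_others:
                # Column is otherwise intact: use the column parity disk.
                grid[r][c] = xor_blocks(col_others + [col_parity[c]])
            else:
                continue
            pending.discard((r, c))
            progressed = True
        if not progressed:
            raise ValueError("failure pattern cannot be rebuilt")
    return grid


data = [[b"\x01\x02", b"\x03\x04", b"\x05\x06"],
        [b"\x07\x08", b"\x09\x0a", b"\x0b\x0c"]]
rp, cp = encode(data)
# Two failed disks in the same row are recovered through the column parity.
restored = rebuild(data, rp, cp, {(0, 0), (0, 2)})
print(restored == [list(row) for row in data])
```

In this sketch every write to a data block also requires an update of one row parity disk and one column parity disk, which illustrates why dedicated, undistributed parity disks can become a bottleneck.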
The EVENODD technique is disclosed in an article entitled "EVENODD: An Efficient Scheme for Tolerating Double Disk Failures in RAID Architectures" by Blaum et al., IEEE Trans. on Computers, vol. 44, no. 2, pp. 192-202, issued in February, 1995, and is characterized by employing an optimum number of parity disks. Namely, if m (a prime number) disks are used as data disks, only two parity disks need be added, thus minimizing the parity disk overhead. This technique, however, may degrade performance due to the same parity disk bottleneck described for the two-dimensional parity technique, and it reduces the mean time to data loss (MTTDL), which is inversely proportional to the size of the error correction group. In addition, overhead exists for maintaining the data blocks of each diagonal line. Namely, when a writing operation is performed on the diagonal data blocks of each disk, all the values of the blocks of the diagonal parity disk must be updated.
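The difference in parity disk overhead between the two techniques can be made concrete with a short calculation. This sketch only counts disks and does not implement either encoding. It assumes a two-dimensional layout with one parity disk per row and one per column, and an EVENODD layout with m data disks (m prime) and two parity disks, as stated above. The function names and the example sizes are hypothetical.

```python
def is_prime(m: int) -> bool:
    """Simple trial-division primality check for small m."""
    if m < 2:
        return False
    return all(m % d for d in range(2, int(m ** 0.5) + 1))


def two_dim_parity_overhead(rows: int, cols: int) -> float:
    """Parity disks per data disk for a rows x cols two-dimensional layout."""
    return (rows + cols) / (rows * cols)


def evenodd_parity_overhead(m: int) -> float:
    """Parity disks per data disk for m data disks plus two parity disks."""
    if not is_prime(m):
        raise ValueError("m is taken to be a prime number")
    return 2 / m


# Hypothetical sizes: 25 data disks as a 5 x 5 grid versus 23 data disks.
print(two_dim_parity_overhead(5, 5))  # 10 parity disks for 25 data disks
print(evenodd_parity_overhead(23))    # 2 parity disks for 23 data disks
```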
U.S. Pat. Nos. 5,271,012 and 5,351,246 respectively entitled "Method and Means for Encoding and Rebuilding Data Contents of up to Two Unavailable DASDS in an Array of DASDS" and "Method and Means for Coding and Rebuilding that Data Contents of Unavailable DASDS in Error in the Presence of Reduced Number of Unavailable DASDS in a DASD Array" issued to Blaum et al respectively on Dec. 14, 1993 and Sep. 27, 1994, use diagonal parity and row parity to encode the disk array and rebuild the data contents of two failed disks. However, these patents have the same drawbacks as described for the EVENODD technique, above.
The redundancy matrix technique is disclosed in an article entitled "Efficient Placement of Parity and Data to Tolerate Two Disk Failures in Disk Array Systems" by Chan-Ik Park, IEEE Trans. on Parallel and Distributed Systems, vol. 6, no. 11, pp. 1177-1184, published in November, 1995, where parity blocks are distributed by defining N disks as an N×N redundancy matrix in order to resolve the parity disk bottleneck inherent in methods such as the two-dimensional parity technique. This technique can rebuild the data contents of two failed disks provided that the redundancy matrix meets certain criteria. Although distributing the parity and data blocks according to a placement algorithm resolves the parity disk bottleneck, the redundancy matrix technique incurs additional cost and overhead due to the complexity of the algorithm and the redundancy matrix. In addition, expanding the array requires rearranging the blocks according to the algorithm.