1. Field of the Invention
The present invention relates to a large capacity storage system and, more particularly, to a split parity spare disk achieving method for improving the defect endurance and performance of a RAID (Redundant Arrays of Inexpensive Disks) subsystem.
2. Description of the Related Art
The performance of a computer system depends on a central processing unit and an input/output subsystem. Recently, the development of VLSI (Very Large Scale Integration) techniques has led to a great improvement in the processing speed of the central processing unit. However, since the input/output subsystem has shown only a slight improvement in performance, the proportion of input/output processing time to the entire processing time of the system has gradually increased. Furthermore, as the cost of recovering data after an error in the input/output subsystem has gradually increased, it has become necessary to develop an input/output subsystem having superior performance and reliability. One line of research on improving the performance of the input/output subsystem concerns the RAID subsystem. A general input/output subsystem sequentially inputs and outputs data to and from one disk drive, whereas the RAID subsystem implements input/output operations in parallel by distributively storing data in a disk array consisting of a plurality of disk drives. Hence, the input/output operation is processed rapidly. Even if there is an error, the data can be recovered by using simple parity information, so the reliability is improved. Currently, RAID technology has moved beyond the stage of establishing theory and is commercially available. Universities have actively pursued theoretical work through studies of RAID algorithms and simulation experiments. Enterprises have endeavored to improve input/output performance and to ensure reliability by identifying areas for improvement through various performance measurements. Disk arrays have been used in supercomputers such as the Cray to improve disk drive input/output. The concept of the RAID subsystem was established with a publication by three computer scientists of the University of California, Berkeley, in the United States in 1988.
The RAID theory is applicable to sequential access devices such as cartridge tape among input/output devices, but it is mainly concerned with hard disk devices.
A RAID structure is classified into six RAID levels, from level 0 to level 5, according to its characteristics. Each of the six RAID levels has merits and demerits depending on the environment it suits, and the levels are used in various application fields. Each RAID level offers a solution to a different data storage or reliability problem. The contents of each RAID level will now be described.
RAID Level 0
RAID level 0 emphasizes performance rather than the reliability of the data. The data is distributively stored across all the disk drives of the disk array. Different controllers are used to connect the disk drives of the disk array to each other. RAID level 0 has the advantage that input/output performance is improved by simultaneously accessing the data through the different controllers.
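By way of illustration only, the distributive storage described above can be sketched as a mapping from a logical block number to a disk drive and an offset on that drive. The round-robin, block-granular placement and the name `locate_block` are assumptions made for this sketch; the text above does not specify a stripe unit or a placement rule.

```python
def locate_block(logical_block: int, num_disks: int) -> tuple:
    """Map a logical block to (disk index, block offset on that disk).

    Assumes round-robin striping with a stripe unit of one block.
    """
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    return logical_block % num_disks, logical_block // num_disks


# Eight consecutive logical blocks spread over a hypothetical 4-drive array.
layout = [locate_block(block, 4) for block in range(8)]
for block, (disk, offset) in enumerate(layout):
    print(f"logical block {block} -> disk {disk}, offset {offset}")
```

Under this assumed mapping, consecutive logical blocks fall on different drives, which is what allows them to be accessed simultaneously.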
RAID Level 1
The contents of each disk drive are stored identically in a copy disk drive. Such a method is called a mirroring system. The mirroring system improves the performance of the disk drive but imposes an economic burden. That is, RAID level 1 has the disadvantage that only 50% of the disk capacity is usable, which is costly in a system requiring a large disk capacity, such as a database system. However, since the same data exists in the copy disk drive, RAID level 1 is preferable for maintaining reliability.
RAID Level 2
RAID level 2 is used to reduce the economic burden of RAID level 1. RAID level 2 distributively stores data in each disk drive constituting the disk array in units of a byte. A Hamming code is used to detect and correct errors. Hence, RAID level 2 has several check disk drives in addition to the data disk drives.
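As a non-limiting illustration of how a Hamming code detects and corrects an error, the following sketch uses a Hamming(7,4) code, in which 4 data bits are protected by 3 check bits. The choice of this particular code, and the reading of each bit position as one disk drive (4 data drives and 3 check drives), are assumptions for the example; the text above does not state which Hamming code is used.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7).

    Check bits sit at positions 1, 2 and 4; data bits at 3, 5, 6 and 7.
    """
    code = [0] * 8  # index 0 is unused so that indices match positions
    code[3], code[5], code[6], code[7] = data_bits
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    return code[1:]


def hamming74_correct(word):
    """Return (data bits, error position); position 0 means no error found."""
    code = [0] + list(word)
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position  # XOR of the positions of all set bits
    if syndrome:
        code[syndrome] ^= 1  # a single-bit error sits at the syndrome position
    return [code[3], code[5], code[6], code[7]], syndrome


data = [1, 0, 1, 1]
stored = hamming74_encode(data)
damaged = list(stored)
damaged[4] ^= 1  # flip the bit at position 5 (list index 4)
recovered, error_position = hamming74_correct(damaged)
print(recovered, error_position)
```

This sketch corrects only a single erroneous bit per codeword, which is the usual guarantee of a Hamming code of this size.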
RAID Level 3
When an input/output operation is needed, the input/output of the data to the disk drives is carried out in parallel. Parity data is stored in an additional disk drive. The spindle motors driving the disks are synchronized so that all the disk drives may simultaneously input/output the data. Therefore, it is possible to transmit the data rapidly even if the input/output operation of the data is not simultaneously implemented. If there is a failure in one disk drive, the lost data can be recovered by using the disk drives which are operating normally and the parity disk drive. In this case, the overall data rate is lowered.
RAID level 3 is used in a supercomputer, an image manipulation processor, and other systems requiring a very fast data transmission rate. RAID level 3 shows high efficiency in the transmission of a long data block (for example, about 50 data blocks) but is ineffective for a short data block (for example, about 5 data blocks). RAID level 3 uses one disk drive for redundancy together with the data disk drives. Consequently, RAID level 3 needs fewer disk drives than RAID level 1, but the controller becomes complicated and expensive.
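The recovery of a failed disk drive from the surviving drives and the parity drive can be sketched as follows. The sketch assumes that the parity is the bytewise exclusive OR of the data blocks, which is the usual form of simple parity; the helper name `xor_blocks` and the sample bytes are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks (zip stops at the shortest one)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data drives and one parity drive, each holding a two-byte block.
data = [b"\x10\x20", b"\x0f\xf0", b"\xaa\x55"]
parity = xor_blocks(data)

# Drive 1 fails: XOR the surviving data blocks with the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

Because every surviving drive must be read to rebuild each block, the sketch also suggests why the overall data rate drops while a drive is failed.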
RAID Level 4
In RAID level 4, data is striped across a plurality of disk drives constituting the disk array. In other words, a storage area of each disk drive is divided into a plurality of regions each having a striping size in units of a block, and the data corresponding to the striping size is stored across the disk drives. Parity data calculated from the data is stored in an additional disk drive within the disk array.
In RAID level 4, data can be recovered after a failure, and the read performance is similar to that of RAID level 1. However, the write performance is considerably lower than that of a single disk drive, since the parity information must be written to a specially provided disk drive (which creates a bottleneck). RAID level 5, whose write performance is improved, compensates for this shortcoming of RAID level 4.
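The bottleneck mentioned above can be illustrated by counting which drives are touched by a series of small writes. The sketch assumes a hypothetical array of 4 data drives plus one dedicated parity drive (index 4), round-robin block placement, and one parity update per written block; none of these figures comes from the text above.

```python
from collections import Counter


def raid4_write_targets(logical_block: int, num_data_disks: int) -> tuple:
    """Return (data disk, parity disk) touched by a one-block write."""
    data_disk = logical_block % num_data_disks
    parity_disk = num_data_disks  # the dedicated parity drive is the last one
    return data_disk, parity_disk


writes = Counter()
for block in range(12):
    data_disk, parity_disk = raid4_write_targets(block, 4)
    writes[data_disk] += 1
    writes[parity_disk] += 1

for disk in sorted(writes):
    print(f"disk {disk}: {writes[disk]} writes")
```

Under these assumptions each data drive receives 3 writes while the parity drive receives all 12, so the parity drive limits the write throughput of the array.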
RAID Level 5
In RAID level 5, data is striped across each disk drive of the disk array. In order to eliminate the bottleneck during writing, the parity data is distributively stored in all the disk drives. When writing the data, since the data written in all the disk drives should be read to recalculate the parity data, the speed is as slow as that of RAID level 4. However, it is possible to process data input/output transmissions simultaneously. The data of a failed disk drive can be recovered.
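One way to picture the distributed parity is a layout in which the parity block moves to a different drive on each successive stripe. The rotation rule below is only one of several possible layouts and is an assumption of this sketch, as are the function names and the 5-drive array.

```python
from collections import Counter


def raid5_parity_disk(stripe: int, num_disks: int) -> int:
    """Parity starts on the last drive and moves one drive left per stripe."""
    return (num_disks - 1 - stripe) % num_disks


def raid5_locate(logical_block: int, num_disks: int) -> tuple:
    """Return (stripe, data disk, parity disk) for a logical block."""
    stripe, index = divmod(logical_block, num_disks - 1)
    parity = raid5_parity_disk(stripe, num_disks)
    data_disks = [d for d in range(num_disks) if d != parity]
    return stripe, data_disks[index], parity


# Parity updates caused by one-block writes to 20 consecutive logical blocks.
parity_writes = Counter(raid5_locate(block, 5)[2] for block in range(20))
for disk in sorted(parity_writes):
    print(f"disk {disk}: {parity_writes[disk]} parity writes")
```

With this assumed rotation the parity updates are spread evenly over all five drives, in contrast to a single dedicated parity drive.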
RAID level 5 is effective in writing long data. If reading data is given much weight in an application, or if write performance is given much weight in the array design, RAID level 5 may also be effective in writing short data. If the size of the data block is reduced, adequate performance and data availability can be obtained. RAID level 5 is very cost-effective in comparison with a non-array device.
RAID level 5 has a structure in which no data is lost even if one disk drive constituting the disk array fails. However, if recovery work is not done immediately after the disk drive fails, an additional failure may occur and data may be lost. To prevent the loss of data, RAID level 5 has an on-line spare disk drive or a hot-spare disk drive.
U.S. Pat. No. 5,530,948 to S. M. Rezaul Islam entitled System And Method For Command Queuing On Raid Levels 4 And 5 Parity Drives provides further discussion of the various RAID levels and contemplates a system providing a set of mass storage devices that collectively perform as one or more logical mass storage devices utilizing command queuing on parity drives in RAID level 4 and RAID level 5 systems.
U.S. Pat. No. 5,388,108 to Robert A. DeMoss, et al., entitled Delayed Initiation Of Read-Modify-Write Parity Operations In A RAID Level 5 Disk Array contemplates a method of generating new parity information by reading old data and old parity information from first and second disk drives, respectively, and exclusively ORing the old data and old parity information with new data.
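The exclusive OR relationship mentioned above, in which new parity is obtained from the old data, the old parity and the new data, can be checked with a short sketch. It illustrates only the XOR identity and is not a reproduction of the method of the cited patent; the function names and sample bytes are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def rmw_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """New parity = old data XOR old parity XOR new data."""
    return xor_bytes(xor_bytes(old_data, old_parity), new_data)


stripe = [b"\x01\x02", b"\x0f\x0f", b"\xf0\x0a"]
old_parity = xor_bytes(xor_bytes(stripe[0], stripe[1]), stripe[2])

# Overwrite the block on drive 1 without reading drives 0 and 2.
new_block = b"\x33\x44"
new_parity = rmw_parity(stripe[1], old_parity, new_block)

# Cross-check against parity recomputed from the whole updated stripe.
full_parity = xor_bytes(xor_bytes(stripe[0], new_block), stripe[2])
print(new_parity == full_parity)
```

The update reads only the drive being written and the drive holding the parity, rather than every drive in the stripe.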
U.S. Pat. No. 5,331,646 to Mark S. Krueger et al. entitled Error Correcting Code Technique For Improving Reliability Of A Disk Array contemplates a system having a large number of data disk drives, a plurality of parity disk drives and a pair of spare disk drives, wherein each data disk drive is included in at least two parity chains and there are no two data drives associated with the same combination of parity chains.
It is noted here that the spare drive within the disk array is not used when the disk array is operating normally, that is, when there is no drive failure requiring the system to replace any of the data disk drives or the parity disk drives. Accordingly, the non-use of the spare disk drive is a waste of resources. Consequently, the performance of the above-noted RAID subsystems is lowered.