1. Field of the Invention
The present invention relates to a recording device and an array-type recording device for use as a storage system in a computer, and, more particularly, to an improvement in the performance and reliability of a disk drive system having a plurality of disk units arranged in an array.
2. Description of the Prior Art
Numerous publications and patents disclose disk systems comprising a plurality of disk units arranged in an array. Among these, the article "A Case for Redundant Arrays of Inexpensive Disks (RAID)", Proc. ACM SIGMOD Conf., Chicago, Ill., June 1988, published by the University of California, Berkeley, describes a method of dramatically improving the reliability of stored data. The article classifies methods of improving data reliability into five levels, ranging from a conventional mirror disk system to a block-interleaved parity system. The respective levels are outlined below.
RAID level 1
This is a conventional mirror (shadow) system in which identical data are stored in two groups of disk units. RAID level 1 has generally been employed in systems requiring a high level of reliability. However, the redundancy is considerable, and the cost per unit capacity is accordingly high.
RAID level 2
A Hamming code of the type used in DRAM is applied at this level. Data are stored on the data disks of a redundancy group in a bit-interleaved manner. To enable correction of a one-bit error, ECC codes are written to a plurality of check disks for each group (one group comprising 10 to 25 disks). Four check disks are required for ten data disks, so the redundancy is considerable.
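The figure of four check disks for ten data disks can be checked against the usual bound for a single-error-correcting Hamming code, under which r check bits protect m data bits when 2^r >= m + r + 1. The following is a minimal sketch of that arithmetic; the function name check_disks_needed is hypothetical and is introduced only for illustration.

```python
def check_disks_needed(data_disks: int) -> int:
    """Smallest r satisfying 2**r >= data_disks + r + 1.

    This is the standard bound for a single-error-correcting Hamming code,
    treating each disk as one bit position of the code word.
    """
    r = 0
    while 2 ** r < data_disks + r + 1:
        r += 1
    return r


if __name__ == "__main__":
    for m in (10, 25):
        print(f"{m} data disks -> {check_disks_needed(m)} check disks")
```

Under this bound, the relative overhead falls as the group grows, although it remains higher than that of the single-parity schemes of levels 3 to 5.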
RAID level 3
A predetermined parity disk is used, and data are recorded on the data disks of a group in a byte-interleaved manner. Since the position of an error can be located by the ECC of each drive, only one parity disk is needed. This system is suitable for high-speed transfer with synchronously rotating spindles.
RAID level 4
A predetermined parity disk is used, and data are recorded on the data disks of a group in a block-interleaved manner. RAID level 4 differs from RAID level 3 in the interleave unit. Because data are recorded block by block, this system is suitable for cases where small volumes of data are accessed frequently.
RAID level 5
Unlike levels 3 and 4 above, no predetermined parity disk is provided. Parity data are distributed, i.e., striped, across the constituent disks. As a consequence, write operations do not concentrate load on a parity disk, and IOPS may be increased (the greater the proportion of writes, the more advantageous this system is compared with RAID level 4). Both performance in use and capacity efficiency are good.
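The difference in parity load between levels 4 and 5 may be illustrated as follows. This is a minimal sketch only: the article does not prescribe the rotation used here, and the rule by which parity moves one disk to the left on each successive stripe, together with the names parity_disk_raid4, parity_disk_raid5 and parity_write_load, are assumptions made for illustration.

```python
from collections import Counter


def parity_disk_raid4(stripe: int, n_disks: int) -> int:
    """Level 4: parity always resides on the same (here, the last) disk."""
    return n_disks - 1


def parity_disk_raid5(stripe: int, n_disks: int) -> int:
    """Level 5 (assumed rotation): parity shifts one disk left per stripe."""
    return (n_disks - 1 - stripe) % n_disks


def parity_write_load(placement, n_stripes: int, n_disks: int) -> Counter:
    """Count parity updates per disk when every stripe is written once."""
    return Counter(placement(s, n_disks) for s in range(n_stripes))


if __name__ == "__main__":
    print("level 4:", dict(parity_write_load(parity_disk_raid4, 10, 5)))
    print("level 5:", dict(parity_write_load(parity_disk_raid5, 10, 5)))
```

With five disks and ten stripes, all ten parity updates fall on one disk at level 4, whereas under the assumed rotation each disk receives two at level 5.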
An example of a conventional array-type recording device having redundancy is disclosed in Japanese Patent Laid-Open Publication No. 236714/90, which describes a system and method of driving array-type disks. In this system, the redundancy level and the logical number of component disk units as viewed from a host computer can be selected.
With reference to the striping of parity data, Japanese Patent Laid-Open Publication No. 293356/87 discloses a data protection mechanism and a method thereof.
FIG. 1 illustrates the constitution of the conventional array-type recording device disclosed in the aforesaid Japanese Patent Laid-Open Publication No. 236714/90. In FIG. 1, reference numeral 1 designates a host computer; 2 a host interface (hereinafter called host I/F) which serves as a buffer between host computer 1 and an array controller 13; 3 a microprocessor adapted to control the whole of array controller 13; 4 a memory; 5 an ECC engine adapted to generate redundancy data and reconstruct data; 6 a data path commonly connected to host I/F 2, microprocessor 3, memory 4 and ECC engine 5; 7 a CE panel; and 8a through 8e channel controllers. CE panel 7 and the plurality of channel controllers 8a-8e are connected to data path 6. Reference numerals 9a-9e designate disk units; 10 channels through which the plurality of disk units 9a-9e are connected to channel controllers 8a-8e; 11 stand-by disks; 12 spare channels through which the plurality of stand-by disks 11 are connected to channel controller 8; and 13 the array controller for controlling the plurality of disk units 9a-9e and stand-by disks 11.
Operation of the illustrated recording device will next be explained. In FIG. 1, data are recorded and reproduced by host computer 1 through host I/F 2. Upon storage of data, commands and data from host computer 1 are stored in memory 4 by way of data path 6. Upon reproduction of data, the data stored in memory 4 is transferred to host computer 1 by way of host I/F 2.
Operation of the recording device in the case of RAID level 5 will now be explained. The data stored in memory 4 is divided by microprocessor 3 into data blocks, and disk units in which data is to be written and disk units in which redundancy data is to be written are determined. In RAID level 5, since old data stored in a data block corresponding to the writing operation is required for updating redundancy data, a reading operation is executed prior to the writing operation. Data is transferred between memory 4 and channel controllers 8, 8a-8e by way of data path 6, and redundancy data is generated by ECC engine 5 synchronously with the transfer of data.
For example, when data 1024 bytes long is written with the data block length set at 512 bytes, the data is striped into two blocks, and data disk units 9a, 9b to be written and redundancy-data disk unit 9e are determined. Next, under the control of microprocessor 3, ECC engine 5 is activated, and a command to read out the old data needed for computing redundancy data is issued to channel controllers 8a, 8b and 8e, to which data disk units 9a, 9b and redundancy-data disk unit 9e are connected. After the old data has been read out of data disk units 9a, 9b and redundancy-data disk unit 9e, the new data is written to data disk units 9a, 9b in accordance with a command from microprocessor 3, and the updated redundancy data generated by ECC engine 5 is written to redundancy-data disk unit 9e. The completion of the data writing operation is then reported to host computer 1. As described above, upon writing data, old data must be read out in advance for the purpose of generating redundancy data, resulting in a longer processing time.
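The read-before-write sequence of this example may be sketched as follows. The sketch assumes that the redundancy data is a byte-wise XOR parity over the blocks of the redundancy group, which the description above does not state explicitly; the helper names xor_blocks, stripe and updated_parity, the in-memory dictionary standing in for the disk units, and the fill values are likewise hypothetical.

```python
from functools import reduce

BLOCK_SIZE = 512  # block length in bytes, as in the example above


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks (assumed parity function)."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def stripe(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Divide host data into consecutive blocks of block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def updated_parity(old_parity: bytes, old_blocks, new_blocks) -> bytes:
    """New parity = old parity XOR old data XOR new data."""
    return xor_blocks(old_parity, *old_blocks, *new_blocks)


# Hypothetical contents of one stripe: 9a, 9b, 9c hold data, 9e holds parity.
disks = {
    "9a": bytes([0x11]) * BLOCK_SIZE,
    "9b": bytes([0x22]) * BLOCK_SIZE,
    "9c": bytes([0x33]) * BLOCK_SIZE,
}
disks["9e"] = xor_blocks(disks["9a"], disks["9b"], disks["9c"])

new_data = bytes(range(256)) * 4          # 1024 bytes from the host
new_a, new_b = stripe(new_data)           # two 512-byte blocks

# Read phase: old data and old parity must be fetched before writing.
old_a, old_b, old_parity = disks["9a"], disks["9b"], disks["9e"]

# Write phase: new data to 9a and 9b, updated parity to 9e.
disks["9e"] = updated_parity(old_parity, [old_a, old_b], [new_a, new_b])
disks["9a"], disks["9b"] = new_a, new_b
```

Under this assumption, disk unit 9c need not be read, but three reads still precede the three writes, which is the source of the longer processing time noted above.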
Reading out data will next be explained. When a data reading command is issued from host computer 1, the data block and the data disk unit in which relevant data is stored are computed by microprocessor 3. For example, if relevant data is stored in disk unit 9c, a command to read the data is issued to channel controller 8c to which disk unit 9c is connected. When the reading of the data from disk unit 9c is completed, the read data is transferred to memory 4 and the completion of reading the data is reported to host computer 1.
Reconstruction of data upon occurrence of an abnormality, and reconstruction of data onto stand-by disks, will next be explained. Data is reconstructed when, for example, it becomes impossible to read data from disk unit 9c. In such a case, microprocessor 3 reads data from all the other disk units of the redundancy group containing the data block that microprocessor 3 was attempting to read, and the data block that could not be read is reconstructed by ECC engine 5.
For example, if the redundancy group includes disk units 9a, 9b, 9c and 9e, data blocks are read from disk units 9a, 9b and 9e, and the data of disk unit 9c is reconstructed by ECC engine 5. The reconstructed data is then transferred to memory 4, and the completion of the data reading operation is reported to host computer 1. Thus, even if it becomes impossible to read out data due to an abnormality in a disk unit, the data can be reconstructed, which enhances the reliability of data storage.
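The reconstruction in this example may be sketched as follows, again on the assumption that the redundancy data is a byte-wise XOR parity; the description above states only that ECC engine 5 performs the reconstruction. The names xor_blocks and reconstruct_block and the sample block contents are hypothetical.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks (assumed parity function)."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def reconstruct_block(surviving_blocks) -> bytes:
    """Recover the unreadable block from all other blocks of the group.

    With XOR parity, the XOR of every surviving block (data and parity)
    equals the missing block.
    """
    return xor_blocks(*surviving_blocks)


# Hypothetical redundancy group: 9a, 9b, 9c hold data, 9e holds parity.
group = {
    "9a": b"\x01\x02\x03\x04",
    "9b": b"\x10\x20\x30\x40",
    "9c": b"\xaa\xbb\xcc\xdd",
}
group["9e"] = xor_blocks(group["9a"], group["9b"], group["9c"])

# Disk unit 9c cannot be read: rebuild its block from 9a, 9b and 9e.
survivors = [group[name] for name in ("9a", "9b", "9e")]
rebuilt_9c = reconstruct_block(survivors)
```

The same computation, repeated for every block of the disabled unit, would yield the data to be written to a stand-by disk.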
Data may also be reconstructed when, for example, disk unit 9c becomes disabled. In this case, microprocessor 3 reads data from all the other disk units of the redundancy group to which disk unit 9c belongs, and the data for disk unit 9c is reconstructed by ECC engine 5. The reconstructed data is written to one of the stand-by disks.
If the redundancy group is composed, for example, of disk units 9a, 9b, 9c and 9e, data are read out from disk units 9a, 9b and 9e, and the data for disk unit 9c is reconstructed by ECC engine 5. The reconstructed data is written to one of stand-by disks 11, so that the contents of disk unit 9c are rebuilt on that stand-by disk 11. In this way, disabled disk unit 9c may be replaced with stand-by disk 11. It is to be noted, however, that if any abnormality occurs in stand-by disk 11 during the replacement of disk unit 9c, the replacement process becomes complicated. Furthermore, since the replacement process is executed while the system is in operation, the performance of the system is degraded during the process.
Since the conventional array-type recording device is constituted as described above, it has a serious problem in that, even in normal operation, data must be read out in advance in order to generate redundancy data at the time of a data writing operation, and the processing time becomes correspondingly longer.
The conventional device has a further problem in that, during a process of replacement of disks in the case of abnormality, if a stand-by disk used for replacement is damaged, the performance of the system is further degraded.
The conventional device has still another problem in that the possibility of a fault in the channels to which the constituent disks are connected is not taken into consideration.