In recent computer systems, data required by a host unit, such as a CPU (central processing unit), meaning the main computer in a system of computers or terminals connected by communication links, is stored in a secondary storage device, and data read/write operations are carried out on the secondary storage device upon receipt of a demand issued from the CPU. Generally speaking, a nonvolatile storage medium is utilized as this secondary storage device; typically, a magnetic disk unit, an optical disk unit, or the like is employed as this nonvolatile storage medium. It should be noted that a disk unit will be referred to as a "drive" hereinafter.
In recent high-technology computer systems, there has been a strong demand to considerably increase the performance of the secondary storage device. As one possible solution for increasing performance, a disk array arranged by employing a large number of drives, each having a relatively small storage capacity, may be considered.
In the report "A Case for Redundant Arrays of Inexpensive Disks (RAID)" written by D. Patterson, G. Gibson and R. H. Katz, the performance and reliability of the disk arrays (levels 3 and 5) are described. In the level 3 disk array, data is subdivided and the subdivided data are processed in parallel. In the level 5 disk array, data is distributed and the distributed data are handled independently.
First, a description will be made of the level 3 disk array, in which data is subdivided and the subdivided data are processed in parallel. The disk array is arranged by employing a large number of drives, each having a relatively small capacity. One piece of write data transferred from the CPU is subdivided into a plurality of subdivided data, from which parity data is then formed. These subdivided data are stored into a plurality of drives in parallel. Conversely, when the data is read out, the subdivided data are read from the respective drives in parallel and combined, and the combined data is then transferred to the CPU. It should also be noted that a group of plural data and error correction data will be called a "parity group". In this specification, this terminology will also be employed in cases where the error correction data does not correspond to parity data. This parity data is used to recover the data stored in a drive where a fault happens to occur, based upon the data and parity data stored in the remaining drives into which the subdivided data have been stored. In a disk array arranged with a large number of drives, the probability of the occurrence of a fault is increased due to the increased number of components, so such parity data is prepared to improve the reliability of the disk array.
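The subdivision, parity formation, and recovery described above can be illustrated with a minimal sketch, assuming byte-wise XOR parity (the usual construction for such arrays); the helper names are hypothetical, and a real array operates on sector-sized units:

```python
# Minimal sketch of level-3-style subdivision with byte-wise XOR parity.
# Function names are illustrative; real drives work at the sector level.
def split_and_parity(data: bytes, n_drives: int):
    """Subdivide write data into n_drives pieces and compute XOR parity."""
    chunk = -(-len(data) // n_drives)             # ceiling division
    padded = data.ljust(chunk * n_drives, b"\0")  # pad so pieces are equal
    stripes = [padded[i * chunk:(i + 1) * chunk] for i in range(n_drives)]
    parity = bytearray(chunk)
    for s in stripes:
        for i, b in enumerate(s):
            parity[i] ^= b                        # each parity byte = XOR of all stripes
    return stripes, bytes(parity)

def recover(stripes, parity, failed: int) -> bytes:
    """Rebuild the stripe of a failed drive from the remaining stripes and parity."""
    rebuilt = bytearray(parity)
    for j, s in enumerate(stripes):
        if j != failed:
            for i, b in enumerate(s):
                rebuilt[i] ^= b
    return bytes(rebuilt)
```

Because XOR is its own inverse, XORing the parity with every surviving stripe leaves exactly the bytes of the lost stripe.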
Next, the level 5 disk array, in which data is distributed and the distributed data are handled independently, will now be explained. In this disk array, the plural data are not subdivided but rather are handled separately; parity data is produced from a plurality of data, and these data are distributively stored into drives each having a relatively small capacity. As previously explained, this parity data is used to recover the data stored in a drive where a fault happens to occur.
Recently, in the secondary storage device of a large-scale general purpose computer system, while one drive is occupied in response to one read/write command, this drive cannot serve other commands, and therefore many waiting conditions happen to occur. In accordance with this disk array, since the data are distributively stored into the plural drives, even when the number of read/write demands is increased, the demands are distributively processed by the plural drives, so that such waiting conditions for the read/write demands are suppressed.
In the secondary storage devices of these disk arrays, the storage positions (addresses) of the respective data are fixed to predetermined addresses, and when either a data read operation or a data write operation is performed from the CPU, the CPU accesses these fixed addresses.
In the level 5 disk array, in which the data is distributed and the distributed data are handled independently similarly to the large-scale general purpose computer system, a large process overhead is required when the parity data is newly produced during a write operation. This large process overhead will now be explained.
FIG. 11 schematically shows the parity data forming method during a data writing operation in the level 5 disk array, in which the data is distributed and the distributed data are handled independently, as described in the above-described publication "RAID" by D. Patterson et al. The data present at each address corresponds to a unit accessed in a single read/write process, and the respective data are independent. Also, in the architecture described for RAID, the addresses of the data are fixed. As previously explained, in such a disk array system, setting the parity data is absolutely required in order to improve the reliability of the system. In this disk array system, the parity data is formed from the data at the same address within the respective drives. That is to say, the parity data is produced from the data at the same address (1.1) in the plural drives #1 to #4, and the resultant parity data is stored into address (1.1) of the drive used to store the parity data. In this system, the read/write process is performed by accessing the relevant data in the respective drives, which is similar to the recent large-scale general purpose computer system. In such a disk array, when data is written into, for instance, address (2.2) of drive #3, first both the old data stored at address (2.2) of drive #3 and the old parity data stored at address (2.2) of drive #5 are read out (step 1), and these read data are exclusive-ORed with the data to be written, to newly produce parity data (step 2).
After the parity data has been formed, the write data is stored into address (2.2) of drive #3 and the new parity data is stored into address (2.2) of drive #5 (step 3).
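Steps 1 to 3 above can be sketched as follows, assuming a simple in-memory model of the drives (the dictionary layout and address keys are hypothetical illustrations, not the actual drive interface); the new parity is the exclusive-OR of the old data, the write data, and the old parity:

```python
# Sketch of the level-5 read-modify-write parity update (steps 1-3 above).
# "drives" maps drive number -> {address: data block}; this model is illustrative.
def update_block(drives, parity_drive, drive_no, addr, new_data):
    old_data = drives[drive_no][addr]        # step 1: read the old data
    old_parity = drives[parity_drive][addr]  # step 1: read the old parity
    # step 2: new parity = old data XOR write data XOR old parity
    new_parity = bytes(d ^ w ^ p
                       for d, w, p in zip(old_data, new_data, old_parity))
    drives[drive_no][addr] = new_data        # step 3: write the new data
    drives[parity_drive][addr] = new_parity  # step 3: write the new parity
```

Note that only two drives are touched; the other data drives need not be read, which is why this scheme is used even though it costs two reads and two writes per update.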
As shown in FIG. 12, in the level 5 disk array, when the old data stored in the data storage drive and the old parity data stored in the parity data storage drive are read out, an average waiting time of 1/2 revolution of the disk is required. Thereafter, these read data are used to produce the new parity data. Then, to store this newly produced parity data and the write data, a further waiting time of 1 revolution of the disk is required, since the target sector has already passed under the head by the time the new parity data is ready. As a result, an average waiting time of 1.5 revolutions of the disk is required to rewrite the data. In a drive, such a waiting time of 1.5 revolutions may cause a great process overhead. To reduce the process overhead required in the data writing operation, a method for dynamically converting the address of the data write destination has been disclosed in PCT patent applications WO 91/16711 and WO 91/20076 filed by Storage Technology Corporation (simply referred to as "STK").
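To put the 1.5-revolution penalty in concrete terms, the following worked example converts it into milliseconds; the 3600 rpm rotation rate is an illustrative assumption, not a figure taken from the specification:

```python
# Worked example of the 1.5-revolution average rewrite penalty.
# The 3600 rpm figure is an illustrative assumption only.
RPM = 3600
turn_ms = 60_000 / RPM              # one full revolution in milliseconds
read_wait = 0.5 * turn_ms           # average 1/2 turn to reach old data and parity
write_wait = 1.0 * turn_ms          # one further full turn to return and write
total_ms = read_wait + write_wait   # 1.5 turns per data rewrite
```

At 3600 rpm a revolution takes about 16.7 ms, so each rewrite incurs roughly 25 ms of rotational waiting on average, which is the overhead the STK method aims to avoid.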
In accordance with this conventional method, when any one of the data designated by the CPU is updated, all of the data belonging to the virtual track containing this data are read into the cache memory, a portion thereof is updated by the update data, and when the updated data belonging to the virtual track is written out, the data is subdivided into sector units of the physical track. A new parity group is produced from the subdivided data, or from a combination of the subdivided data and other write data, and this new parity group is written into an empty region of a drive. In this case, the data belonging to the original virtual track is invalidated. The length of the data constituting a parity group is determined in such a manner that the data has the capacity of a single physical track of a drive. At a proper timing, the valid data are acquired from a partially invalidated cylinder containing the invalidated data and are written into other regions, so that this cylinder is turned into a cylinder constituted only of empty regions.
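The essence of this dynamic address conversion, writing updates to a fresh empty region, invalidating the old copy, and later reclaiming invalidated regions, can be sketched as follows; the class and attribute names are hypothetical, and parity-group formation is omitted for brevity:

```python
# Sketch of the STK-style dynamic write-destination conversion.
# All names here are illustrative; parity handling is omitted.
class LogArray:
    def __init__(self):
        self.store = {}       # physical address -> data
        self.addr_map = {}    # logical (virtual) address -> physical address
        self.invalid = set()  # physical addresses holding invalidated data
        self.next_free = 0    # next empty physical region

    def write(self, logical, data):
        """Write updated data to an empty region; invalidate the old copy."""
        old = self.addr_map.get(logical)
        if old is not None:
            self.invalid.add(old)          # the original copy becomes invalid
        self.store[self.next_free] = data  # sequential write, no read-modify-write
        self.addr_map[logical] = self.next_free
        self.next_free += 1

    def read(self, logical):
        return self.store[self.addr_map[logical]]

    def compact(self):
        """Reclaim invalidated regions (analogous to emptying a cylinder)."""
        for addr in self.invalid:
            del self.store[addr]
        self.invalid.clear()
```

Because every update lands in an empty region, the old data and old parity need not be read back first, which removes the rotational waits described above at the cost of the later compaction pass.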
In accordance with this method, since the data constituting a parity group has the length of a physical track, the capacity of the cache memory required to hold a plurality of data until the parity group is formed becomes great.