The present invention is related to U.S. application Ser. No. 07/979,275 filed on Nov. 20, 1992 and Ser. No. 08/034,389 filed on Mar. 18, 1993, the subject matter of which is herein incorporated by reference.
The present invention relates to a control operation of a disk array system, and in particular, to a data renewal method applicable to a disk array system adopting a level 5 system of Redundant Arrays of Inexpensive Disks (RAID), which will be described in detail later.
Recent development of information processing requires a secondary storage system to attain high performance in a computer system. As a solution, there has been considered a disk array system including a plurality of disk devices (to be called drives herebelow) each having a relatively small capacity. Such an apparatus has been described in, for example, an article by D. Patterson, G. Gibson, and R. H. Katz, "A Case for Redundant Arrays of Inexpensive Disks (RAID)," presented at the ACM SIGMOD Conference, Chicago, Ill. (June 1988).
In the article, there have been reported results of a discussion on the performance and reliability of disk arrays (level 3) in which data is subdivided for concurrent operations thereof and disk arrays (level 5) in which data is distributed for independent processing thereof.
Description will next be given of the level 5 disk array system in which distributed data is independently handled. In the level 5 system, each data item is not subdivided but is independently treated such that data items are distributed to be stored or written in a large number of disk drives each having a relatively small capacity. In a secondary storage apparatus of a mainframe system commonly used at present, each drive has a large capacity. Consequently, there frequently appears a case where a read/write request is issued to a drive while the drive is being occupied for another read/write request and hence the pertinent processing is obliged to enter a wait state due to the unavailable drive.
In a system of disk arrays of this type, the large-capacity drive used as the secondary storage of a mainframe computer system is replaced with many drives each having a small capacity such that data is distributed to be stored therein. Consequently, even when the number of data read and write requests is increased, such input and output requests can be distributed to the plural drives of the disk array system to achieve processing, thereby minimizing the chance that processing enters the wait state for read and write requests due to unavailable drives. However, since a disk array includes many drives, the total number of constituent parts thereof, and hence the probability of occurrence of a failure, is increased. To cope therewith, namely, to improve reliability of the array system, a parity check is conducted in the data read and write operations.
FIG. 21 illustratively shows a data control procedure of the RAID proposed by D. Patterson et al. in the above article. There are shown internal data addresses of each drive in a level 5 disk array in which each data is distributed to a plurality of disk drives and the respective drives are handled independently of each other. Data items at the respective addresses form a unit of processing to be carried out for each read or write operation. In the processing, the groups of data items are handled independently of each other. Moreover, according to an architecture described in the article of RAID, the address is fixed for each data.
In this system, as already described above, it is indispensable to employ the parity check to improve reliability of the system. In the system, the parity data is created according to data at an identical address of each drive. Namely, the parity data is produced according to four data items at the same address (2,2) of the respective drives #1 to #4 and is then written at the address (2,2) of the drive #5 assigned for the parity storage. In this connection, according to the level 5 RAID, the parity data of all data is not necessarily stored in the drive #5. That is, for distributing each data, there are assigned a plurality of drives for the data and a drive for the parity code.
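As a rough illustration of the parity creation described above, the following Python sketch computes a parity block as the bytewise exclusive-OR of the data blocks held at the same address in the drives #1 to #4. The function name `make_parity` and the sample block contents are assumptions introduced only for illustration; they do not appear in the cited article.

```python
from functools import reduce

def make_parity(blocks):
    """Return the bytewise XOR of equally sized data blocks."""
    if not blocks:
        raise ValueError("at least one data block is required")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of bytes per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical contents of the address (2,2) on the drives #1 to #4.
data_blocks = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0", b"\xaa\x55"]

# The result would be written at the address (2,2) of the parity drive #5.
parity = make_parity(data_blocks)
```

Because the exclusive-OR is its own inverse, the same function applied to the parity block and any three of the data blocks yields the fourth data block.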
In the system, as in the mainframe system commonly used today, when conducting a data read or write operation for data of a drive, an access is made to the drive with specification of a physical address at which the data is written in the drive. Namely, as shown in FIG. 22, the physical address identifies a cylinder position to which a track in which the data is stored belongs, a position of the track in the cylinder, namely, a disk surface, and a position of the record in the track. Specifically, the address is represented with a (drive) number of the pertinent drive 12 in which the requested data is stored and CCHHR including a cylinder address (CC) denoting a cylinder number in the drive, a head address (HH) indicating a number assigned to a head 140 to select a track in the cylinder, and a record address (R).
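The physical address described above may be pictured as a small record, as in the following Python sketch. The class name `PhysicalAddress`, the field names, and the hexadecimal display widths (four digits each for CC and HH, two digits for R) are assumptions chosen for illustration and are not taken from FIG. 22.

```python
from typing import NamedTuple

class PhysicalAddress(NamedTuple):
    drive: int     # number of the drive in which the data is stored
    cylinder: int  # CC: cylinder number in the drive
    head: int      # HH: head number selecting a track in the cylinder
    record: int    # R: position of the record in the track

    def cchhr(self) -> str:
        # Hypothetical display format: CC and HH as 4 hex digits, R as 2.
        return f"{self.cylinder:04X}{self.head:04X}{self.record:02X}"

# Illustrative values only: record 1 on the track under head 5 of
# cylinder 2 in the drive #3.
addr = PhysicalAddress(drive=3, cylinder=2, head=5, record=1)
text = addr.cchhr()
```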
In a disk array structured as above, in a case where a renewal or an update operation is desired to be conducted for data stored, for example, at an address (2,2) of the drive #3 shown in FIG. 21, there are conducted read operations for before-update or old data at the address (2,2) of the drive #3 and old parity data at the address (2,2) of the drive #5 assigned to store therein parity data related to the pertinent data (step (1)). These data items are exclusive-ORed with new data to produce new parity data (step (2)). After the parity data creation, the new data for the renewal is written at the address (2,2) of the drive #3 and the new parity data is stored at the address (2,2) of the drive #5 (step (3)).
As above, in the disk array, parity data is generated from the pertinent data such that the parity data is written in a drive in the same fashion as for the data. In this operation, however, the parity data is written in a drive other than the drive in which the data associated therewith is stored. As a result, at an occurrence of a failure in the drive in which the data is written, it is possible to rebuild or restore the data stored therein.
In the disk array system, like in the mainframe system commonly used at present, a storage location (address) of each data is fixed beforehand in the secondary storage system. Namely, when reading or writing data from or in the storage, the CPU accesses an address assigned to the data in advance.
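Steps (1) to (3) above can be sketched in Python as follows. This is only a minimal model under stated assumptions: the drives are represented by a hypothetical dictionary `drives` holding one single-byte block per drive for the address (2,2), and the helper names `xor_blocks` and `renew` are introduced here for illustration.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise exclusive-OR of two equally sized blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))

def renew(drives, data_drive, parity_drive, new_data):
    # Step (1): read the old data and the old parity data.
    old_data = drives[data_drive]
    old_parity = drives[parity_drive]
    # Step (2): exclusive-OR old data, old parity data, and new data.
    new_parity = xor_blocks(xor_blocks(old_data, old_parity), new_data)
    # Step (3): write the new data and the new parity data.
    drives[data_drive] = new_data
    drives[parity_drive] = new_parity

# Hypothetical contents of the address (2,2); the drive #5 holds the parity.
drives = {1: b"\x01", 2: b"\x02", 3: b"\x04", 4: b"\x08", 5: b"\x0f"}
renew(drives, data_drive=3, parity_drive=5, new_data=b"\x07")
```

Only the two drives concerned are accessed; the data of the drives #1, #2, and #4 need not be read to obtain the new parity data.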
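The rebuilding of data at a drive failure can be sketched in Python as shown below, assuming that the parity data is the exclusive-OR of the data blocks at the same address. The function name `rebuild_block` and the sample values are hypothetical.

```python
def rebuild_block(surviving_blocks):
    """XOR the surviving data blocks and the parity block together.

    With exclusive-OR parity, the result equals the block of the failed drive.
    """
    if not surviving_blocks:
        raise ValueError("at least one surviving block is required")
    result = bytearray(len(surviving_blocks[0]))
    for block in surviving_blocks:
        if len(block) != len(result):
            raise ValueError("all blocks must have the same length")
        for i, value in enumerate(block):
            result[i] ^= value
    return bytes(result)

# Hypothetical example: the drive #3 (original content 0x04) has failed.
# Surviving blocks: data of the drives #1, #2, #4 and parity of the drive #5.
surviving = [b"\x01", b"\x02", b"\x08", b"\x0f"]
restored = rebuild_block(surviving)
```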
In a level 5 disk array configuration, as shown in FIG. 23, in order to read old data and old parity data from drives respectively related thereto, there is necessitated a wait time for a disk latency, namely, on average a half of the period of time for one disk rotation. After the old data and the old parity data are read therefrom, new parity data is created. To write the new parity data in the drive, there is required a disk latency of one disk rotation. As a result, a total disk latency of 1.5 disk rotations elapses, which is quite a considerable overhead time.
To decrease the overhead time in the data write operation, there has been proposed in WO 91/20076 filed by the STK Corporation a dynamic mapping method of dynamically mapping or translating, after the data renewal, address information of a group (to be called a parity group) including other data and parity data each having an address equal to that of the renewed data.
On the other hand, in the JP-A-4-230512, there has been described a renewal and recording method and a renewal and recording apparatus for use with direct access storage device (DASD) arrays. According to the method, a data block and a parity block of each parity group are distributed to the DASD arrays in a format which is not affected by failures. Moreover, in each drive, there is reserved a free space or an unused space to be shared among a plurality of logical groups. In a first cycle, the old data block and the old parity block are read from the arrays. Based on the data and new data for renewal, there is computed parity data. In the unused space in which the old data and the old parity data have been held, the new data and the new parity block are stored through a shadow write operation.
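To give a feeling for the magnitude of this overhead, the following Python sketch converts the 1.5 rotations into time. The rotational speed of 3600 revolutions per minute is an assumed figure chosen only for illustration, and the function name `write_overhead_ms` is hypothetical.

```python
def write_overhead_ms(rpm: float) -> float:
    """Latency overhead of one renewal, in milliseconds."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    rotation_ms = 60_000.0 / rpm       # time for one disk rotation
    read_latency = 0.5 * rotation_ms   # wait before reading the old items
    write_latency = 1.0 * rotation_ms  # wait before writing the new items
    return read_latency + write_latency

# Assumed drive speed: 3600 rpm, i.e. about 16.7 ms per rotation,
# which gives an overhead of about 25 ms per renewal.
overhead = write_overhead_ms(3600)
```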
However, when conducting the dynamic mapping described in WO 91/20076, there arise the following problems.
(1) When the dynamic mapping is carried out, the area in which the old parity group is stored remains in the drive as a space not to be used. Consequently, when it is desired to perform a dynamic mapping, there possibly occurs a disadvantageous case where it is impossible to reserve a new storage area for the dynamic mapping.
(2) When address information is controlled in the form of a table, the capacity of the table is increased, which increases the cost of the memory. Moreover, the address control operation is complicated and hence the processing overhead time is increased.
On the other hand, according to the method of the JP-A-4-230512 employing the unused space, the problem (1) is not considered to take place. However, as for the address control operation of Item (2) above, a method associated therewith has not been clearly described. Furthermore, description has not been given of countermeasures to cope with an increase in the number of drives and an occurrence of a failure in the system.