The present invention relates to a storage system used in computer systems. More particularly, the present invention relates to a storage system having a disk array of multiple disk drives having a two-head configuration for reading data from a sector, operating on the data, and writing the operated-on data back to the sector.
Conventionally, it is well known to use a disk array which employs a plurality of small-capacity disk drives as a memory for high performance and high reliability. The disk array consists of two or more disk drives that work as a single disk drive for the computer. Such a disk array has been disclosed in a publication entitled "A Case for Redundant Arrays of Inexpensive Disks (RAID)," ACM SIGMOD Conference Proceedings, Chicago, Ill., Jun. 1-3, 1988, pp. 109-116, by D. Patterson, et al.
The above-described publication by Patterson, et al. discloses, in particular, a technology concerning a data distributing layout on the disk array. The paper proposes several data distributing layout methods, a representative one of which arranges records, which are the data units read and written by a processing device, as is (that is, without splitting them) on the disk drives. In the following, such a data arrangement is called a distributing layout by record. In the Patterson paper, the distributing layout by record is treated as RAID4 and RAID5. One of the advantages of the distributing layout by record is that read/write operations can be performed independently on each of the disk drives that form the disk array. Thus, the distributing layout by record increases the number of read/write operations that can be performed simultaneously in the disk array, thereby enhancing the overall performance of the disk system.
Higher reliability of the disk array can be achieved by adopting a redundant configuration. In the distributing layout by record, a parity record, consisting of one record of parity data, is generated from a set of data records, one stored in each of multiple disk drives, and is itself stored as one record on a disk drive. A set consisting of a parity record and the data records for which the parity record was generated is called a parity group. Normally, the records in the same parity group are stored in different disk drives. One parity group may contain two or more parity records.
When one of the data records for which a parity record was generated fails, the contents of the failed data record can be recovered from the parity record of the parity group to which the failed data record belongs, and from the remaining data records in that parity group. Thus, if an error occurs in any of the disk drives in which a parity group is stored, the lost data can be restored. Normally, by setting the number of parity records in a parity group to n, it is possible to recover data in the parity group if failures occur in up to n disk drives.
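The recovery described above can be sketched as follows. This is a minimal illustration only, assuming a parity group with a single parity record computed as the bytewise exclusive OR (XOR) of its data records, as is usual for RAID4 and RAID5; the function names `xor_records`, `make_parity`, and `recover_record` are hypothetical and do not come from the publication.

```python
from functools import reduce

def xor_records(records):
    """Return the bytewise XOR of a list of equal-length records."""
    length = len(records[0])
    if any(len(r) != length for r in records):
        raise ValueError("all records must have the same length")
    # zip(*records) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*records))

def make_parity(data_records):
    """Assumption: the parity record is the XOR of all data records in the group."""
    return xor_records(data_records)

def recover_record(surviving_records, parity):
    """Rebuild the one missing data record from the survivors and the parity record."""
    return xor_records(list(surviving_records) + [parity])

# Hypothetical parity group of three data records, each on a different drive.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = make_parity(data)

# Suppose the drive holding data[1] fails; rebuild its record from the rest.
rebuilt = recover_record([data[0], data[2]], parity)
assert rebuilt == data[1]
```

With a single XOR parity record, only one lost record per parity group can be rebuilt this way; tolerating failures in up to n drives requires n parity records and a different code, which this sketch does not cover.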
With the distributing layout by record, when the contents of some of the data records in a parity group are changed by a write operation, the parity record needs to be updated. Computing the updated value of the parity record requires, in addition to the updated values of the data records being written, one of two sets of values: either the old values, before updating, of the data records being written and of the parity record, or the values of the remaining data records in the same parity group to which the data records being written belong.
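The two alternatives can be compared with a short sketch. It assumes, for illustration only, a single XOR parity record per parity group; the helper names `xor_bytes`, `parity_from_old_values`, and `parity_from_remaining` are hypothetical. Under that assumption both sets of values yield the same updated parity record.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length records."""
    if len(a) != len(b):
        raise ValueError("records must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))

def parity_from_old_values(old_data, new_data, old_parity):
    """Use the old data record and the old parity record."""
    # XOR-ing with the old data cancels its contribution to the parity;
    # XOR-ing with the new data then adds the new contribution.
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)

def parity_from_remaining(new_data, remaining_data):
    """Use the remaining (unwritten) data records of the parity group."""
    parity = new_data
    for record in remaining_data:
        parity = xor_bytes(parity, record)
    return parity

# Hypothetical parity group: three data records and one parity record.
d0, d1, d2 = b"\x0f\xf0", b"\x33\x55", b"\xa5\x5a"
old_parity = xor_bytes(xor_bytes(d0, d1), d2)
new_d1 = b"\x00\xff"  # updated value for the second data record

via_old = parity_from_old_values(d1, new_d1, old_parity)
via_rest = parity_from_remaining(new_d1, [d0, d2])
assert via_old == via_rest
```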
These values are acquired as preprocessing, before the updated values are written into the data records that are subjected to the write operation. The overhead of the preprocessing is smaller when acquiring the old values of the data records being written and of the parity record than when acquiring the values of the remaining data records in the same parity group. Hence, when a write operation takes place, the preprocessing generally acquires the old values of the data records being written and of the parity record. In other words, in the conventional disk array that adopts the distributing layout by record, the data record updating procedure consists of the following operations, in order of occurrence: reading the old data records and the old parity record before they are updated; generating the parity record that reflects the updated data records; and writing the updated data records and the updated parity record.
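The order of operations in this conventional updating procedure can be traced with a toy model. The sketch below assumes a single XOR parity record and replaces real drives with an in-memory stand-in; the `ToyDrive` class, the `update_data_record` function, and the drive names are hypothetical and serve only to show the sequence.

```python
class ToyDrive:
    """In-memory stand-in for a disk drive holding one record (hypothetical)."""
    def __init__(self, name, record):
        self.name = name
        self.record = record

def update_data_record(data_drive, parity_drive, new_data):
    """Perform one read-modify-write update and return the ordered step log."""
    log = []
    # Step 1: preprocessing, read the old data record and the old parity record.
    old_data = data_drive.record
    log.append(f"read old data from {data_drive.name}")
    old_parity = parity_drive.record
    log.append(f"read old parity from {parity_drive.name}")
    # Step 2: generate the updated parity (assumption: XOR parity).
    new_parity = bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))
    log.append("generate new parity")
    # Step 3: write the updated data record and the updated parity record.
    data_drive.record = new_data
    log.append(f"write new data to {data_drive.name}")
    parity_drive.record = new_parity
    log.append(f"write new parity to {parity_drive.name}")
    return log

data_drive = ToyDrive("drive0", b"\x12\x34")
other_drive = ToyDrive("drive1", b"\xab\xcd")
parity_drive = ToyDrive(
    "drive2", bytes(a ^ b for a, b in zip(data_drive.record, other_drive.record))
)

steps = update_data_record(data_drive, parity_drive, b"\x56\x78")

# The parity record still matches the XOR of the two data records.
assert parity_drive.record == bytes(
    a ^ b for a, b in zip(data_drive.record, other_drive.record)
)
```

Note that `other_drive` is never accessed during the update, which is why this approach needs less preprocessing than reading the remaining data records when only one record is written.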
In a disk array apparatus that holds parity data in the distributing layout by record, when a write operation is performed, the parity record of the parity group to which the data records being written belong is updated. At this time, the disk drives that contain a parity record of that parity group perform a read operation to read the old value of the parity record and a write operation to write the newly generated parity data in order to update the parity record. Because the reading of the old parity data and the writing of the updated parity data are performed on the same storage medium in the disk drive, that is, on the same area of the disk, the updated parity data can be written only after the disk has turned at least once following the reading of the old parity data.
On the other hand, in the disk drives that store the data records subjected to the write operation, the old data record values are read out before the updated data records are written, because the old values are needed to update the parity record. As in the case of the parity record, the reading of the old data records and the writing of the updated data records are performed on the same area of the disk. Hence, the writing of the updated data records must wait for the disk to turn at least once after the reading of the old data.
In this way, in the conventional disk arrays, the data record updating operation inevitably incurs disk rotational latency. For this reason, performance may in some cases be worse than that of ordinary disk drives.
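The size of this wait can be estimated with simple arithmetic. The sketch below assumes only that a read followed by a write to the same area costs at least one additional full rotation of the disk; the rotational speeds used are illustrative assumptions and are not taken from this description, and `rotation_time_ms` is a hypothetical helper.

```python
def rotation_time_ms(rpm):
    """Time for one full disk rotation, in milliseconds."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return 60_000.0 / rpm  # 60,000 ms per minute divided by rotations per minute

# Illustrative rotational speeds (assumed values, not from the source).
for rpm in (3600, 5400, 7200):
    extra_wait = rotation_time_ms(rpm)  # minimum added wait per updated record
    print(f"{rpm} rpm: at least {extra_wait:.2f} ms between read and write")
```

Under these assumptions the added wait is on the order of several milliseconds per update, incurred both on the drives holding the data records and on the drives holding the parity record.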