1. Field of the Invention
The present invention relates to a disk array subsystem that connects and operates a plurality of disk drives in parallel, and to a data generation method therefor, and particularly to a disk array subsystem that reduces the write penalty in RAID4 and RAID5 (RAID: redundant array of inexpensive disks).
2. Description of the Prior Art
A disk array subsystem is a magnetic disk device that connects and operates a plurality of inexpensive small disk drives in parallel so as to realize performance equivalent to that of a SLED (single large expensive disk). A general arrangement of a disk array is shown in FIG. 7. This disk array comprises a disk array controller 2, which is connected to a host computer 1 via a host interface 12, and a plurality of disk drives 18, which are connected to the disk array controller 2 and operate in parallel. The disk array controller 2 comprises a host interface controller 3 for temporarily storing a read or write instruction from the host computer 1, a CPU 4 for controlling the operation of the disk array controller 2, a cache memory 6 for storing data transferred between the host computer 1 and the disk drives 18, a cache memory controller 5 for controlling the cache memory 6, and disk controllers 7 for controlling data transfer between the disk array controller 2 and the disk drives 18.
At the time of reading, when the cache memory controller 5 confirms that the requested data exists in the cache memory 6, the data is transferred from the cache memory 6 to the host computer 1 via the host interface 12. When the requested data does not exist in the cache memory 6, the CPU 4 causes the data to be read from the disk drive 18 that holds it and stored in the cache memory 6 via the disk controller 7 and the cache memory controller 5. The cache memory controller 5 transfers the data to the host computer 1 after the storing ends or in parallel with the storing.
At the time of writing, write data transferred from the host computer 1 is stored in the cache memory 6 by the cache memory controller 5 via the host interface 12 and the host interface controller 3. The cache memory controller 5 writes the write data into the disk drive 18 designated by the CPU 4 via the disk controller 7 after the storing ends or in parallel with the storing.
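The read and write paths described above can be sketched as follows. This is only an illustrative model, not the controller of FIG. 7: the class name, the dictionary-based cache, and the access counter are assumptions introduced for the example, and the transfer "in parallel with the storing" is simplified to a sequential one.

```python
class DiskArrayControllerSketch:
    """Toy model of the read and write paths through the cache memory."""

    def __init__(self, disks):
        self.disks = disks      # hypothetical: one dict per disk drive, address -> bytes
        self.cache = {}         # (disk index, address) -> bytes
        self.disk_reads = 0     # counts reads that actually reached a disk drive

    def read(self, disk, address):
        key = (disk, address)
        if key not in self.cache:
            # Cache miss: stage the data from the disk drive into the cache.
            self.cache[key] = self.disks[disk][address]
            self.disk_reads += 1
        # Cache hit (or freshly staged data): transfer from the cache to the host.
        return self.cache[key]

    def write(self, disk, address, data):
        # Write data is stored in the cache first, then written to the disk drive.
        self.cache[(disk, address)] = data
        self.disks[disk][address] = data


controller = DiskArrayControllerSketch([{0: b"old"}, {0: b"other"}])
first = controller.read(0, 0)    # miss: one disk read
second = controller.read(0, 0)   # hit: served from the cache
controller.write(1, 0, b"new")
```

Only the first read reaches a disk drive; the second is served from the cache, which is the effect the cache memory 6 is intended to provide.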
To maintain reliability, the disk array subsystem generates parity from the data stored on a plurality of data disks and stores it on a parity disk. In RAID4, the parity is stored on a fixed, dedicated disk drive. In RAID5, to avoid the performance reduction caused by concentration of accesses on the parity disk, the parity is distributed evenly across all the disk drives, so that the drive holding the parity differs from one unit of data to the next.
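As an illustration of parity generation and placement, the following sketch assumes that the parity is the bytewise exclusive OR of the data blocks, as is usual for RAID4 and RAID5. The function names, the choice of the last drive as the RAID4 parity disk, and the particular RAID5 rotation are assumptions made for the example and are not taken from the arrangement described above.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise exclusive OR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk_index(stripe, num_disks, raid_level):
    """Return the drive that holds the parity for a given stripe (hypothetical layout)."""
    if raid_level == 4:
        return num_disks - 1                      # fixed parity disk (assumed: last drive)
    if raid_level == 5:
        return (num_disks - 1 - stripe) % num_disks  # rotates over all drives (assumed order)
    raise ValueError("only RAID4 and RAID5 are modelled")


data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_blocks(data)

# If one data block is lost, it can be rebuilt from the others and the parity.
rebuilt = xor_blocks([data[0], data[2], parity])

raid5_layout = [parity_disk_index(s, 4, 5) for s in range(4)]
```

With four drives, the RAID5 rule above places the parity on a different drive for each of four consecutive stripes, whereas the RAID4 rule always returns the same drive.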
U.S. Pat. No. 5,191,584 discloses a data updating method in a disk array subsystem of RAID4 or RAID5. According to this data updating method, when the disk array controller accesses one data disk per processing unit, there is no need to access all the data disks, even when writing data. The disk array controller calculates new parity data as the exclusive OR of the old data of the data disk to be written, the old parity data of the parity disk, and the new data transferred from the host computer, and updates the parity disk with the new parity data. Therefore, other processes can be executed on disks other than the data disk to be written and the parity disk. Particularly in RAID5, no single disk is designated as the parity disk, so that a plurality of write processes can be executed at the same time.
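The parity calculation described above can be checked with a short sketch. The helper names and sample values are hypothetical; the code only demonstrates that the exclusive OR of the old data, the old parity data, and the new data yields the same parity as recomputing it from every data disk.

```python
def xor_all(*blocks):
    """Bytewise exclusive OR of any number of equally sized blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            result[i] ^= value
    return bytes(result)


def updated_parity(old_data, old_parity, new_data):
    """New parity from the one data block being rewritten and the old parity."""
    return xor_all(old_data, old_parity, new_data)


stripe = [b"\x01\x02", b"\x10\x20", b"\x0a\x0b"]   # three data disks (sample values)
old_parity = xor_all(*stripe)

new_block = b"\xff\xee"                            # replaces the block on disk 1
new_parity = updated_parity(stripe[1], old_parity, new_block)

# Reference result: recompute the parity by reading every data disk.
full_recompute = xor_all(stripe[0], new_block, stripe[2])
```

Because the old data cancels out of the old parity under exclusive OR, the data disks not being written never need to be read.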
A problem with this method is that, because the parity disk must be updated whenever data is written, the five processes indicated below are required, as shown in FIG. 8, and the processing capacity is lowered.
1) Reading the old data from the data disk at the corresponding address
2) Reading the old parity data from the parity disk at the corresponding address
3) Writing new data on the data disk at the corresponding address
4) Calculating the exclusive OR of the new data, old data, and old parity data and obtaining new parity data
5) Writing the new parity data on the parity disk
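The five processes listed above can be sketched as a single routine that also counts the disk drive accesses. The `CountingDisk` class and the `small_write` function are hypothetical names introduced for illustration; they model the sequence of FIG. 8 only in outline.

```python
class CountingDisk:
    """Hypothetical disk drive model that counts every read and write."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)
        self.accesses = 0

    def read(self, address):
        self.accesses += 1
        return self.blocks[address]

    def write(self, address, data):
        self.accesses += 1
        self.blocks[address] = data


def xor3(a, b, c):
    return bytes(x ^ y ^ z for x, y, z in zip(a, b, c))


def small_write(data_disk, parity_disk, address, new_data):
    old_data = data_disk.read(address)                  # process 1)
    old_parity = parity_disk.read(address)              # process 2)
    data_disk.write(address, new_data)                  # process 3)
    new_parity = xor3(new_data, old_data, old_parity)   # process 4), no disk access
    parity_disk.write(address, new_parity)              # process 5)
    return new_parity


d0 = CountingDisk({0: b"\x12\x34"})
d1 = CountingDisk({0: b"\xab\xcd"})
p = CountingDisk({0: bytes(x ^ y for x, y in zip(b"\x12\x34", b"\xab\xcd"))})

small_write(d0, p, 0, b"\xff\x00")
total_accesses = d0.accesses + d1.accesses + p.accesses
```

In this model a single host write results in four disk drive accesses, two on the data disk and two on the parity disk, while the other data disk is not accessed at all.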
The four processes other than 4) among the aforementioned five processes each involve an access to a disk drive, which reduces the performance of the disk array subsystem. This performance reduction, caused by the additional disk drive accesses needed to update the parity disk when data is written, is called a write penalty. Conventionally, one known way of reducing the write penalty is the use of pseudo-parity, which works as follows. Where a plurality of disk drives have a common parity and new data is written to one of the disk drives, the corresponding new parity data, i.e., the parity data obtained by calculating the exclusive OR of the new data and the corresponding old data, is called the pseudo-parity of the new data. The pseudo-parity serves as new parity data common to all of the plurality of disk drives corresponding to the new data, and is therefore called a pseudo-parity.
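A minimal sketch of the pseudo-parity calculation follows. The pseudo-parity itself is computed as described above, from the new data and the old data only. How it is subsequently combined with the stored parity is not stated in this passage; the sketch assumes it is applied by a further exclusive OR with the old parity data, which follows from the properties of exclusive OR. All names and sample values are hypothetical.

```python
def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def pseudo_parity(old_data, new_data):
    """Exclusive OR of the new data and the corresponding old data."""
    return xor_bytes(old_data, new_data)


def apply_pseudo_parity(old_parity, pseudo):
    """Assumed step: fold the pseudo-parity into the stored parity."""
    return xor_bytes(old_parity, pseudo)


old_data = b"\x12\x34"
other_data = b"\xab\xcd"                 # block on another disk sharing the parity
new_data = b"\xff\x00"

old_parity = xor_bytes(old_data, other_data)
pseudo = pseudo_parity(old_data, new_data)
new_parity = apply_pseudo_parity(old_parity, pseudo)
```

Under this assumption the pseudo-parity can be formed without reading the parity disk, so the parity update can be separated from the data write.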