The present invention relates to a method for controlling a control unit with cache memory for a disk array, and to a storage unit subsystem which is composed of an array of disks and a control unit with cache memory.
The prior art most relevant to the present invention is David A. Patterson et al, "A Case for Redundant Arrays of Inexpensive Disks (RAID)", ACM SIGMOD Conference Proceeding, Chicago, Ill., Jun. 1-3, 1988, pp. 109-116.
The Patterson et al. article discloses a technique concerning the distribution of data on a disk array.
A disk array is physically composed of a plurality of small-scale disk units, but it is a disk system which operates as a single disk unit from the viewpoint of a processor. The disk array is thus a mechanism for attaining improved performance and high reliability.
The Patterson et al. article proposes several data distribution methods. According to a typical one of the disclosed methods, a record, which is the read/write unit for a processor, is placed on a single disk unit as it is. In the present invention, this distribution will hereinafter be referred to as data distribution by record; in the article it corresponds to RAID4 or RAID5. The article also proposes a data distribution method in which one record is divided into a plurality of pieces of data and the individual pieces are distributed on a plurality of disk units, respectively. A feature of the data distribution by record is that a read/write process can be performed independently for each of the disk units which constitute the disk array. In the case where one record is divided and distributed on a plurality of disk units, by contrast, a read/write process for one record monopolizes the plurality of disk units. Accordingly, the data distribution by record increases the number of read/write processes that can be performed concurrently in the disk array, thereby improving the performance of the disk array as a whole.
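The difference in concurrency between the two distributions can be illustrated with a minimal sketch. The round-robin placement of records on disk units and the function names are assumptions made for illustration only; they are not taken from the article.

```python
def disks_used_by_record(record_no, num_disks):
    """Data distribution by record: the whole record resides on one disk unit
    (round-robin placement is assumed here)."""
    return {record_no % num_disks}


def disks_used_divided(record_no, num_disks):
    """Divided distribution: each record is split across all disk units."""
    return set(range(num_disks))


def can_run_concurrently(disks_a, disks_b):
    """Two read/write processes can overlap only if they share no disk unit."""
    return disks_a.isdisjoint(disks_b)


NUM_DISKS = 4

# Records 0 and 1 reside on different disk units, so both accesses can proceed at once.
by_record = can_run_concurrently(disks_used_by_record(0, NUM_DISKS),
                                 disks_used_by_record(1, NUM_DISKS))

# With divided records, each access occupies every disk unit, so they serialize.
divided = can_run_concurrently(disks_used_divided(0, NUM_DISKS),
                               disks_used_divided(1, NUM_DISKS))

print(by_record, divided)
```

Under this placement, two records that happen to map to the same disk unit would still have to be accessed one after the other.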
On the other hand, the high reliability of the disk array is realized by storing redundant data, called parity data, in the disk units. Hereinafter, a record storing data read or written by a processor will be referred to as a data record, and a record storing redundant data will be referred to as a parity record. In the data distribution by record, a parity record is generated from a group of data records, each of which is stored in a respective disk unit of the disk array. An assembly of a parity record and the data records from which the parity record is generated is termed a parity group. Usually, records in the same parity group are stored in separate disk units. One parity group may include one or more parity records.
In the case where an error occurs in any one of the data records from which a parity record was generated, the content of the faulty data record is recovered from the contents of the parity record and the other data records. Accordingly, even if an error occurs in any disk unit in the assembly of disk units in which a parity group is stored, data can be recovered. Usually, if the number of parity records in one parity group is n, data in the parity group can be recovered even if errors occur in as many as n disk units.
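The recovery described above can be illustrated with a minimal sketch for a parity group that has one parity record. It assumes that the parity record is the bytewise exclusive-OR (XOR) of the data records, which is the usual single-parity scheme but is not stated in the text above. The function names are hypothetical.

```python
from functools import reduce


def xor_records(records):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*records))


def make_parity(data_records):
    """Generate one parity record from the data records of a parity group
    (XOR parity is an assumption of this sketch)."""
    return xor_records(data_records)


def recover_record(surviving_data_records, parity_record):
    """Rebuild the single faulty data record from the parity record and the
    other data records of the same parity group."""
    return xor_records(list(surviving_data_records) + [parity_record])


# A parity group of three data records, each stored in a separate disk unit.
group = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(group)

# Suppose the disk unit holding the second data record fails.
recovered = recover_record([group[0], group[2]], parity)
assert recovered == group[1]
```

A single XOR parity record tolerates the loss of only one record per parity group; tolerating n faulty disk units with n parity records requires a different code, which this sketch does not attempt.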
In the case of the data distribution mentioned above, a parity record must be updated each time the content of a data record is changed by a write process. Therefore, the performance of a write process is degraded as compared with the conventional disk device. In addition, determining the updated value of the parity record requires a preprocess for obtaining one of the following sets (1) and (2) of values:
(1) the old values (hereinafter, update before values) of the data record that is the object of the write process and of the parity record; and
(2) the values of the other data records in the parity group to which the data record that is the object of the write process belongs.
The values in (1) can be acquired with small overhead. Therefore, when a write process occurs, the method of acquiring the values in (1) is usually employed. To read the values in (1), a disk unit must be accessed twice even in the case where the parity group includes only one parity record. Further, to write the updated value of the data record that is the object of the write process and the updated value of the parity record, a disk unit must be accessed twice more. Accordingly, four disk accesses are required in total. In the case of the conventional disk, on the other hand, it is only required that the updated value of the record that is the object of the write process be written into a disk unit. That is, the number of disk accesses required for a write request in a disk array using the data distribution by record is four times that in the conventional disk.
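The two ways of determining the updated parity value, and the count of four disk accesses, can be sketched as follows. The sketch assumes XOR parity with one parity record per parity group; the class `CountingDisk` and the function names are hypothetical and serve only to count accesses.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def new_parity_from_old_values(old_data, old_parity, new_data):
    """Set (1): use the update before values of the data record and the parity record."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


def new_parity_from_other_records(other_data_records, new_data):
    """Set (2): use the other data records in the same parity group."""
    parity = new_data
    for record in other_data_records:
        parity = xor_bytes(parity, record)
    return parity


class CountingDisk:
    """Hypothetical disk unit holding one record and counting its accesses."""

    def __init__(self, record):
        self.record = record
        self.accesses = 0

    def read(self):
        self.accesses += 1
        return self.record

    def write(self, record):
        self.accesses += 1
        self.record = record


def write_record(data_disk, parity_disk, new_data):
    """Write one data record using set (1)."""
    old_data = data_disk.read()        # access 1: update before value of the data record
    old_parity = parity_disk.read()    # access 2: update before value of the parity record
    new_parity = new_parity_from_old_values(old_data, old_parity, new_data)
    data_disk.write(new_data)          # access 3: updated value of the data record
    parity_disk.write(new_parity)      # access 4: updated value of the parity record


data = [b"AAAA", b"BBBB", b"CCCC"]
data_disks = [CountingDisk(r) for r in data]
parity_disk = CountingDisk(new_parity_from_other_records(data[1:], data[0]))

write_record(data_disks[0], parity_disk, b"ZZZZ")

# Both methods yield the same updated parity value.
assert parity_disk.record == new_parity_from_other_records(data[1:], b"ZZZZ")
total_accesses = sum(d.accesses for d in data_disks) + parity_disk.accesses
assert total_accesses == 4
```

With set (2), the number of reads grows with the number of data records in the parity group, which is why set (1) is the cheaper choice in this sketch once the group holds more than a few data records.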
No technique is known for speeding up a write process in a disk array which uses the data distribution by record. However, the following techniques are known for speeding up a write process in a general disk unit.
JP-A-55-157053 discloses a technique for improving the speed of a write request in a control unit having a disk cache by using a write after process. The control unit completes a write process at the point when write data received from a processor has been written into the cache. Thereafter, the data stored in the cache is written into a disk unit through a write after process by the control unit.
JP-A-59-135563 discloses a technique concerning a control unit which speeds up a write process while ensuring high reliability. The control unit has a nonvolatile memory as well as a cache memory, so that write data received from a processor is stored in both the cache memory and the nonvolatile memory. The write data is written into a disk unit through a write after process by the control unit. The high reliability of the write after process is thereby attained.
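The write after process with a cache memory and a nonvolatile memory can be sketched in outline as follows. This is an illustrative model only, not the implementation of either cited reference: the class `WriteAfterControlUnit`, its method names, and the use of dictionaries to stand in for the disk unit and the memories are all assumptions.

```python
class WriteAfterControlUnit:
    """Hypothetical control unit that completes writes in the cache and
    writes the data into the disk unit later."""

    def __init__(self, disk):
        self.disk = disk          # dict standing in for a disk unit: record id -> data
        self.cache = {}           # cache memory
        self.nonvolatile = {}     # nonvolatile copy of data not yet written to disk

    def write(self, record_id, data):
        """Store the write data in both memories and report completion at once."""
        self.cache[record_id] = data
        self.nonvolatile[record_id] = data
        return "complete"

    def read(self, record_id):
        """Serve the read from the cache when possible, otherwise from the disk unit."""
        if record_id in self.cache:
            return self.cache[record_id]
        return self.disk[record_id]

    def write_after(self):
        """Write all pending data into the disk unit, then release the nonvolatile copies."""
        for record_id in list(self.nonvolatile):
            self.disk[record_id] = self.nonvolatile[record_id]
            del self.nonvolatile[record_id]


disk = {}
unit = WriteAfterControlUnit(disk)
status = unit.write(7, b"new data")   # completes before any disk access
assert status == "complete" and disk == {}
unit.write_after()                    # the disk unit is updated later
assert disk == {7: b"new data"}
```

In this model, the nonvolatile copy is what allows pending write data to survive a loss of the cache contents before `write_after` has run.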
JP-A-60-114947 discloses a technique concerning a control unit which controls disk units for double-write and has a cache memory or disk cache. When receiving a write request from a processor, the control unit writes the write data received from the processor into one disk unit and into the cache memory. Later, asynchronously with read/write requests from the processor, the control unit writes the write data stored in the cache memory into the other disk unit.
JP-A-2-37418 discloses a technique for attaining a speedup by applying a disk cache to disk units for double-write. A control unit has a nonvolatile memory as well as a cache memory, so that write data received from a processor is stored in both the cache memory and the nonvolatile memory. The control unit writes the write data into two disk units through a write after process.
JP-A-3-37746 discloses a technique concerning a control unit which has a disk cache and performs a write after process, or more particularly, a technique concerning a management data structure of write after data in the disk cache which is intended for efficient execution of the write after process in such a control unit.
Each of the above prior art references, which disclose a write after process using a disk cache (hereinafter simply abbreviated to cache) for a usual or conventional disk unit, shows a simple technique by which write data received by the cache from a processor is written into the disk unit. However, in the case of a disk array using the data distribution by record, it is necessary to generate the updated value of a parity record. Therefore, the overhead for a write process becomes large as compared with the conventional disk unit. Accordingly, how the updated value of the parity record is generated is the key to speeding up the write process in a disk array using the data distribution by record. By contrast, in the conventional disk unit, such consideration is unnecessary since no updated value of a parity record is required.