1. Field of the Invention
The present invention relates generally to a disk array control program, method and apparatus for processing input/output requests from an upper apparatus, directed to a disk array system provided with a plurality of disk apparatuses having a redundant configuration and, more particularly, to a disk array control program, method and apparatus for recovering from an error detected in patrol processing of a disk apparatus by writing data from a normal disk apparatus.
2. Description of the Related Arts
Conventionally, in a disk array subsystem comprising disk apparatuses in a RAID level redundant configuration, patrol processing is performed mainly to detect a medium defect in a disk apparatus. FIG. 1 shows a conventional disk array subsystem 100, which is provided with a channel adaptor 102 connecting a host 101, device adaptors 106-1, 106-2 connecting a primary disk apparatus 108 and a secondary disk apparatus 110 which are in a RAID level redundant configuration, and a central processing module 104 which is disposed between the channel adaptor 102 and the device adaptors 106-1, 106-2 to process I/O requests from the host 101. A patrol function for the disk apparatuses is provided in the device adaptors 106-1, 106-2 and, when a patrol execution command is received from the central processing module 104, the device adaptors 106-1, 106-2 execute patrol processing for their own primary disk apparatus 108 and secondary disk apparatus 110. The patrol processing of the device adaptors 106-1, 106-2 is performed by reading out block data for each logical block address from a target disk apparatus to a buffer on the device adaptor and by checking a check code included in the block data to determine whether or not an error exists. If the device adaptor 106-2 detects an error in the secondary disk apparatus 110, the central processing module 104 is notified of the error and the patrol processing is terminated. When the central processing module 104 is notified of the error by the device adaptor 106-2, if the error is a recoverable error other than a medium defect, the central processing module 104 eliminates the error by reading out the data at the logical block address corresponding to the error location from the primary disk apparatus 108, which is a normal disk apparatus, and by writing the data into the secondary disk apparatus 110 in which the error was detected (see, e.g., Japanese Patent Application Laid-Open Publication Nos. 2003-36146 and 1992-285773).
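The conventional patrol and recovery flow described above can be illustrated by the following minimal sketch. It is not taken from the cited publications. The names `make_block`, `block_is_valid`, `patrol` and `recover_block` are hypothetical, the disks are modeled as in-memory lists indexed by logical block address, and the check code is assumed to be a CRC-32 computed over the logical block address and the payload.

```python
import zlib
from typing import List, Optional

CHECK_CODE_LEN = 4  # assumption: a 4-byte CRC-32 check code trails each block


def make_block(lba: int, payload: bytes) -> bytes:
    """Build block data: payload followed by a check code over (LBA, payload)."""
    crc = zlib.crc32(lba.to_bytes(4, "big") + payload)
    return payload + crc.to_bytes(CHECK_CODE_LEN, "big")


def block_is_valid(lba: int, block: bytes) -> bool:
    """Recompute the check code for this LBA and compare it with the stored one."""
    payload, stored = block[:-CHECK_CODE_LEN], block[-CHECK_CODE_LEN:]
    expected = zlib.crc32(lba.to_bytes(4, "big") + payload)
    return expected.to_bytes(CHECK_CODE_LEN, "big") == stored


def patrol(disk: List[bytes]) -> Optional[int]:
    """Read each block in LBA order; stop at the first check code error.

    Returns the failing LBA (to be reported to the central processing
    module), or None if the whole disk passes.
    """
    for lba, block in enumerate(disk):
        if not block_is_valid(lba, block):
            return lba
    return None


def recover_block(primary: List[bytes], secondary: List[bytes], lba: int) -> None:
    """Overwrite the erroneous block with the copy read from the normal disk."""
    secondary[lba] = primary[lba]
```

In this sketch, a caller playing the role of the central processing module would invoke `patrol` on the secondary disk and, when an address is returned, call `recover_block` for that address before resuming the patrol.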
However, in such error recovery processing based on conventional patrol processing, although an error of a disk apparatus can be eliminated once it is detected by the patrol processing, the following problem arises when, in write processing to the primary disk apparatus and the secondary disk apparatus in response to a write request from the host, a write failure occurs in the secondary disk apparatus such that, for example, data at another address are rewritten instead of the data at the correct address. Before the write failure occurs, the primary disk apparatus 108 and the secondary disk apparatus 110 of FIG. 1 store identical data at identical addresses. In this situation, when old data D1 are rewritten with new data D11 in accordance with a write request 112 from the host as shown in FIG. 2A, it is assumed that data D3 at another address are rewritten with the new data D11 in the secondary disk apparatus 110 due to an error 114, although the primary disk apparatus 108 is rewritten normally. In the situation of FIG. 2A, if the patrol processing is performed on the secondary disk apparatus 110, a check code abnormality 116 is determined for the data D11 as shown in FIG. 2B, and recovery processing 118 is performed to eliminate the error by reading out the correct data D3 corresponding to the error location from the normal primary disk apparatus 108 and by writing the data into the secondary disk apparatus 110. However, at the address in the secondary disk apparatus 110 where the rewrite was not performed correctly in FIG. 2A, the data D1 remain unchanged as the old data from before the rewrite, resulting in a lost write 120, and are not consistent with the new data D11 in the primary disk apparatus 108. Since the redundancy of the data is still lost in this portion, the conventional recovery processing cannot recover the redundancy.
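The lost write scenario of FIGS. 2A and 2B can be reproduced with the following minimal sketch, given for illustration only. All function names are hypothetical, the disks are modeled as in-memory lists, and the check code is assumed to be a CRC-32 covering the logical block address, which is why the misdirected data D11 fail the check at the address of D3 while the stale data D1 still pass.

```python
import zlib
from typing import List, Tuple


def make_block(lba: int, payload: bytes) -> bytes:
    """Payload plus an assumed 4-byte CRC-32 check code over (LBA, payload)."""
    crc = zlib.crc32(lba.to_bytes(4, "big") + payload)
    return payload + crc.to_bytes(4, "big")


def block_is_valid(lba: int, block: bytes) -> bool:
    payload, stored = block[:-4], block[-4:]
    return zlib.crc32(lba.to_bytes(4, "big") + payload).to_bytes(4, "big") == stored


def patrol_and_recover(primary: List[bytes], secondary: List[bytes]) -> List[int]:
    """Conventional flow: rewrite each block that fails its check code
    with the block at the same LBA on the primary; return the repaired LBAs."""
    repaired = []
    for lba, block in enumerate(secondary):
        if not block_is_valid(lba, block):
            secondary[lba] = primary[lba]
            repaired.append(lba)
    return repaired


def simulate_lost_write() -> Tuple[List[int], List[int]]:
    # Before the failure: D1, D2, D3 mirrored at LBA 0, 1, 2.
    primary = [make_block(i, d) for i, d in enumerate([b"D1", b"D2", b"D3"])]
    secondary = list(primary)

    # Host rewrites D1 with D11 (FIG. 2A).
    new_block = make_block(0, b"D11")
    primary[0] = new_block    # primary is rewritten normally
    secondary[2] = new_block  # error: D11 lands on the address of D3

    # Patrol detects the check code abnormality and restores D3 (FIG. 2B).
    repaired = patrol_and_recover(primary, secondary)

    # Addresses at which the mirror still disagrees after recovery.
    mismatched = [i for i in range(len(primary)) if primary[i] != secondary[i]]
    return repaired, mismatched
```

Calling `simulate_lost_write()` returns `([2], [0])` under these assumptions: the block at the address of D3 is repaired, but the address that should hold D11 still holds valid-looking old data D1, so a check-code-based patrol alone does not restore redundancy there.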