1. Field of the Invention
The invention relates to a device array system in which logic groups having redundancy are constructed from a plurality of input/output devices, such as disk drives, so that data can be recovered when a device has failed. More particularly, the invention relates to a disk array apparatus which, upon a device failure, fetches a failure alternating device from another logic group having a higher redundancy, thereby recovering the redundancy.
2. Description of the Related Art
In the related art, the RAID (Redundant Array of Inexpensive or Independent Disk Drives) levels, disclosed in a paper by David A. Patterson et al. of the University of California at Berkeley in 1987, describe disk array systems in which a large amount of data is accessed at high speed through many input/output devices, such as disk drives, and which provide redundancy of the data when a device has failed.
RAID is explained in The RAIDBook, A Source Book for RAID Technology, Edition 1-1, published by the RAID Advisory Board, St. Peter, Minn., Nov. 18, 1993.
The RAID levels are classified into levels RAID-1 to RAID-5. RAID-1 relates to a mirrored disk in which the same data is stored in two disk drives. In RAID-2, data and a Hamming code for recovering the data of a failed drive are stored in several disk drives. In RAID-3 to RAID-5, parities are stored in order to recover data when a disk drive has failed.
According to RAID-4 and RAID-5, a plurality of disk drives can be accessed simultaneously and data read therefrom. In RAID-5, further, since the disk drive storing the parity is not fixed, data can also be written simultaneously to a plurality of disk drives, the effect of which is apparent when a large number of transactions are processed. In the above-mentioned RAID systems, when one of the disk drives has failed and the array continues to be used in that state, a problem arises in that data is lost if another disk drive fails.
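The benefit of a non-fixed parity drive can be illustrated with a minimal sketch. The rotation rule below (parity starting on the last drive and moving one drive to the left on each successive stripe) is an assumption chosen for illustration; it is only one of several possible layouts, and the function names are hypothetical.

```python
def parity_drive(stripe, num_drives):
    """Return the index of the drive holding parity for a stripe.

    Assumed layout: parity starts on the last drive and shifts one
    drive to the left on each successive stripe.
    """
    return (num_drives - 1 - stripe) % num_drives


def writes_can_overlap(stripe_a, data_a, stripe_b, data_b, num_drives):
    """Check whether two small writes touch disjoint sets of drives.

    Each small write is assumed to touch one data drive plus the
    parity drive of its stripe.
    """
    drives_a = {data_a, parity_drive(stripe_a, num_drives)}
    drives_b = {data_b, parity_drive(stripe_b, num_drives)}
    return drives_a.isdisjoint(drives_b)
```

With five drives under this assumed layout, a write to drive 0 in stripe 0 and a write to drive 1 in stripe 1 use different parity drives and can therefore proceed at the same time. With a fixed parity drive, as in RAID-4, both writes would have to update the same parity drive.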
Therefore, when one disk drive has failed, the data stored on the failed disk drive is immediately reconstructed by a data reconstructing function based on the redundancy of RAID and is stored on a newly added normal (or functioning) disk drive, so that the redundancy of the data is restored. For this purpose, a disk drive called a hot spare, which is used for data correction at the time of failure, is installed in the array of the disk array apparatus. A hot spare is a disk drive which is powered up and ready to replace a failed disk drive without powering down the disk array system.
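For the parity-based levels, the reconstruction described above can be sketched as follows. The sketch assumes byte-wise exclusive-OR (XOR) parity, which the text does not state explicitly, and models each drive as a list of equal-length blocks, one per stripe. The function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_onto_spare(drives, failed):
    """Reconstruct every block of the failed drive from the survivors.

    drives: list of drives, each a list of bytes blocks (one per stripe).
    failed: index of the failed drive, whose contents are ignored.
    Returns the list of blocks to be written to the hot spare.
    """
    survivors = [d for i, d in enumerate(drives) if i != failed]
    # For each stripe, XOR the surviving blocks (data and parity alike).
    return [xor_blocks(stripe) for stripe in zip(*survivors)]
```

Because the XOR of all blocks in a stripe, parity included, is zero, the same routine recovers the lost block whether the failed drive held data or parity for that stripe.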
However, when a hot spare is provided in the disk array system, it is always necessary to monitor whether the hot spare is functioning normally, and a hot spare patrol is provided for this purpose.
According to the hot spare patrol, a verifying command is generated and transmitted to the hot spare, for example, once an hour, and a verifying process of the entire surface of the disk of the hot spare is executed, for instance, once a month. Since the writing and reading circuits of the hot spare cannot be verified using the verifying command alone, reading and writing operations of a plurality of data blocks are also executed on the hot spare.
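The patrol schedule described above can be sketched as a simple function of elapsed hours. The intervals follow the examples in the text (hourly verifying command, monthly full-surface verification). The 30-day month, the action names, and the choice to run the read/write test together with each hourly verifying command are assumptions made for illustration only.

```python
HOURS_PER_MONTH = 30 * 24  # assumption: a month is treated as 30 days


def patrol_actions(hour):
    """Return the patrol actions due at a given elapsed hour (hour >= 1).

    Action names are hypothetical labels, not commands of a real drive
    interface.
    """
    # Assumed: the read/write test accompanies each hourly verify command.
    actions = ["verify_command", "read_write_test_blocks"]
    if hour % HOURS_PER_MONTH == 0:
        actions.append("full_surface_verify")
    return actions
```

Each action returned here results in an access to the hot spare that is issued independently of host accesses.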
The above-mentioned hot spare patrol is, however, performed asynchronously with, and transparently to, the accessing operations of a host computer. An access to the hot spare may therefore collide with a host access to another disk drive connected to the same port. When such a collision occurs, the host access is made to wait until the access to the hot spare is finished, which deteriorates accessing performance.