1. Field of the Invention
The invention relates generally to a storage apparatus and a method for controlling the storage apparatus, and is suitable for use in, for example, a storage apparatus in which SATA (Serial AT Attachment) disk drives are used as storage devices.
2. Description of Related Art
In recent years, demand for larger capacity and lower cost in storage apparatuses has been growing. Accordingly, many kinds of storage apparatuses that use inexpensive, large-capacity SATA disk drives as storage devices have recently come onto the market.
However, an SATA disk drive is less reliable and more likely to cause data loss than an FC (Fibre Channel) disk drive. This problem can be handled to some extent by using techniques such as correction read and correction copy disclosed in, e.g., JP11-191037 A.
Incidentally, the number of SATA disk drives provided in one storage apparatus has recently continued to increase, and accordingly, the frequency of replacement of SATA disk drives in which a fault has occurred has been increasing.
However, when a storage apparatus uses SATA disk drives or SAS disk drives as storage devices, replacing a faulty SATA or SAS disk drive takes a long time, because such a disk drive has a low data transfer speed and a large capacity.
In the case of a 750 GB product, for example, 40 hours are required for the so-called correction copy, in which the data in the faulty SATA or SAS disk drive is restored to a spare disk drive by using parity data, and another 40 hours are required for the so-called copy-back, in which the data stored in the spare disk drive is copied to the new replacement SATA or SAS disk drive. Therefore, replacing one SATA disk drive requires 80 hours in total.
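The figures above can be checked with a short calculation. The following is only an illustrative sketch: the function names are hypothetical, and the use of decimal gigabytes (1 GB = 10^9 bytes) is an assumption not stated in the text. It adds the two 40-hour phases and derives the average transfer rate implied by copying 750 GB in 40 hours.

```python
# Illustrative sketch only; names and the decimal-GB assumption are hypothetical.

def total_replacement_hours(restore_hours, copy_back_hours):
    """Total time for one drive replacement: restore to spare, then copy-back."""
    return restore_hours + copy_back_hours


def implied_rate_mb_per_s(capacity_gb, hours):
    """Average transfer rate (MB/s) implied by moving capacity_gb in the given hours.

    Assumes decimal units: 1 GB = 10**9 bytes, 1 MB = 10**6 bytes.
    """
    total_bytes = capacity_gb * 10**9
    seconds = hours * 3600
    return total_bytes / seconds / 10**6


total = total_replacement_hours(40, 40)   # both phases from the example above
rate = implied_rate_mb_per_s(750, 40)     # one phase of the 750 GB example

print(f"total replacement time: {total} h")
print(f"implied average rate per phase: {rate:.2f} MB/s")
```

Under these assumptions, the implied rate is only a few megabytes per second, which illustrates why the replacement work takes so long.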
Also, during the above time, no fault may occur in the other disk drives that constitute an ECC (Error Correction Code) group together with the faulty SATA or SAS disk drive. For example, if a fault occurs in another SATA disk drive in the ECC group during the recovery work, a double fault may result, which leads to an irreparable state. Note that, when data is divided and stored across disk drives by utilizing RAID (Redundant Array of Inexpensive Disks), the ECC group is the group of disk drives that store the divided data, the parity data calculated from the divided data, or both.
Moreover, during the above-described correction copy or copy-back, the load resulting from the fault is placed on the spare disk drive, on the other disk drives that constitute the ECC group together with the faulty SATA or SAS disk drive, and on the paths. Therefore, response performance to access may be reduced.
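The reason a second fault is irreparable can be illustrated with a minimal sketch. It assumes a single XOR parity block per stripe (as in RAID 5); the text does not specify the RAID level, and the function names and sample data are hypothetical. One missing block can be rebuilt from the survivors, whereas two missing blocks cannot.

```python
from functools import reduce

# Illustrative sketch only: assumes single XOR parity per stripe (RAID 5 style).
# Function names and sample data are hypothetical.

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(stripe):
    """Rebuild the one missing block (marked None) in a stripe.

    Returns (index, restored_block). Raises ValueError unless exactly one
    block is missing; two or more missing blocks is the double-fault case.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) != 1:
        raise ValueError(f"cannot rebuild: {len(missing)} blocks missing")
    survivors = [block for block in stripe if block is not None]
    return missing[0], xor_blocks(survivors)


# Three data blocks plus one parity block form one stripe of the group.
data = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
parity = xor_blocks(data)
stripe = data + [parity]

# Single fault: the drive holding block 1 fails and is rebuilt from the rest.
damaged = list(stripe)
damaged[1] = None
index, restored = rebuild_missing(damaged)
```

With a second `None` in the stripe, `rebuild_missing` raises `ValueError`, which corresponds to the double fault described above.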