1. Field of the Invention
The present invention relates to a technology for recovering from a failure in a recording apparatus having a function of detecting bad sectors by executing a read test on each sector.
2. Description of the Related Art
A magnetic disk apparatus having a diagnostic/monitoring function called SMART (Self Monitoring Analysis And Reporting Technology) has been known (see, for example, Japanese Patent Application Laid-Open No. 2003-233511). The SMART function obtains, on a steady basis, an error frequency indicating the number of errors that occurred in a predetermined time, decides that a failure has occurred if the obtained value exceeds a threshold, and makes a report.
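The threshold decision described above can be illustrated with a minimal sketch. It is not the SMART implementation of any actual drive: the class name `ErrorFrequencyMonitor`, the sliding-window bookkeeping, and the parameters `window_seconds` and `threshold` are all assumptions introduced only for illustration.

```python
from collections import deque


class ErrorFrequencyMonitor:
    """Hypothetical monitor: counts errors in a time window and checks a threshold."""

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds  # the "predetermined time" (assumed unit: seconds)
        self.threshold = threshold            # failure is reported above this count
        self._timestamps = deque()            # times at which errors were observed

    def record_error(self, now):
        """Register one error observed at time `now`."""
        self._timestamps.append(now)

    def error_frequency(self, now):
        """Return the number of errors that fall inside the window ending at `now`."""
        # Discard errors that are older than the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps)

    def failure_reported(self, now):
        """Decide that a failure has occurred if the frequency exceeds the threshold."""
        return self.error_frequency(now) > self.threshold
```

For example, with a 60-second window and a threshold of 2, three errors recorded at times 0, 10, and 20 would cause `failure_reported(30)` to return `True`, whereas at time 75 only the error at time 20 remains in the window and no failure is reported.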
Use of the SMART function makes it possible to predict a fatal failure (unrecoverable failure) that may occur in the magnetic disk apparatus, and thus to implement preventive measures such as backing up the magnetic disk apparatus.
The SMART function is also capable of executing a so-called “self-test”, in which all the sectors are read and any detected bad sectors are recorded in a pending list (bad sector list), during idle periods when there is little disk access.
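The self-test described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `run_self_test`, `make_fake_reader`, and the use of `IOError` to signal an unreadable sector are hypothetical names and conventions, not part of any actual drive firmware or interface.

```python
def run_self_test(read_sector, sector_count):
    """Read every sector once and return the pending list (bad sector list).

    `read_sector` is an assumed callable that raises IOError when the
    given sector cannot be read.
    """
    pending_list = []
    for sector in range(sector_count):
        try:
            read_sector(sector)
        except IOError:
            # The sector could not be read, so record it as a bad sector.
            pending_list.append(sector)
    return pending_list


def make_fake_reader(bad_sectors):
    """Build a stand-in reader that fails on the given sector numbers."""
    def read_sector(sector):
        if sector in bad_sectors:
            raise IOError("sector %d is unreadable" % sector)
        return b"\x00" * 512  # assumed 512-byte sector payload
    return read_sector


# Usage: a simulated 10-sector disk in which sectors 3 and 7 fail to read.
pending = run_self_test(make_fake_reader({3, 7}), sector_count=10)
```

Note that a list built this way records only that a read failed. It carries no information about why the read failed.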
However, it is difficult to determine whether the magnetic disk apparatus is faulty based on the various information obtained by the SMART function and the result of the self-test. Namely, among the sectors recorded in the bad sector list by the self-test, there are sectors that are only temporarily unreadable due to vibration of the apparatus or mechanical wobbling. On the other hand, there are also sectors in the bad sector list that are permanently unusable due to damage to the recording medium (disk).
Therefore, it is difficult for the side that receives a report of bad sectors from the SMART function (for example, an operating system) to distinguish temporary failures from permanent failures. As a result, there has been a problem that the fault could not be recovered from appropriately. For instance, it has often been observed that, although a failure was once judged to be temporary, a permanent failure occurred during subsequent data access, and that, although the magnetic disk apparatus was replaced after a failure was judged to be permanent, the failure turned out to be temporary.
From this point of view, a major problem is how to achieve a failure recovering method that enables appropriate failure recovery using the results produced by the SMART function. This problem arises not only in a single magnetic disk apparatus but also in a disk array apparatus including a number of magnetic disk apparatuses.