A conventional disk array apparatus monitors both disks and routes for abnormalities by statistical score addition processing, and degenerates any disk or route that is determined to be abnormal (see for example patent document 1).
FIG. 1 illustrates an exemplary configuration of a conventional disk array apparatus. A disk array apparatus 101 in FIG. 1 includes a controller module (CM) 111, a CM 112, and a disk enclosure (DE) 113. The DE 113 includes an input-output module (IOM) 121, an IOM 122, and disks 131-133.
The disks 131-133 are, for example, hard disk drives (HDDs). The CMs 111 and 112 and the IOMs 121 and 122 are route components included in routes from the CM 111 or 112 to the disks 131-133.
The CMs 111 and 112 are each connected to the IOMs 121 and 122. The IOMs 121 and 122 are each connected to the disks 131-133. The CMs 111 and 112 access the disks 131-133 via the IOM 121 or 122 and write or read data.
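The connections described above can be sketched as a small route enumeration. This is only an illustration under stated assumptions: the string labels and the function name `routes_to` are hypothetical, and a route is modeled simply as a (CM, IOM, disk) triple.

```python
from itertools import product

# Components of FIG. 1, labeled by their reference numerals.
# The string labels themselves are illustrative assumptions.
CMS = ["CM 111", "CM 112"]
IOMS = ["IOM 121", "IOM 122"]
DISKS = ["disk 131", "disk 132", "disk 133"]


def routes_to(disk):
    """Return every (CM, IOM, disk) route that reaches the given disk.

    Each CM is connected to both IOMs and each IOM is connected to
    every disk, so each CM/IOM pair forms one route to the disk.
    """
    if disk not in DISKS:
        raise ValueError(f"unknown disk: {disk}")
    return [(cm, iom, disk) for cm, iom in product(CMS, IOMS)]


if __name__ == "__main__":
    for route in routes_to("disk 131"):
        print(" -> ".join(route))
```

Under this model each disk is reachable over four routes (two CMs times two IOMs), which is why a single route component can be disconnected while the disks remain accessible.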
FIG. 2 illustrates an exemplary statistical information table held by the CM 111 of the disk array apparatus depicted in FIG. 1. The statistics in the statistical information table depicted in FIG. 2 indicate error occurrence statuses of individual monitoring-target components.
CM#0 and CM#1 are identification information of the CMs 111 and 112, respectively, and IOM#0 and IOM#1 are identification information of the IOMs 121 and 122, respectively. DISK#0 to DISK#2 are identification information of the disks 131 to 133, respectively.
When access to any of the disks 131-133 fails, the CM 111 adds a value that depends on the error factor to the statistic of the corresponding monitoring target. When the statistic exceeds a threshold within a monitoring period, the CM 111 determines that a disk fault or route fault has occurred and degenerates the corresponding monitoring target. Degeneration logically disconnects the monitoring-target component from the disk array apparatus, so that the component is no longer used.
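The score addition processing described above can be sketched as follows. This is a minimal illustration, not the implementation of patent document 1: the error factor names, the additional values, the threshold, the class name `StatisticsTable`, and the resetting of statistics at the end of a monitoring period are all assumptions, since the description gives no concrete values.

```python
from dataclasses import dataclass, field

# Hypothetical additional values per error factor (not from the source).
ADDITIONAL_VALUE = {"timeout": 40, "crc_error": 20, "media_error": 60}

# Hypothetical threshold for the statistic within one monitoring period.
THRESHOLD = 100

# Monitoring targets identified as in FIG. 2.
TARGETS = ["CM#0", "CM#1", "IOM#0", "IOM#1",
           "DISK#0", "DISK#1", "DISK#2"]


@dataclass
class StatisticsTable:
    """Statistical information table holding one statistic per target."""
    statistics: dict = field(default_factory=lambda: {t: 0 for t in TARGETS})
    degenerated: set = field(default_factory=set)

    def record_error(self, target, error_factor):
        """Add the value for error_factor to the target's statistic.

        Returns True if this error pushes the statistic above the
        threshold, in which case the target is degenerated.
        """
        if target in self.degenerated:
            return False  # already logically disconnected
        self.statistics[target] += ADDITIONAL_VALUE[error_factor]
        if self.statistics[target] > THRESHOLD:
            self.degenerated.add(target)
            return True
        return False

    def end_monitoring_period(self):
        """Start a new monitoring period (assumed to clear all statistics)."""
        for target in self.statistics:
            self.statistics[target] = 0


if __name__ == "__main__":
    table = StatisticsTable()
    table.record_error("DISK#0", "media_error")
    if table.record_error("DISK#0", "media_error"):
        print("DISK#0 degenerated")
```

In this sketch the caller decides which monitoring target an error is attributed to; how a failed access is mapped to a disk or to a route component is outside what the sketch covers.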
An apparatus is known that predicts a failure in a storage device (see for example patent document 2). That apparatus associates a technology descriptor with the storage device, sets a predictive failure threshold for the storage device, and detects a storage device error that exceeds the predictive failure threshold.
A method is also known for maintaining data integrity in storage devices (see for example patent document 3). That method detects impending data errors such as a track squeeze problem in a magnetic disk drive, and repairs the problem by rewriting the affected tracks.
Patent document 1: Japanese Laid-open Patent Publication No. 2009-205316
Patent document 2: Japanese Laid-open Patent Publication No. 2007-200301
Patent document 3: Japanese Laid-open Patent Publication No. 2005-322399