Hitherto, a disk array apparatus provided with a plurality of disks has been used as a data input and output apparatus. In such a disk array apparatus, the disks are made redundant, so that data is not lost even if up to a predetermined number of disks fail. Access paths to the disks in the disk array apparatus are also made redundant, so that the disks remain accessible even if up to a predetermined number of access paths are disabled.
The redundancy of access paths will now be described. The disk array apparatus includes a control module that controls various processes, such as a read process. The control module is connected to a disk through a device adapter (hereinafter, “DA”) that controls an access path to the disk. Specifically, when the control module is connected to the disks through a plurality of DAs, the access paths to the disks are redundant. Upon receiving an access request from an upper apparatus, the control module selects one of the access paths and accesses the disk through it.
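The selection of an access path among redundant DAs may be sketched as follows. This is a minimal illustration only, not the selection logic of any particular apparatus; the class `DeviceAdapter`, the function `select_access_path`, and the "first enabled path" policy are assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class DeviceAdapter:
    """Hypothetical model of a DA that controls one access path to a disk."""
    name: str
    enabled: bool = True


def select_access_path(adapters: Sequence[DeviceAdapter]) -> Optional[DeviceAdapter]:
    """Return the first enabled DA, or None if every access path is disabled."""
    for adapter in adapters:
        if adapter.enabled:
            return adapter
    return None


# Two redundant paths to the same disk; the first path is disabled.
paths = [DeviceAdapter("DA0"), DeviceAdapter("DA1")]
paths[0].enabled = False
chosen = select_access_path(paths)
```

In this sketch the disk remains reachable through `DA1` after `DA0` is disabled, and access fails only when no enabled path remains.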
As the number of disks provided for the disk array apparatus increases, the number of DAs increases in proportion thereto. In particular, as the maximum storage capacity of such a disk array apparatus has increased to the order of petabytes, the number of disks has increased, and the number of DAs has increased accordingly. The increase in the number of DAs increases the risk of DA failure. Typically, DAs provided for the disk array apparatus are periodically subjected to status monitoring, e.g., an operation check. The statuses of the DAs are monitored at all times so that a DA in an abnormal condition is detected early and restored to a normal status.
Recently, techniques have been proposed that determine whether or not command processing time exceeds a predetermined threshold value and, based on the determination, degrade (reduce) components constituting the disk array apparatus.
See, for example, Japanese Laid-Open Patent Publication Nos. 2004-252692 and 2000-89984.
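A threshold determination of this kind may be sketched as follows. The sketch is a generic illustration under stated assumptions and does not reproduce the method of either cited publication; the function `components_to_degrade`, the component names, the processing times, and the threshold value of 1.0 second are all hypothetical.

```python
from typing import Dict, List


def components_to_degrade(processing_times: Dict[str, float],
                          threshold_s: float) -> List[str]:
    """Return the names of components whose command processing time
    (in seconds) exceeds the predetermined threshold value."""
    return sorted(name for name, elapsed in processing_times.items()
                  if elapsed > threshold_s)


# Hypothetical measured command processing times, in seconds.
observed = {"DA0": 0.02, "DA1": 0.45, "DA2": 5.0}
flagged = components_to_degrade(observed, threshold_s=1.0)
```

Here only `DA2` exceeds the threshold and becomes a candidate for degrading. `DA1` takes far longer than `DA0` but remains below the threshold, so it is not flagged.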
According to the above-described techniques, however, a performance failure may not be detected appropriately. Specifically, although an operation failure or a process failure may be detected according to the above-described techniques, a performance failure, e.g., an abnormal performance value of a DA, may not be detected. The reason is as follows. A DA having a performance failure performs below its normal level, but it may continue to operate at a level that is not recognized as an abnormal condition. When such a DA operates without being recognized as failed, data access is delayed by the performance failure, and a process that should be completed within a prescribed period may not be finished within the period. That is, the system may not operate as intended. It is therefore important to detect a DA having a performance failure.
Even when the above-described techniques are used to degrade DAs serving as components based on whether or not command processing time exceeds the threshold value, a DA performance failure may not be detected appropriately. Specifically, whether a DA has a performance failure should properly be determined in consideration of other factors as well, for example, whether the DA is temporarily under high load due to an external factor. However, making such a determination is complicated and difficult. It is therefore difficult to appropriately detect a DA having a performance failure using the above-described techniques.