Conventionally, a storage system has been disclosed in which plural magnetic disk devices (a magnetic disk device will be hereinafter referred to as a “disk”) are mounted to accumulate a tremendous amount of data.
FIG. 1 illustrates a storage system 1000. The storage system illustrated in FIG. 1 aims to enhance redundancy of data by constructing RAID (Redundant Arrays of Independent (Inexpensive) Disks), thereby supplying a desired performance characteristic to a host.
The modules illustrated in FIG. 1 are described hereunder.
A CA (Channel Adapter) 200 is a module for controlling the interface to Host 202. When CA 200 receives a data Write/Read operation request from Host 202, CA 200 notifies CM 203 (Centralized Module) of a processing request, or directly accesses the Cache Memory on CM 203 to perform data transfer between CM 203 and Host 202.
CM is a module serving as the central core for the modules illustrated in FIG. 1. Firmware-mounted control modules such as a CACHE control module 201, a BackEnd control module 204, etc. exist in CM.
The CACHE control module 201 performs allocation management and overall control of the memory area in CM. The BackEnd control module has small modules for controlling the Fibre Channel Interface for communicating with disks, performing the I/O control of disks, controlling the RAID constructed from plural disks, etc. Each control may be executed on the basis of FCMAP, a management table. In FIG. 1, the module performing the Fibre Channel Interface control and the disk I/O control is represented as FC/Disk Driver 205.
Furthermore, an FC controller, e.g., QX4 for communicating with disks may be mounted in CM.
BRT (BackEnd Router) is a module having a fabric function of supplying a communication path between CM and DE (Drive Enclosure). BRT will be described with reference to FIG. 2.
BRT 206 has eight interface ports to CM, and thus can connect to eight CMs at maximum. It also has eight interface ports for connecting to DEs 207, and thus can connect to forty-eight DEs (8 buses×6 cascades) at maximum. Eight BRTs at maximum are mounted in this storage system. BRTs are always mounted in pairs to make the bus to DE redundant.
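The connectivity limits above follow directly from the port counts; a minimal sketch, using only the figures given in the description (the constant names are illustrative, not from the actual firmware):

```python
# Maximum connectivity of one BRT, per the port counts described above.
CM_PORTS = 8          # interface ports toward CMs
DE_BUSES = 8          # interface ports (buses) toward DEs
CASCADES_PER_BUS = 6  # DEs that can be cascade-connected per bus

max_cms = CM_PORTS
max_des = DE_BUSES * CASCADES_PER_BUS  # 8 buses x 6 cascades

print(max_cms)  # 8
print(max_des)  # 48
```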
The fabric is one system for connecting a target (a disk in this embodiment) and an initiator (CM in this embodiment) to each other, and it may be defined as a network using a fibre channel switch, or a network in which fibre channel switches are mutually connected. When the fabric function is mounted, the initiator and the target port need not be directly connected, and thus extensibility is excellent. Furthermore, when a command is transmitted, the transmission passes only through a device having the fabric function, and thus the fabric function provides higher command processing performance than an AL_PA system in which plural targets are connected in a loop style.
Returning to the description of the internal modules of the storage system shown in FIG. 1, DE is a device having disks mounted therein, and DE will be described with reference to FIG. 3. DE can be designed so that fifteen disks at maximum can be mounted per DE. DE has two PBCs 208 (Port Bypass Circuits) mounted thereon, and it can be cascade-connected to another DE through a PBC. In the following description, a DE connected directly to the initiator (BRT in this embodiment) will be referred to as “basic DE” 209, and a DE cascade-connected from the basic DE will be referred to as “extended DE” 210 as necessary.
Two PBCs may be mounted per DE, and each PBC has an FCC (Fibre Channel Controller) mounted therein and has the role of transferring an FC packet from CM to an indicated disk, or from an indicated disk to CM.
The related terms of the storage system will be described hereunder.
PLU is information representing the mount position of a disk in the system, which is managed in CM, and it is represented by “DE No.” and “Slot No.”.
Loop ID is identification information of each device connected to an FC loop, which is a loop constructed by fibre channels. Each device connected to an FC loop is allocated an address unique within the loop. This address is represented by AL_PA. AL_PA values are not serial numerical values, and thus they are difficult to handle in some cases. Accordingly, logically serial numerical values are allocated to the AL_PA values. These serial numerical values are the loop IDs.
In the storage system and the upstream modules, the position information of disks is handled as PLU. However, when a command is actually issued to a disk, the loop ID paired with the PLU, as indicated from FCMAP, is used, and the command is issued to the disk.
FCMAP is a tabled list of information on the disks connected to an FC loop. FCMAP is created when LIP (Loop Initialize Primitive) occurs on the FC loop. The following information may be stored in FCMAP.
        Loop ID corresponding to PLU
        Loop ID of FCC
        other information concerning the disks
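As a rough illustration, the PLU-to-loop-ID resolution through FCMAP might be modeled as follows; this is a hypothetical sketch, and the class, method, and field names are illustrative rather than taken from the actual firmware:

```python
# Hypothetical model of FCMAP-based address resolution.
# A PLU identifies a disk's mount position ("DE No." and "Slot No.");
# FCMAP pairs it with the loop ID actually used when issuing a command.
from dataclasses import dataclass


@dataclass(frozen=True)
class PLU:
    de_no: int
    slot_no: int


class FCMap:
    def __init__(self):
        self._plu_to_loop_id = {}  # loop ID corresponding to each PLU
        self.fcc_loop_ids = set()  # loop IDs of the FCCs on the loop

    def register_disk(self, plu, loop_id):
        self._plu_to_loop_id[plu] = loop_id

    def loop_id_for(self, plu):
        # Returns None when the disk's entry has vanished from FCMAP
        return self._plu_to_loop_id.get(plu)


fcmap = FCMap()
fcmap.register_disk(PLU(de_no=1, slot_no=3), loop_id=0x12)
print(fcmap.loop_id_for(PLU(1, 3)))  # 18 (0x12)
print(fcmap.loop_id_for(PLU(2, 0)))  # None (no such entry)
```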
LIP is initializing processing executed to recognize the identification information of a device connected to an FC loop so that the device becomes usable. A LIP start command is issued from the Loop Master (CM in this embodiment) onto the FC loop, whereby LIP is executed. LIP is executed as needed when the configuration is changed at each port (FC loop) in BRT.
The storage system discriminates the state of an FC loop as follows.
        Linkup
        LIP is completed, and thus this represents a state in which the FC loop concerned is usable.
        Under LIP
        LIP is being executed, and thus this represents a state in which the FC loop concerned is unusable. Furthermore, in the case where LIP is not completed even when a fixed time elapses from the start of the LIP processing (LIP Timeout), an abnormality of hardware or the like may have occurred, and thus this storage system sets the corresponding port to an unusable state.
        Linkdown
        This represents a state in which the FC loop is unusable because LIP has not yet been completed.
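The state discrimination above, including the LIP Timeout handling, can be sketched roughly as follows; the function, the timeout value, and the return shape are assumptions for illustration (the actual monitoring time is not given in the text):

```python
# Hypothetical sketch of the FC-loop state discrimination described above.
from enum import Enum


class LoopState(Enum):
    LINKUP = "Linkup"        # LIP completed; loop usable
    UNDER_LIP = "Under LIP"  # LIP in progress; loop unusable
    LINKDOWN = "Linkdown"    # LIP not yet completed; loop unusable


LIP_TIMEOUT = 30.0  # seconds; illustrative value, not from the text


def evaluate_loop(lip_started_at, lip_done, now):
    """Classify an FC loop; returns (state, usable).

    A port whose LIP exceeds the monitoring time (LIP Timeout)
    is set to an unusable state."""
    if lip_done:
        return LoopState.LINKUP, True
    if now - lip_started_at > LIP_TIMEOUT:
        return LoopState.LINKDOWN, False  # LIP Timeout: port made unusable
    return LoopState.UNDER_LIP, False     # still initializing


# LIP finished promptly -> Linkup, usable
print(evaluate_loop(0.0, True, 5.0))
# LIP still incomplete after the monitoring time -> Linkdown, unusable
print(evaluate_loop(0.0, False, 40.0))
```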
When LIP occurs simultaneously in plural FC loops connected to the same BRT, the completion of LIP is delayed due to the limit of the processing performance of BRT.
Furthermore, when power failure occurs in an extended DE, the storage system determines the occurrence of power failure under the following conditions.
(1) LIP of the DE in which power failure occurs is completed, and the state of the FC loop concerned is Linkup.
(2) When a disk mounted in the power-failure occurring DE is accessed, the loop ID of the disk concerned has vanished from FCMAP.
(3) The FCC mounted in the DE concerned has vanished from FCMAP.
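Conditions (1) to (3) above amount to a simple conjunction; a minimal sketch, with hypothetical parameter names standing in for the actual state the firmware would consult:

```python
# Hypothetical sketch of the power-failure determination for an extended DE,
# following conditions (1)-(3) above. All names are illustrative.
def extended_de_power_failure(loop_linkup: bool,
                              disk_in_fcmap: bool,
                              fcc_in_fcmap: bool) -> bool:
    """Return True when DE power failure is determined:
    (1) LIP of the DE completed and its FC loop is Linkup,
    (2) the accessed disk's loop ID has vanished from FCMAP,
    (3) the DE's FCC has vanished from FCMAP."""
    return loop_linkup and not disk_in_fcmap and not fcc_in_fcmap


print(extended_de_power_failure(True, False, False))   # True: power failure
print(extended_de_power_failure(False, False, False))  # False: loop not Linkup
print(extended_de_power_failure(True, True, True))     # False: entries present
```

Note that condition (1) is what breaks down in the failure scenario described next: when LIP times out and the loop never reaches Linkup, the conjunction can never become true.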
However, when LIP occurs at plural ports (FC loops) connected to the same BRT due to power failure of the extended DE, it takes time, due to the limit of the processing performance of BRT as described above, until the FC loop of the power-failure occurring DE is set to Linkup. As a result, LIP times out, so that DE power failure cannot be determined in some cases.
When power failure can be detected, for example, data on Cache could be temporarily evacuated into a disk (a disk in the basic DE) to thereby prevent data loss. However, when power failure cannot be detected, this action cannot be performed.
The trouble operation described above will be described with reference to FIG. 4. The basic DE and the extended DEs may be mounted in different racks, and the probability that extended DEs having the same number of cascade-connection stages are mounted in the same rack increases because of the wire cable length, although this varies in accordance with the arrangement status of the storage system. Furthermore, each rack-mounted device is supplied with power from a common power supply unit of the rack. Accordingly, when power failure occurs, it should be noted that power is interrupted rack by rack. That is, the power supply to the respective devices mounted in the same rack is interrupted at the same time. In the example of FIG. 4, it is assumed that the DEs grouped by a broken line 401 are mounted on one rack. Furthermore, it is assumed that the group indicated by a one-dot chain line 402 is one FC loop. Main units may be aggregated at one place to construct a system. When the system is constructed based on this policy, CM and the basic DE may be mounted in the same rack, and thus CM and the basic DE are supplied with power from a common power supply unit on the rack. As illustrated:
410: Power failure occurs in a rack in which an extended DE directly-connected to the basic DE is mounted.
420: Since the power failure of the DE extends over plural FC loops, LIP occurs in the plural FC loops connected to BRT. The completion of LIP is delayed due to the limit of the processing performance of BRT.
430: As a result of the delay of the LIP completion, expiry of the LIP monitoring time is detected and LIP Timeout is set, so that the bus is closed.
440: CM tries to issue a command to a disk mounted in the power-failure occurring DE. However, CM cannot determine that DE power failure has occurred, because the bus was closed in 430.
450: Since power failure cannot be detected, the data on Cache cannot be evacuated to the still-accessible basic DE, and thus data loss occurs (backup failure).