In a storage device, a plurality of Controller Modules (CMs) are connected to each other via a Peripheral Component Interconnect (PCI) Express bus. To duplicate data and to exchange control information, inter-CM communication is performed between the plurality of CMs. The inter-CM communication may fail depending on the state of the communication paths. In that situation, in order to properly perform a recovery process, the CM provided on the communication starting side needs to determine correctly, for each command, whether the communication was performed normally. The CM is able to make this determination by checking whether its DMA controller terminated normally or by checking whether its switch has properly made the transfer to the outside.
First, abnormality detection in the communication paths between CMs will be explained with reference to FIG. 8. FIG. 8 is a drawing for explaining the abnormality detection in the communication paths between the CMs. In the example illustrated in FIG. 8, it is assumed that a CM 0 is the CM provided on the communication starting side. The CM 0 includes a memory, a Direct Memory Access (DMA) controller, a Central Processing Unit (CPU), and a switch. The switch may be a PCI Express switch, for example.
To begin, the DMA controller instructs the switch to perform a write transfer of the data stored in the memory to a memory of another CM (step S101). Upon receiving the instruction, the switch transmits a response indicating normal (hereinafter, a “normal response”) to the DMA controller (step S102). The normal response guarantees only that the request addressed to the other CM has been received; it does not guarantee that the other CM has completed the writing process into its memory. The DMA controller receives the normal response and notifies the CPU of a normal termination interrupt (step S103).
Having received the instruction, the switch then performs the write transfer to the memory of the other CM (step S104). Let us assume here that the write transfer fails. In that situation, the switch transmits a response indicating that the write transfer was abnormal (step S105). The DMA controller, however, is not able to recognize that the write transfer was abnormal.
Upon being notified of the normal termination interrupt, the CPU reads, from each of the devices on the paths in the switch, information indicating whether any abnormality has occurred in the paths between the CMs (step S106). Even if the CPU starts the reading process at a point in time when the write transfer to the memory of the other CM has not yet been completed, the information that the CPU reads reflects the result of the completed write transfer. The reason is that the PCI Express switch is configured to perform the reading process only after the preceding writing process into the memory has been performed.
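The sequence of steps S101 to S106 may be easier to follow as a small simulation. The following Python sketch is an illustrative model only, not the implementation of the storage device: the names `Switch`, `accept_write`, `forward_pending`, `read_error_status`, `link_ok`, and `inter_cm_write` are hypothetical, and the read-after-write ordering of the PCI Express switch is modeled simply by draining the queued writes before the status is returned.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Switch:
    """Toy stand-in for the switch in CM 0 (hypothetical model)."""
    link_ok: bool = True            # False simulates a failed path to the other CM
    pending: deque = field(default_factory=deque)  # accepted, not yet forwarded
    error_logged: bool = False      # abnormality information held in the switch

    def accept_write(self, data: bytes) -> str:
        # S101/S102: the request is queued and a normal response is returned
        # at once, before the transfer to the other CM has taken place.
        self.pending.append(data)
        return "normal"

    def forward_pending(self, remote_memory: list) -> None:
        # S104/S105: the actual transfer; a failure is only recorded here
        # and is never seen by the DMA controller.
        while self.pending:
            data = self.pending.popleft()
            if self.link_ok:
                remote_memory.append(data)
            else:
                self.error_logged = True

    def read_error_status(self, remote_memory: list) -> bool:
        # S106: the read is ordered behind earlier writes, so the queued
        # writes are completed (or fail) before the status is returned.
        self.forward_pending(remote_memory)
        return self.error_logged


def inter_cm_write(switch: Switch, remote_memory: list, data: bytes):
    response = switch.accept_write(data)                      # S101, S102
    interrupt = ("normal termination" if response == "normal"
                 else "abnormal termination")                 # S103
    path_abnormal = switch.read_error_status(remote_memory)   # S106 (CPU)
    return interrupt, path_abnormal
```

With `Switch(link_ok=False)`, `inter_cm_write` returns `("normal termination", True)`: the DMA controller reports a normal termination, and only the read performed by the CPU reveals the failed transfer.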
Patent Document 1: Japanese Laid-open Patent Publication No. 2012-133405
Patent Document 2: Japanese Laid-open Patent Publication No. 2012-48712
However, a problem arises in that, when the write transfer to the memory of the other CM is performed, the overall performance of the CM is degraded. For example, because the CPU reads the information indicating whether an abnormality has occurred in the paths in the switch, the CPU is occupied during the reading process and is not able to perform any other processes. As a result, the overall performance of the CM is degraded.