In general, in an information processing system in which a plurality of information processing devices are connected through a network that transfers data, when higher-level software, such as application software executed on each information processing device, requests a data transfer, the requested data is divided into a plurality of packets and transferred.
When data is transferred as a plurality of packets and the last packet reaches the destination node, the destination node recognizes, from additional information included in the last packet, that the transfer of the data has been completed. The destination node then notifies the higher-level software that one data transfer has been completed.
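The division into packets and the completion notification described above can be illustrated by the following minimal sketch. The names `Packet`, `is_last`, `split_into_packets`, and `Receiver` are hypothetical and introduced only for illustration; in particular, the `is_last` flag is an assumed stand-in for the additional information included in the last packet, whose actual form is not specified here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    payload: bytes
    is_last: bool  # assumed form of the "additional information" in the last packet


def split_into_packets(data: bytes, max_payload: int) -> List[Packet]:
    """Divide data into packets of at most max_payload bytes and flag the final one."""
    chunks = [data[i:i + max_payload] for i in range(0, len(data), max_payload)] or [b""]
    return [Packet(chunk, index == len(chunks) - 1) for index, chunk in enumerate(chunks)]


class Receiver:
    """Destination node that treats arrival of the last packet as completion."""

    def __init__(self) -> None:
        self.buffer = bytearray()
        self.completed = False

    def on_packet(self, packet: Packet) -> None:
        self.buffer += packet.payload
        if packet.is_last:
            # Stands in for notifying the higher-level software.
            self.completed = True


packets = split_into_packets(b"abcdefghij", max_payload=4)
receiver = Receiver()
for packet in packets:
    receiver.on_packet(packet)
```

In this sketch the receiver judges completion solely from the flag on the final packet and does not check whether the earlier packets arrived.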
Additionally, in large-scale information processing systems, notably parallel computers such as supercomputers, remote direct memory access (RDMA) is adopted in many cases. RDMA refers to a function of directly transferring data in the memory of one compute node into the memory of another compute node by using two network controllers. Using RDMA makes it possible to achieve communication with high throughput and low latency. In particular, RDMA protocol communication over Ethernet (registered trademark), which is a network used for transmission control protocol/Internet protocol (TCP/IP), has become available in recent years. For this reason, an increasing number of systems have been adopting RDMA.
When RDMA is adopted, guaranteed delivery of packets is often implemented by a scheme in which a packet is resent at the link level of the network. In this case, because the delivery of packets is guaranteed within the network, no measure is provided between the transmitting and receiving nodes for detecting and retransmitting a packet that has been discarded, for example, because of a bit error.
One exception to guaranteed delivery at the link level of a network is the case where a link-down has occurred through failure of hardware. Generally, for a link-down that occurs through failure of hardware, the period of time taken until normal operation is restored is not guaranteed. For this reason, when a link-down has occurred, an information processing system discards any packet that is about to pass through the point where the link-down has occurred, in order to keep the packet from remaining in the network. Such a link-down error is detected as a hardware error of the network by a device monitor system disposed external to the information processing system.
In some cases, however, a link-down occurs through temporary failure of hardware and a link-up is established immediately thereafter, enabling data transfer to be resumed. When such an event occurs in the first half or in the middle of a data transfer that transfers a series of packets, packets in the second half, including the final packet of the series, may be delivered to the destination node even though packets in the first half or in the middle of the data transfer have been discarded. In this case, upon receipt of the final packet, the destination node notifies the higher-level software of completion of the data transfer. In effect, the transferred data is corrupted. If the subsequent processing then proceeds, a change in the content of a file system, for example, will be committed. As a result, the change cannot be reverted, and an erroneous operation that affects the processing that follows may be performed.
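The failure case described above can be reproduced with a short simulation. This is a sketch under stated assumptions: packets are modeled as hypothetical `(payload, is_last)` pairs, the link-down is modeled as a set of packet indices that are discarded, and the function names `deliver` and `receive` are introduced only for illustration.

```python
def deliver(packets, discarded_indices):
    """Return the packets that survive; those sent during the link-down are discarded."""
    return [p for i, p in enumerate(packets) if i not in discarded_indices]


def receive(packets):
    """Destination node: report completion as soon as the final packet arrives."""
    buffer = bytearray()
    completed = False
    for payload, is_last in packets:
        buffer += payload
        if is_last:
            completed = True  # completion is notified without checking earlier packets
    return bytes(buffer), completed


data = b"AAAABBBBCCCCDDDD"
# Four packets of 4 bytes each; only the last one carries is_last=True.
packets = [(data[i:i + 4], i + 4 >= len(data)) for i in range(0, len(data), 4)]

# A temporary link-down discards the two middle packets, then the link comes back up.
received, completed = receive(deliver(packets, discarded_indices={1, 2}))
```

Here `completed` is true although `received` holds only the first and last payloads, which corresponds to the situation in which completion is notified for corrupted data.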
To prevent such packet loss in data transfer, the following sequence has conventionally been performed. First, once a link-down occurs, the link-down state is maintained. Then, when the external device monitor system detects the link-down error, the information processing system notifies operation management software of the occurrence of the link-down error, so that all applications that may use the point where the link-down has occurred are terminated with an error. Next, the information processing system causes the link at the link-down point to be brought up again through the device monitor system. Thereafter, the information processing system re-executes the applications that use the point where the link-down has occurred.
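The order of the steps in the conventional sequence above can be sketched as follows. All names here (`Application`, `links_used`, `recover_from_link_down`, and the log strings) are hypothetical; the device monitor system and the operation management software are reduced to comments, so the sketch shows only the ordering of the steps, not an actual management interface.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Application:
    name: str
    links_used: Set[str]
    state: str = "running"


def recover_from_link_down(link: str, link_up: Dict[str, bool],
                           apps: List[Application]) -> List[str]:
    """Apply the conventional sequence to one failed link and return the ordered steps."""
    log = []
    # 1. Once the link-down occurs, the link-down state is maintained.
    link_up[link] = False
    log.append(f"hold {link} down")
    # 2. The monitor detects the error and operation management software is notified;
    #    every application that may use the failed point is terminated with an error.
    affected = [app for app in apps if link in app.links_used]
    for app in affected:
        app.state = "terminated"
        log.append(f"terminate {app.name}")
    # 3. The link is brought up again through the device monitor system.
    link_up[link] = True
    log.append(f"bring {link} up")
    # 4. The terminated applications are executed again.
    for app in affected:
        app.state = "running"
        log.append(f"re-execute {app.name}")
    return log


apps = [Application("job_a", {"link_x"}), Application("job_b", {"link_y"})]
link_state = {"link_x": True, "link_y": True}
steps = recover_from_link_down("link_x", link_state, apps)
```

Only the application that uses the failed link is terminated and re-executed in this sketch; the cost of the conventional sequence lies in that forced termination and re-execution.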
Additionally, as a communication technology using packets, there is a conventional technology that assigns sequence numbers to packets and detects a packet loss by finding a missing sequence number. Examples of documents of the related art include Japanese Laid-open Patent Publication No. 2007-208635.
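Detection of a packet loss from a missing sequence number, as mentioned above, can be illustrated generically as follows. This is a minimal sketch that assumes sequence numbers start at 0 and that the total packet count is known to the receiver; the function name `find_missing_sequence_numbers` is hypothetical, and the sketch is not taken from the cited publication.

```python
from typing import Iterable, List


def find_missing_sequence_numbers(received: Iterable[int], expected_count: int) -> List[int]:
    """Return the sequence numbers in 0..expected_count-1 that never arrived."""
    seen = set(received)
    return [number for number in range(expected_count) if number not in seen]


# Packets 1 and 2 of a five-packet transfer were discarded in transit.
missing = find_missing_sequence_numbers([0, 3, 4], expected_count=5)
packet_loss_detected = bool(missing)
```

With such a check, arrival of the final packet alone would not be treated as successful completion, because the omitted numbers reveal the loss.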