A large-scale computation on a computer system, such as a scientific computation, is sometimes performed as a parallel computation using a plurality of computers. A computer system that can perform a parallel computation is called a parallel computer. Each of the plurality of computers that perform the parallel computation is called a computation node device.
A reduction operation, which operates on data held by a plurality of processes, is performed among a plurality of computation node devices that are performing a parallel computation. Examples of a reduction operation include obtaining the sum of the data and obtaining the maximum or minimum value of the data.
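As an illustration only, the following minimal sketch shows what such a reduction computes. The sample values, the name `node_data`, and the helper `reduce_across_nodes` are hypothetical and do not come from the cited publications; each inner list stands for the data held by one computation node device.

```python
from functools import reduce

# Hypothetical sample data: one inner list per computation node device.
node_data = [
    [3, 9, 4],
    [7, 1, 8],
    [2, 6, 5],
]

def reduce_across_nodes(op, data):
    """Combine the i-th elements held by every node using the binary operator op."""
    # zip(*data) groups the i-th element of each node into one tuple.
    return [reduce(op, column) for column in zip(*data)]

total = reduce_across_nodes(lambda a, b: a + b, node_data)  # element-wise sum
maximum = reduce_across_nodes(max, node_data)               # element-wise maximum
minimum = reduce_across_nodes(min, node_data)               # element-wise minimum
```

In an actual parallel computer the per-node data would arrive in packets over the interconnect rather than sit in one list; the sketch only shows the arithmetic.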
A barrier synchronization device is known that includes a synchronization unit for synchronizing a plurality of sets of signals, thereby accelerating barrier synchronization among a plurality of nodes that perform parallel operations (see, for example, Japanese Laid-open Patent Publication No. 2010-122848).
A technique is also known in which an intermediate node forwards a cut-through data packet, so that transmission of the packet can start before a frame CRC check is performed on the packet (see, for example, Japanese National Publication of International Patent Application No. 2013-513269).
When a computation node device performs a reduction operation, it receives a packet from a different computation node device, performs an error check on the packet by using the checksum included in the packet, and, when no error is found, performs the reduction operation by using the data in the packet.
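The receive, check, and reduce sequence described above can be sketched as follows. This is a minimal sketch under stated assumptions: the packet layout (64-bit little-endian integers followed by a 4-byte trailer), the use of CRC-32 as the checksum, the sum as the reduction, and the names `make_packet` and `check_and_reduce` are all hypothetical choices made for the example, not details taken from the text.

```python
import struct
import zlib

def make_packet(values):
    """Build a hypothetical packet: payload of 64-bit integers plus a CRC-32 trailer."""
    payload = struct.pack(f"<{len(values)}q", *values)
    return payload + struct.pack("<I", zlib.crc32(payload))

def check_and_reduce(packet, local_values):
    """Verify the checksum of a fully received packet, then reduce (here: sum).

    Returns the reduced values, or None when the error check fails.
    """
    payload, trailer = packet[:-4], packet[-4:]
    (expected,) = struct.unpack("<I", trailer)
    # The check needs the whole payload, so it runs only after full reception.
    if zlib.crc32(payload) != expected:
        return None
    received = struct.unpack(f"<{len(payload) // 8}q", payload)
    return [a + b for a, b in zip(local_values, received)]

good = make_packet([1, 2, 3])
corrupted = bytes([good[0] ^ 0xFF]) + good[1:]  # flip the bits of the first byte

result_ok = check_and_reduce(good, [10, 20, 30])
result_bad = check_and_reduce(corrupted, [10, 20, 30])
```

Note that `check_and_reduce` takes the complete packet as its argument, which mirrors the ordering in the text: the reduction does not begin until the checksum has been verified.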
It is difficult for a computation node device to perform the error check before it has received the entire packet. The larger the packet, the longer the time between the start and the completion of reception, and hence the longer the time before the error check completes. As described above, the reduction operation is performed after the error check completes. Thus, a larger packet lengthens the waiting time between the start of reception of the packet and the start of the reduction operation.
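The dependence of the waiting time on packet size can be illustrated with a simple model. The sketch below assumes that reception time is the packet size divided by the link rate and ignores the time taken by the check itself; the 10 Gbit/s link rate, the packet sizes, and the function name `wait_before_reduction` are hypothetical values chosen only for the example.

```python
def wait_before_reduction(packet_bytes, link_bits_per_second):
    """Seconds from the start of reception until the reduction may start.

    Simplified model: the error check can finish only after the last byte
    arrives, so the wait is at least the time needed to receive the packet.
    """
    return packet_bytes * 8 / link_bits_per_second

LINK_RATE = 10e9  # assumed link rate in bits per second (hypothetical)

for size in (64, 512, 4096):
    wait_us = wait_before_reduction(size, LINK_RATE) * 1e6
    print(f"{size:5d} bytes -> {wait_us:.4f} us before the reduction can start")
```

Under this model the wait grows linearly with packet size, which is the relationship stated above.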