In a parallel computer system, barrier synchronization has been proposed as a method of establishing synchronization among a plurality of processes that are executed in parallel across a plurality of nodes. In the barrier synchronization, a point at which the synchronization is established, namely a barrier point, is set in accordance with a progress phase (stage) of the processing in the process. When the processing in a process arrives at the barrier point, that process temporarily stops its own processing and waits for the processing in the processes in the other nodes to progress. When all the processes that are executed in parallel and take part in the barrier synchronization have arrived at the barrier point, each process leaves the waiting state and resumes the stopped processing. In this way, synchronization in the parallel processing can be established among the plurality of processes that are executed in parallel across the plurality of nodes.
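The waiting and resuming behavior described above can be illustrated with a minimal software sketch. It assumes threads within a single program stand in for processes on separate nodes, and it uses the standard Python `threading.Barrier`; the function name `run_barrier_demo` and the event log are hypothetical names introduced only for this illustration and are not part of any proposed apparatus.

```python
import threading


def run_barrier_demo(num_workers=4):
    """Run num_workers threads that each wait at one barrier point.

    Returns a log of ("arrive", rank) and ("resume", rank) events in the
    order in which they were recorded.
    """
    barrier = threading.Barrier(num_workers)  # the barrier point
    log_lock = threading.Lock()
    events = []

    def worker(rank):
        # Processing before the barrier point would go here.
        with log_lock:
            events.append(("arrive", rank))
        # Stop and wait until every participant has arrived.
        barrier.wait()
        # All participants have arrived; resume the stopped processing.
        with log_lock:
            events.append(("resume", rank))

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


if __name__ == "__main__":
    for kind, rank in run_barrier_demo(4):
        print(kind, rank)
```

In every run, all "arrive" events are recorded before the first "resume" event, although the order of ranks within each group may vary from run to run.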
In a barrier synchronization apparatus, when the barrier synchronization is executed, depending on the algorithm, the process needs to change, for each of the stages, the transmission destination of a signal or a message indicating the arrival at the barrier point (a barrier synchronization message). In view of the above, a barrier synchronization apparatus that realizes this transmission destination change processing by using hardware has been proposed. According to this barrier synchronization apparatus, the intermediation of a CPU (central processing unit) for each of the stages is eliminated, and a higher speed of the barrier synchronization can be realized. Furthermore, this barrier synchronization apparatus is provided with a synchronization unit for synchronizing plural sets of signals or messages. As a result, in a case where the nodes are connected by a network, the barrier synchronization can be executed at a high speed without limiting the configuration of the network between the plurality of nodes.
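As one example of an algorithm in which the transmission destination changes for each stage, the butterfly algorithm mentioned later in this document can be sketched as follows. The sketch assumes that the number of nodes is a power of two and that the destination at stage k is the node whose rank differs in bit k, which is the textbook form of the butterfly pattern and not necessarily the exact rule used by the proposed hardware. The function name `butterfly_destinations` is hypothetical.

```python
def butterfly_destinations(rank, num_nodes):
    """Return the destination rank for each stage of a butterfly barrier.

    Assumes num_nodes is a power of two. At stage k the destination is
    the rank obtained by flipping bit k of the sender's rank.
    """
    if num_nodes < 1 or num_nodes & (num_nodes - 1) != 0:
        raise ValueError("num_nodes must be a power of two")
    if not 0 <= rank < num_nodes:
        raise ValueError("rank out of range")
    destinations = []
    stage = 0
    while (1 << stage) < num_nodes:
        destinations.append(rank ^ (1 << stage))
        stage += 1
    return destinations


if __name__ == "__main__":
    for rank in range(8):
        print(rank, butterfly_destinations(rank, 8))
```

With eight nodes, each node exchanges barrier synchronization messages with three different destinations, one per stage, which is the per-stage destination change that the hardware described above performs without the intermediation of the CPU.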
It should be noted that the following configuration has also been proposed. An intra-node barrier synchronization mechanism detects, on the basis of a synchronization request from the CPU provided in its own apparatus, that the barrier synchronization in its own apparatus is established, and notifies all the node apparatuses that execute the parallel processing of information on the establishment of the barrier synchronization in its own apparatus. An inter-node barrier synchronization mechanism detects that the parallel processing is completed on the basis of the information on the establishment of the barrier synchronization in the other apparatuses, which is notified from the other node apparatuses that execute this parallel processing. In this configuration, the completion of the parallel processing is detected by transmitting and receiving the information on the establishment of the barrier synchronization in each apparatus, without complicating the barrier synchronization mechanism and without providing a special communication mechanism.
In a parallel computer system, a global clock is used for time synchronization among the plurality of nodes included in the entire system. To realize a global clock that is synchronized among the plurality of nodes, it is conceivable to use the barrier synchronization apparatus. That is, it is conceivable to realize the global clock, which establishes synchronization among the respective nodes serving as barrier synchronization apparatuses, by using a barrier synchronization apparatus based on a butterfly algorithm, in which a high speed of the barrier synchronization is realized without the intermediation of the CPU for each of the stages. However, in the barrier synchronization based on such a butterfly algorithm, the arrival timing of the synchronization messages from the respective processes fluctuates, and therefore the timing at which the synchronization is established fluctuates among the plurality of nodes. Because of this fluctuation, a phase difference is generated between the global clocks of the respective nodes. For this reason, in a case where the global clock is realized by using the barrier synchronization apparatus based on the butterfly algorithm, it is necessary to reduce the fluctuation in the establishment of the synchronization among the plurality of nodes.
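The way in which the timing of the establishment of the synchronization can differ from node to node may be illustrated with a toy timing model. The model is an assumption made only for this sketch: at each stage, a node completes the stage at the later of its own readiness time and the time at which its partner's message arrives, and the message delay is given by a hypothetical `latency` function. It is not a description of the actual apparatus.

```python
def butterfly_completion_times(arrival, latency):
    """Estimate when each node sees the butterfly barrier established.

    arrival[i] is the time at which node i reaches the barrier point.
    latency(stage, src, dst) is a hypothetical message delay model.
    The number of nodes is assumed to be a power of two.
    """
    n = len(arrival)
    if n < 1 or n & (n - 1) != 0:
        raise ValueError("number of nodes must be a power of two")
    ready = list(arrival)
    stage = 0
    while (1 << stage) < n:
        next_ready = []
        for rank in range(n):
            partner = rank ^ (1 << stage)
            # The partner sends as soon as it has finished the previous stage.
            message_time = ready[partner] + latency(stage, partner, rank)
            next_ready.append(max(ready[rank], message_time))
        ready = next_ready
        stage += 1
    return ready


if __name__ == "__main__":
    times = butterfly_completion_times([0, 0, 0, 5], lambda stage, src, dst: 1)
    print(times, "spread:", max(times) - min(times))
```

In this model, with four nodes, a constant delay of 1, and the last node arriving at time 5, the nodes observe the establishment of the synchronization at times 7, 6, 6, and 5, respectively. The spread of 2 between the earliest and the latest node corresponds to the kind of phase difference between the global clocks that is discussed above.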
The present invention provides a parallel computer system in which, in the barrier synchronization, the fluctuation in the establishment of the synchronization among the plurality of nodes is reduced.
Related-art techniques concerning a parallel computer system, a synchronization apparatus, and a control method for the parallel computer system are disclosed in the following documents.
[Patent Document 1] Japanese Laid-open Patent Publication No. 2010-122848
[Patent Document 2] Japanese Laid-open Patent Publication No. 2001-051966