Parallel processing apparatuses divide a task with a large amount of computation into a plurality of subtasks, each with a small amount of computation, and execute the subtasks in parallel on a plurality of computers (which may be called “computing nodes,” or simply “nodes”) connected to a network. In such a parallel processing apparatus, nodes may communicate with each other during execution of the subtasks. Therefore, an application program for the parallel processing apparatus may be created using a communication library, such as a Message Passing Interface (MPI) library. The communication library eliminates the need for a user to define the detailed procedure for inter-node communication in the application program.
There has been proposed a distributed-memory parallel computing system in which each node sends data to all nodes other than itself. In this proposed parallel computing system, a plurality of nodes are able to perform mutual communication in 2^n phases. In each phase, each node performs an exclusive OR operation on its own identification number and the phase number, and selects, as its communication partner, the node whose identification number matches the result of the exclusive OR operation.
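The exclusive OR partner selection described above can be illustrated with a minimal sketch. It assumes that the number of nodes is a power of two, that identification numbers run from 0 upward, and that phase numbers start at 1 (phase 0 would pair each node with itself). These conventions and the names `xor_partner` and `xor_schedule` are hypothetical and are not taken from the cited publication.

```python
def xor_partner(node_id: int, phase: int) -> int:
    """Return the partner of node_id in the given phase (assumed rule: id XOR phase)."""
    return node_id ^ phase


def xor_schedule(num_nodes: int) -> list[list[tuple[int, int]]]:
    """Build, for each phase, the list of (node, partner) pairs.

    Assumes num_nodes is a power of two so that every XOR result
    is a valid identification number.
    """
    if num_nodes < 1 or num_nodes & (num_nodes - 1):
        raise ValueError("num_nodes must be a power of two")
    schedule = []
    # Phase 0 is skipped because a node XOR 0 is the node itself.
    for phase in range(1, num_nodes):
        pairs = [(node, xor_partner(node, phase)) for node in range(num_nodes)]
        schedule.append(pairs)
    return schedule


if __name__ == "__main__":
    for phase, pairs in enumerate(xor_schedule(4), start=1):
        print(f"phase {phase}: {pairs}")
```

Because exclusive OR is its own inverse, the pairing is symmetric within a phase: if node A selects node B, then node B selects node A. Under the stated assumptions, each node also meets every other node exactly once over the phases.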
See, for example, Japanese Laid-open Patent Publication No. 11-110362.
If a certain node may communicate with a plurality of other nodes, the node may prepare, in its memory, individual receive buffers corresponding one-to-one to the other nodes in order to improve the efficiency of the inter-node communication. However, preparing individual receive buffers for all the other nodes causes memory usage to grow with the number of nodes, which is a problem.
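The growth in memory usage can be made concrete with a short arithmetic sketch. The function name `receive_buffer_bytes` and the node count and buffer size in the usage lines are hypothetical values chosen only for illustration. They do not describe any particular system.

```python
def receive_buffer_bytes(num_nodes: int, buffer_bytes: int) -> int:
    """Memory one node needs for one receive buffer per other node.

    Assumes every individual receive buffer has the same size.
    """
    if num_nodes < 1 or buffer_bytes < 0:
        raise ValueError("num_nodes must be >= 1 and buffer_bytes >= 0")
    return (num_nodes - 1) * buffer_bytes


if __name__ == "__main__":
    # Hypothetical figures: 64 KiB per buffer, varying node counts.
    for nodes in (16, 1024, 65536):
        total = receive_buffer_bytes(nodes, 64 * 1024)
        print(f"{nodes} nodes: {total / 2**20:.1f} MiB per node")
```

Under these assumptions, the per-node memory devoted to receive buffers increases linearly with the number of nodes, regardless of how many of those nodes actually send data.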