A computer system such as a cluster computing system includes a number of computers (hereinafter called “nodes”) that perform parallel processing simultaneously. In such a computer system, collisions or congestion in communications between the nodes may become a bottleneck.
In the related art technologies, a communication pattern is designed for the application programs (hereinafter called “applications”) executed by the nodes so that collisions do not occur in communications between the nodes. Such a communication pattern is set in the applications, and each of the nodes performs communication according to the communication pattern set in the applications, which may prevent congestion of the communications.
However, it may be difficult to design an optimal communication pattern for applications that change their communication partner nodes based on input data as the processing progresses. This is because dynamic changes of the communication partners make the communication pattern difficult to predict.
With respect to such applications, no specific actions have generally been taken, in the expectation that the likelihood of collision occurrence is low owing to the randomness of the communication partners. Moreover, it is generally considered that, even if a collision does occur, the throughput may be controlled by the bandwidth control and flow control of TCP/IP, or by a frame collision avoidance technology for a local area network (LAN).
However, in applications that handle a large amount of messaging communications with random communication partners, there is a likelihood that, when the entire system is viewed as a whole, a certain amount of communications may temporarily converge on some of the nodes. Accordingly, the likelihood of collision occurrence may not necessarily be low.
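The temporary convergence described above may be illustrated by a minimal simulation sketch. The sketch assumes a simplified model that is not part of the related art itself: in each round, every node sends exactly one message to a uniformly random other node, and the largest number of messages arriving at any single node is recorded. The function names, the parameter names, and the one-message-per-round model are hypothetical and serve for illustration only.

```python
import random
from collections import Counter


def max_receiver_load(num_nodes: int, rng: random.Random) -> int:
    """Simulate one round (assumed model): every node sends one message
    to a uniformly random *other* node. Return the largest number of
    messages that arrive at any single node in that round."""
    if num_nodes < 2:
        raise ValueError("at least two nodes are required")
    arrivals = Counter()
    for sender in range(num_nodes):
        # Draw from the num_nodes - 1 other nodes, skipping the sender itself.
        dest = rng.randrange(num_nodes - 1)
        if dest >= sender:
            dest += 1
        arrivals[dest] += 1
    return max(arrivals.values())


def average_max_load(num_nodes: int, rounds: int, seed: int = 0) -> float:
    """Average the per-round maximum receiver load over several rounds."""
    rng = random.Random(seed)
    total = sum(max_receiver_load(num_nodes, rng) for _ in range(rounds))
    return total / rounds


if __name__ == "__main__":
    for n in (16, 64, 256):
        print(n, average_max_load(n, rounds=200))
```

Under this assumed model, a perfectly balanced round would deliver exactly one message to each node. An average maximum above one therefore indicates that several senders selected the same destination in the same round, which is the kind of temporary convergence that random partner selection does not rule out.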
Further, the bandwidth control and flow control of TCP/IP may take a long time to become effective. In the frame collision avoidance technology for the LAN, packets may be lost at a LAN switch, or overhead may occur due to the protocol stacks in the nodes and/or due to the switching control performed by the LAN switch.
As a result, communication congestion occurs in part of the network, which prevents the entire system from achieving the performance that it could otherwise deliver.