In the field of High Performance Computing (HPC), a number of computers (hereinafter referred to as nodes) are connected to perform parallel computing. Topology choices for connecting nodes include the mesh interconnect and the torus interconnect. The mesh interconnect is a topology in which nodes are arranged along a plurality of axial directions in a mesh, and adjacent nodes in each axial direction are connected to each other by a high-speed interconnect network. The torus interconnect is a topology in which nodes are interconnected in a mesh topology and the two end nodes of each axis are additionally connected to each other. A network may use a mesh interconnect or a torus interconnect on all axes, or may use a mesh interconnect on some axes and a torus interconnect on the other axes.
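The difference between the two topologies can be illustrated with a minimal sketch. The function name `neighbors` and the parameter `torus_axes` are hypothetical names introduced only for this illustration; the sketch assumes nodes are addressed by integer coordinates on a regular grid.

```python
def neighbors(coord, dims, torus_axes=()):
    """Return the nodes directly linked to `coord` in a grid of shape `dims`.

    Axes listed in `torus_axes` wrap around (torus interconnect);
    all other axes stop at the edge (mesh interconnect).
    """
    result = []
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            pos = coord[axis] + step
            if axis in torus_axes:
                pos %= size      # wraparound link between the two end nodes
            elif not 0 <= pos < size:
                continue         # mesh axis: no link beyond the edge
            neighbor = coord[:axis] + (pos,) + coord[axis + 1:]
            # Skip self-links and duplicates, which arise on very short axes.
            if neighbor != coord and neighbor not in result:
                result.append(neighbor)
    return result


corner = (0, 0)
mesh_links = neighbors(corner, (4, 4))                      # mesh on all axes
torus_links = neighbors(corner, (4, 4), torus_axes=(0, 1))  # torus on all axes
mixed_links = neighbors(corner, (4, 4), torus_axes=(1,))    # mesh on X, torus on Y
```

In this sketch, a corner node of a 4 x 4 network has two links in the mesh case, four in the torus case, and three in the mixed case.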
Nodes are assigned to each job that is executed in an HPC system. While a plurality of jobs are being executed, the nodes assigned to each of the jobs may perform inter-node communication via a common node. In this case, simultaneous data communication for a plurality of jobs that share a communication route causes interference in communication. When such interference occurs, the communication takes longer than expected. If it occurs many times, the jobs may not be completed within the expected time period.
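The sharing of a communication route described above can be sketched as follows. The routing scheme here (X direction first, then Y direction, on a two-dimensional mesh) is an assumption made purely for illustration and is not stated in the text; `xy_route` and `shared_links` are hypothetical helper names. Treating links as undirected is likewise a simplification.

```python
def xy_route(src, dst):
    """Links traversed from src to dst on a 2D mesh, X first and then Y.

    Dimension-order routing is an assumption made for this sketch.
    """
    links = []
    x, y = src
    while x != dst[0]:
        next_x = x + (1 if dst[0] > x else -1)
        links.append(((x, y), (next_x, y)))
        x = next_x
    while y != dst[1]:
        next_y = y + (1 if dst[1] > y else -1)
        links.append(((x, y), (x, next_y)))
        y = next_y
    return links


def shared_links(route_a, route_b):
    """Undirected links that appear in both routes."""
    links_a = {frozenset(link) for link in route_a}
    links_b = {frozenset(link) for link in route_b}
    return links_a & links_b


# Job A holds nodes (0,0) and (3,0); job B holds (1,0) and (2,0), which lie
# between them, so job A's traffic passes through job B's nodes.
interleaved = shared_links(xy_route((0, 0), (3, 0)), xy_route((1, 0), (2, 0)))

# Jobs placed on separate rows do not share any link.
separated = shared_links(xy_route((0, 0), (3, 0)), xy_route((0, 1), (3, 1)))
```

Under these assumptions, `interleaved` contains the single link between (1,0) and (2,0), which both jobs use, while `separated` is empty.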
To deal with this, there is a technique in which only a group of nodes that are adjacent to one another on a network and that form a submesh or a subtorus (a rectangular shape) is selected, and the nodes in the node group are assigned to a job. In this technique, each job occupies its own submesh or subtorus, which avoids interference in inter-node communication between different jobs.
Another technique for assigning nodes to jobs is, for example, a job management apparatus that efficiently searches for idle nodes forming a contiguous rectangular or cuboid shape as compute nodes to be assigned to a plurality of unit jobs. There is also a technique for optimizing problem layout on a massively parallel supercomputer.
Please see, for example, International Publication Pamphlet No. WO 2012/020474 and Japanese National Publication of International Patent Application No. 2008-516346.
However, in the case where the region of nodes to be assigned to a job is limited to a submesh or subtorus, all nodes in the submesh or subtorus need to be idle (i.e., none of the nodes is executing a job) in order to be assigned to the job. In this case, the following problem may occur: although the network as a whole contains as many idle nodes as the job requests, the nodes cannot be assigned to the job because no submesh or subtorus of sufficient size can be formed from them. That is to say, the node resources are not used efficiently.
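The problem described above can be illustrated with a minimal sketch on a two-dimensional mesh. The function names, the brute-force scan, and the occupancy pattern are all assumptions chosen for illustration; they do not represent the search method of any cited apparatus.

```python
from itertools import product


def find_idle_submesh(busy, width, height):
    """Return the origin (x, y) of the first fully idle width x height submesh.

    `busy[y][x]` is True when node (x, y) is executing a job.
    Returns None when no such submesh exists. Brute-force scan for clarity.
    """
    rows, cols = len(busy), len(busy[0])
    for y0, x0 in product(range(rows - height + 1), range(cols - width + 1)):
        if not any(busy[y][x]
                   for y in range(y0, y0 + height)
                   for x in range(x0, x0 + width)):
            return (x0, y0)
    return None


def idle_count(busy):
    """Number of idle nodes in the whole network."""
    return sum(not cell for row in busy for cell in row)


# 4 x 4 mesh; True marks a busy node (hypothetical occupancy).
grid = [
    [False, True, True, False],
    [True,  True, True, True],
    [True,  True, True, True],
    [False, True, True, False],
]

total_idle = idle_count(grid)
# Every rectangular shape that holds four nodes: 2 x 2, 4 x 1, and 1 x 4.
placements = [find_idle_submesh(grid, w, h) for w, h in ((2, 2), (4, 1), (1, 4))]
```

Here `total_idle` is 4, yet every entry of `placements` is `None`: a job requesting four nodes cannot be placed, because the four idle nodes sit at the corners and do not form a submesh.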
Nodes may be used efficiently if the region of nodes to be assigned to a job is not limited to a submesh or subtorus and any idle nodes may be selected and assigned to the job. This approach, however, may cause interference in communication between jobs. There have been no techniques for minimizing the degradation of performance due to interference in communication between jobs without limiting the region of nodes to be assigned to a job to a submesh or subtorus.