1. Technical Field
The disclosure and claims herein generally relate to multi-node computer systems, and more specifically relate to dynamic distribution of compute nodes with respect to I/O nodes on a multi-node computer system.
2. Background Art
Supercomputers and other multi-node computer systems continue to be developed to tackle sophisticated computing jobs. One type of multi-node computer system is a massively parallel computer system. A family of such massively parallel computers is being developed by International Business Machines Corporation (IBM) under the name Blue Gene. The Blue Gene/L system is a high-density, scalable system in which the current maximum number of compute nodes is 65,536. The Blue Gene/L node consists of a single ASIC (application-specific integrated circuit) with 2 CPUs and memory. The full computer is housed in 64 racks or cabinets with 32 node boards in each rack.
Computer systems such as Blue Gene have a large number of nodes, each with its own processor and local memory. The nodes are connected by several communication networks. One communication network connects the nodes in a logical tree network. In the logical tree network, the nodes are connected to an input-output (I/O) node at the top of the tree.
In Blue Gene, there are 2 compute nodes per node card, each with 2 processors. A node board holds 16 node cards and each rack holds 32 node boards. A node board has slots to hold 2 I/O cards that each have 2 I/O nodes. Thus, a fully loaded node board has 4 I/O nodes for 32 compute nodes. The nodes on two node boards can be configured in a virtual tree network that communicates with the I/O nodes. For two node boards there may be 8 I/O nodes that correspond to 64 compute nodes. If the I/O node slots are not fully populated, then there could be 2 I/O nodes for 64 compute nodes. The distribution of I/O nodes to compute nodes may thus vary between 1/64 and 1/8. The I/O node to compute node ratio can be defined as 1/8, 1/32, 1/64 or 1/128 (I/O to compute). In the prior art, the distribution of the I/O nodes is static once a block is configured.
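The node counts and ratios stated above can be checked with a short calculation. The following Python sketch is illustrative only: the constant and function names are assumptions introduced here, and the counts are taken directly from the description above.

```python
from fractions import Fraction

# Hardware counts as stated in the description (names are illustrative).
COMPUTE_NODES_PER_NODE_CARD = 2
NODE_CARDS_PER_NODE_BOARD = 16
NODE_BOARDS_PER_RACK = 32
RACKS = 64
IO_CARDS_PER_NODE_BOARD = 2
IO_NODES_PER_IO_CARD = 2

# Derived counts.
compute_per_board = COMPUTE_NODES_PER_NODE_CARD * NODE_CARDS_PER_NODE_BOARD
io_per_full_board = IO_CARDS_PER_NODE_BOARD * IO_NODES_PER_IO_CARD
total_compute = compute_per_board * NODE_BOARDS_PER_RACK * RACKS


def io_ratio(io_nodes: int, compute_nodes: int) -> Fraction:
    """Return the I/O node to compute node ratio as an exact fraction."""
    return Fraction(io_nodes, compute_nodes)


# One fully loaded node board: 4 I/O nodes for 32 compute nodes.
full_board_ratio = io_ratio(io_per_full_board, compute_per_board)

# Two node boards, fully populated: 8 I/O nodes for 64 compute nodes.
two_board_full_ratio = io_ratio(2 * io_per_full_board, 2 * compute_per_board)

# Two node boards, partially populated: 2 I/O nodes for 64 compute nodes.
two_board_partial_ratio = io_ratio(2, 2 * compute_per_board)

print(compute_per_board, total_compute)
print(full_board_ratio, two_board_full_ratio, two_board_partial_ratio)
```

The per-board count of 32 compute nodes multiplied over 32 node boards and 64 racks reproduces the 65,536 compute node maximum given for Blue Gene/L.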
The Blue Gene computer can be partitioned into multiple, independent blocks. Each block is used to run one job at a time. A block consists of a number of ‘processing sets’ (psets). Each pset has an I/O node and a group of compute nodes. The compute nodes run the user application, and the I/O nodes are used to access external files and networks.
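The relationship between a block and its psets can be pictured as a simple data structure. The following Python sketch is a hypothetical illustration, not the Blue Gene control system: the `Pset` class, the `build_block` function, the node identifiers, and the even split of compute nodes among I/O nodes are all assumptions made here to show a static assignment of the kind described.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Pset:
    """A processing set: one I/O node and the compute nodes it serves."""
    io_node: int
    compute_nodes: List[int] = field(default_factory=list)


def build_block(compute_nodes: List[int], io_nodes: List[int]) -> List[Pset]:
    """Statically group compute nodes into psets, one pset per I/O node.

    Assumption: compute nodes are split evenly and contiguously among the
    I/O nodes. The grouping is fixed once the block is built.
    """
    if not io_nodes:
        raise ValueError("a block needs at least one I/O node")
    if len(compute_nodes) % len(io_nodes) != 0:
        raise ValueError("compute nodes must divide evenly among I/O nodes")
    size = len(compute_nodes) // len(io_nodes)
    return [
        Pset(io_node=io, compute_nodes=compute_nodes[i * size:(i + 1) * size])
        for i, io in enumerate(io_nodes)
    ]


# Hypothetical block: 64 compute nodes (ids 0..63) and 2 I/O nodes
# (ids 100 and 101), giving a 1/32 I/O to compute ratio.
block = build_block(list(range(64)), [100, 101])
for pset in block:
    print(pset.io_node, len(pset.compute_nodes))
```

In this sketch each pset keeps the same compute nodes for the life of the block, which mirrors the static distribution of the prior art.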
With the communication networks as described above, applications or “jobs” loaded on nodes execute with a fixed I/O to compute node ratio. Without a way to dynamically distribute the I/O nodes to adjust the ratio of I/O to compute nodes based on the I/O characteristics of the work being performed on the system, multi-node computer systems will continue to suffer from reduced efficiency.