Parallel supercomputing clusters are used for numerical simulation of complex systems and processes, for example in weather forecasting, climate change modeling, and scientific research. The compute nodes perform similar computations of the simulation in parallel and exchange intermediate results over a mesh of high-speed interconnections using the Message Passing Interface (MPI). MPI is a standardized and portable message passing system designed to function on a wide variety of parallel computers.
Because the time to complete a numerical simulation is often much longer than the mean time to interrupt (MTTI) of a compute node, where an interrupt results from a software or hardware error, a checkpoint technique is used to recover from such an interrupt. The checkpoint technique periodically saves checkpoint data representing the computational state of the compute nodes, so that the supercomputing cluster may recover from the interrupt by restarting execution from a checkpoint saved in data storage.
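The checkpoint technique described above can be illustrated with a minimal single-process sketch. It is not the cluster's actual mechanism: the function names (save_checkpoint, load_checkpoint, run_simulation), the toy state dictionary, and the fail_at hook that simulates an interrupt are all assumptions made for illustration.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write the state to a temporary file, then rename it over `path`,
    so an interrupt during the write never leaves a partial checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the most recently saved state, or None if none exists."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)


def run_simulation(path, total_steps, checkpoint_every, fail_at=None):
    """Advance a toy simulation, resuming from a checkpoint if one exists.

    `fail_at` is a hypothetical hook that simulates an interrupt
    just before the given step is computed.
    """
    state = load_checkpoint(path) or {"step": 0, "value": 0.0}
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated node interrupt")
        # Stand-in for one step of the real computation.
        state["value"] += 0.5 * (state["step"] + 1)
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    return state
```

In this sketch, a run that is interrupted after its last checkpoint loses only the steps computed since that checkpoint; calling run_simulation again with the same path resumes from the saved step rather than from step zero.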
Very large parallel supercomputing clusters have been assembled from rack-mounted commodity servers and switches. The rack-mounted commodity servers for the supercomputing cluster are configured to function either as compute nodes that perform the calculations of the simulation, or as input-output (I/O) nodes that save the checkpoint data in a magnetic disk storage array. One I/O node is sufficient for storing all the checkpoint data from one group of about eight to sixteen compute nodes. The intermediate results and checkpoint data are exchanged between the nodes using high-speed InfiniBand (IB) data links and switches. A slower Gigabit Ethernet network links each of the nodes to a control station computer providing an administrative node for the supercomputer. Such a supercomputer is described in Komernicki et al., “Roadrunner Hardware and Software Overview,” IBM Redbook REDP-4477-00, January 2009, IBM Corporation, Armonk, N.Y.
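The stated ratio of one I/O node per group of about eight to sixteen compute nodes implies a simple sizing calculation, sketched below. The function names and the cluster size of 1000 compute nodes are hypothetical, and the mapping of compute nodes to I/O nodes by contiguous index is an assumption rather than something the cited configuration specifies.

```python
def io_nodes_needed(compute_nodes, group_size):
    """Number of I/O nodes when each one serves up to `group_size`
    compute nodes (ceiling division, so a partial group still gets one)."""
    if compute_nodes < 0 or group_size < 1:
        raise ValueError("compute_nodes must be >= 0 and group_size >= 1")
    return -(-compute_nodes // group_size)


def io_node_for(compute_index, group_size):
    """Index of the I/O node serving a compute node, assuming compute
    nodes are grouped contiguously by index (an illustrative choice)."""
    if compute_index < 0 or group_size < 1:
        raise ValueError("compute_index must be >= 0 and group_size >= 1")
    return compute_index // group_size


if __name__ == "__main__":
    # Hypothetical cluster of 1000 compute nodes.
    for group_size in (8, 16):
        print(group_size, io_nodes_needed(1000, group_size))
```

For the hypothetical 1000 compute nodes, the two ends of the stated range give 125 I/O nodes at eight per group and 63 at sixteen per group.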