Embodiments disclosed herein relate to techniques for loading programs for a multiple program multiple data (MPMD) job on a parallel computing system.
The MPMD programming model for High Performance Computing (HPC) allows multiple programs to run in the same job across multiple tasks. As used herein, a “task” is a process, or multiple processes, running on a compute node of a parallel computing system. A collection of such tasks for performing a computation is referred to herein as a “job.” For example, a weather job may include separate programs, each running across a number of tasks, that simulate the atmosphere, ocean currents, radiative flux from the sun, etc. These programs may communicate via, e.g., the Message Passing Interface (MPI), to coordinate the atmospheric, ocean current, and radiative flux simulations performed during the job.
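As a purely illustrative sketch of the terminology above, the following Python models a job as a list of tasks, each bound to one program. The names Task, build_job, and the program names are hypothetical and introduced only for illustration; assigning contiguous rank ranges to each program is an assumption, not a requirement of the MPMD model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    rank: int      # position of the task within the job
    program: str   # name of the program this task runs


def build_job(layout):
    """Build a job from (program, task_count) pairs.

    Ranks are handed out contiguously in the order the programs are
    listed (an assumption made for this sketch).
    """
    tasks = []
    for program, count in layout:
        for _ in range(count):
            tasks.append(Task(rank=len(tasks), program=program))
    return tasks


# Hypothetical weather job: three distinct programs in one job.
job = build_job([("atmosphere", 4), ("ocean", 2), ("radiation", 2)])
unique_programs = {task.program for task in job}
```

Here the job holds eight tasks but only three unique programs, which is the defining property of an MPMD job: many tasks, few programs.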
When an MPMD job is started, the operating system on each compute node of the parallel computing system loads the programs for the tasks that are to run on that node. Typically, a large number (e.g., millions) of tasks across multiple compute nodes participate in an MPMD job, even though the job may include only a few (e.g., 10) unique programs. At load time, each compute node makes a request to a file system to load the programs required for the job. Such simultaneous attempts to load the same few programs are difficult for file systems to handle and thus degrade performance. The replicated transmission of the same program data across the network from the file system to multiple nodes also degrades performance.
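The scale of the redundancy described above can be illustrated with a short Python sketch. It assumes that each node issues one file system request per distinct program it needs; the function name, node count, and program placement are hypothetical values chosen for illustration only.

```python
def count_load_requests(node_programs):
    """Return (requests, unique) for a mapping of node -> set of programs.

    requests: total loads issued when every node loads independently,
              assuming one request per distinct program per node.
    unique:   number of distinct programs in the whole job.
    """
    requests = sum(len(programs) for programs in node_programs.values())
    unique = len(set().union(*node_programs.values()))
    return requests, unique


# Hypothetical placement: 1,000 nodes, 10 programs, 2 programs per node.
node_programs = {
    node: {f"prog{node % 10}", f"prog{(node + 1) % 10}"}
    for node in range(1000)
}

requests, unique = count_load_requests(node_programs)
redundancy = requests / unique  # loads issued per unique program
```

Under these assumptions the file system receives 2,000 requests for only 10 unique programs, i.e., each program is requested and transmitted 200 times on average.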