1. Field of the Invention
The present invention relates to large-scale computer processing, and more specifically to distributed computing using distributed libraries such as Message Passing Interface (MPI) libraries.
2. Description of Related Art
In large-scale distributed computer systems, such as those using an MPI library, file accesses for input and output operations are performed by input/output (I/O) agents. An MPI job is executed by a number of tasks. The tasks are individual processes that carry out the operations required by the MPI job. Traditionally, every task is an I/O agent. The library assigns I/O operations to the I/O agents in a round-robin fashion, with each I/O agent handling a different portion of the file that the I/O operations are accessing.
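By way of illustration only, the round-robin assignment described above can be sketched as follows. The sketch is written in Python and does not call any MPI library; the function name, the fixed block size, and the block-to-agent mapping are assumptions made for this example and are not taken from any particular MPI implementation.

```python
def assign_blocks_round_robin(file_size, block_size, num_agents):
    """Return a dict mapping each agent rank to the (offset, length)
    file blocks it would handle under a round-robin assignment.

    Hypothetical helper: the name, the fixed block size, and the
    mapping rule are illustrative assumptions, not a library API.
    """
    if file_size < 0 or block_size <= 0 or num_agents <= 0:
        raise ValueError("file_size must be >= 0; block_size and num_agents must be > 0")

    # In the traditional scheme every task is an I/O agent, so
    # num_agents equals the number of tasks in the job.
    assignments = {rank: [] for rank in range(num_agents)}

    offset = 0
    block_index = 0
    while offset < file_size:
        # The last block may be shorter than block_size.
        length = min(block_size, file_size - offset)
        # Block i goes to agent i mod num_agents.
        assignments[block_index % num_agents].append((offset, length))
        offset += length
        block_index += 1
    return assignments
```

In this sketch, a 10-byte file split into 4-byte blocks across two agents gives agent 0 the blocks at offsets 0 and 8 and agent 1 the block at offset 4. When the number of agents grows with the number of tasks, each agent receives a smaller share of the file while the number of participants that must be coordinated grows.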
Assigning every task as an I/O agent works effectively when an MPI job has a small number of tasks. As the number of tasks in an MPI job increases, performance deteriorates because of the overhead the library incurs in coordinating I/O operations. Performance may also deteriorate because many I/O agents may be performing I/O operations on the same compute node(s) of a system at the same time, and these I/O agents compete for the same set of computer resources. For example, if I/O agents executing on a particular compute node are simultaneously assigned to handle I/O operations on corresponding portions of a large file, such as a database, these I/O agents will compete with each other for processor execution resources, access to the network disk control unit, and other computer resources. Further, for I/O operations involving large files, wait states can occur in the MPI job. If many or all of the tasks in an MPI job are performing I/O operations on a large file, the tasks may not be able to perform other operations required by the MPI job until they finish their I/O operations. In dynamic MPI jobs, i.e., those involving the instantiation of one or more additional groups of tasks, referred to as “worlds,” after the MPI job has begun, fixed I/O agent assignment is not practical, because the number of tasks can increase or decrease after the job has begun and an optimal allocation of I/O agents therefore cannot be predetermined.
Therefore, it would be desirable to provide a method and system to selectively assign tasks in an MPI job as I/O agents to improve performance. Further, it would be desirable to provide a method for selectively assigning tasks as I/O agents in a dynamic MPI job in order to perform I/O operations using tasks that did not exist at the initialization of the job, and for performing I/O operations for worlds that did not exist at the initialization of the job.