1. Field of the Invention
The present invention relates to a system and method for executing an MPI job in a grid computing system, and more particularly to a file-based grid MPI job allocation system and method for a grid computing system in which computers are distributed and connected to each other through a network as computational nodes, wherein the grid MPI job allocation system separates the functions of the middleware from those of the MPI program, thereby achieving MPI initialization without intervention of a separate arbitration process.
2. Description of the Prior Art
As generally known in the art, grid computing is a technology for effectively constructing a high-performance infrastructure by integrating various kinds of computing resources connected through a network. A grid computing environment constructed based on such a technology differs from the current internet environment in many aspects. The grid computing environment allows sharing of various kinds of computing resources, in addition to the sharing of simple information that the internet environment basically provides. In an actual grid environment, a user is naturally expected to be able to use various available resources simultaneously.
Currently, research on grid computing is being actively conducted worldwide, and tools for supporting the grid environment based on a configuration called ‘Open Grid Services Architecture (OGSA)’ are being developed with reference to existing web service models. The term ‘OGSA’ indicates a specification for a configuration of grid services which can be linked with each other. The OGSA has been achieved by revising the existing web service models while mainly focusing on the characteristics required by grid construction and grid applications, and is now recognized as a new configuration of a middleware for grid computing. Grid computing applications have so far used the Globus toolkit, which is a standard middleware in the art and has been developed up to version 3.0 based on the OGSA.
In a framework based on the OGSA as described above, all functional elements are expressed as grid services, and the state of each of the grid services is expressed by a standardized method, i.e., by service data.
The Message Passing Interface (MPI) is a standard interface which enables application scientists to execute a parallel program on a high-performance computer, and is a parallel-processing library based on a message transfer technique. All processes participating in an MPI parallel program can carry out a particular program by exchanging messages with each other, each process being identified by its own ID (or rank). Therefore, each of the processes must first understand its own role (rank), the entire configuration, and the location of each counterpart in the entire program. This job is performed by the MPI_Init function, and the MPI application can be performed only after the MPI_Init process. Scientists are expected to prefer reusing existing MPI code over writing a new application program by means of a new interface in a grid environment.
Meanwhile, currently known Application Program Interfaces (APIs) for performing jobs using an MPI code in a grid environment include MPICH-G2 and MPICH-P4. Both the MPICH-G2 and the MPICH-P4 have a centralized initialization scheme, in which an intermediary process is located at the center and must continuously mediate the exchange of information between processes. Therefore, when the location of the intermediary process has a large influence on the performance, the dependence on the middleware may become very large according to the type of the intermediary process.
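The centralized initialization scheme described above can be illustrated with a minimal sketch. It is not the MPICH-G2 or MPICH-P4 implementation; the class `Intermediary`, the function `centralized_init`, and the address strings are hypothetical names chosen only to show that every exchange passes through the single intermediary process.

```python
from dataclasses import dataclass, field


@dataclass
class Intermediary:
    """Hypothetical central process that relays all initialization data."""
    expected: int
    table: dict = field(default_factory=dict)
    messages_handled: int = 0

    def register(self, rank, address):
        # Every process must report its own address to the center.
        self.table[rank] = address
        self.messages_handled += 1

    def fetch_table(self):
        # The center can answer only after all processes have registered.
        if len(self.table) < self.expected:
            raise RuntimeError("initialization incomplete")
        self.messages_handled += 1
        return dict(self.table)


def centralized_init(addresses):
    """Return each rank's view of the address table and the center's load."""
    center = Intermediary(expected=len(addresses))
    for rank, address in enumerate(addresses):
        center.register(rank, address)
    views = {rank: center.fetch_table() for rank in range(len(addresses))}
    return views, center.messages_handled
```

In this sketch the intermediary handles two messages per process (one registration and one table request), so its load grows with the number of processes, and no process can finish initialization if the intermediary is unreachable.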
That is to say, in order to enable MPI communication across different cluster environments, the existing Globus toolkit 2.x uses an API named DUROC (Dynamically-Updated Request Online Coallocator). When an MPI job is initialized using the DUROC API, as in the MPICH-G2, which is one example of the MPI implementations available in the Globus toolkit 2.x, the allocation of the MPI job to resources is performed through the DUROC API, and the MPI processes in each resource exchange the information required for the initialization.
More specifically, as shown in FIG. 1, the MPICH-G2 is an extended version of the MPICH for the grid, which has been developed by the Argonne National Laboratory (ANL). The MPICH-G2 utilizes the functions of the Globus toolkit in all the steps in the process of executing the MPI, such as submission of the Globus toolkit job, communication, etc.
In other words, the MPI initialization in the MPICH-G2 is a centralized initialization using the DUROC. Referring to FIG. 1, the DUROC is contained in the programs named ‘globusrun’ and ‘globus-job-manager’. The globusrun manages the jobs of the computers distributed in the network and connected to the center through the network, and helps message transmission between the computers in the MPI initialization. Therefore, the initialization of the process depends heavily on the DUROC component, and the centralized initialization forces the center to understand all the information of the network necessary for the management.
Further, when using the MPICH-P4, which is a basic library of the MPICH and performs the MPI function through a communication module of P4, the user uses the mpicc for compiling and uses the mpirun program for starting. In the MPICH-P4, a single process is first generated for initialization and is then used to produce the other processes. When the mpirun is called, the mpirun generates an environment parameter file named ‘PIXXX’ and generates a single a.out process. Herein, although the a.out is an execution program compiled by the user, the a.out executed first is also used as a start program for generating the other processes. The PIXXX file enables the a.out to understand the location at which each of the other processes must be positioned, and the a.out generates the process at that location by means of rsh. When a process is generated using the rsh, the rank of the process and the address of the master are given simultaneously, and the slave nodes exchange their information by communicating with the master node.
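The start-up sequence described above can be sketched as follows. This is only an illustration under stated assumptions: the function `plan_launch`, the host names, the port, and the `--rank` and `--master` arguments are hypothetical and do not reproduce the actual PIXXX file format or the real MPICH-P4 command-line arguments. The sketch only builds the remote launch commands and does not execute rsh.

```python
def plan_launch(hosts, program, master_port):
    """Build the remote launch commands a P4-style master might issue.

    hosts[0] is assumed to run the first process (rank 0, the master);
    every other host receives one process started through rsh.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    master_address = f"{hosts[0]}:{master_port}"
    commands = []
    for rank, host in enumerate(hosts[1:], start=1):
        # Each slave is given its rank and the master's address at start-up
        # (argument names are hypothetical).
        commands.append(
            ["rsh", host, program, f"--rank={rank}", f"--master={master_address}"]
        )
    return master_address, commands
```

For example, `plan_launch(["n0", "n1", "n2"], "./a.out", 5000)` yields the master address `n0:5000` and one command each for `n1` and `n2`, which reflects that every slave learns where the master is before it learns anything about the other slaves.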
In the MPICH-P4 as described above, the master process of the user performs the arbitration, while the computing processes communicate with the master process in order to understand the positional information of the other processes. Therefore, the MPICH-P4 inevitably depends on the middleware because it is a centralized type, although the dependence is low.
MPI job allocation systems and methods such as the MPICH-G2 and the MPICH-P4, in which communication is performed based on an arbitration process such as the DUROC API, are middleware-dependent: the API corresponding to the arbitration process must be remade when the grid middleware changes. Accordingly, whenever the standard for the middleware changes, a new API corresponding to a new arbitration process must be developed again.
Therefore, there has been a strong demand for an MPI job allocation system and method capable of executing the MPI in a multiple cluster environment, which is an actual grid environment, independently of the middleware, by employing a scheme different from that of the MPICH-G2 or the MPICH-P4, even under a middleware, such as the Globus toolkit 3.0 which is a current standard in the art, that lacks the DUROC API enabling communication with an MPI process in another resource.