The invention relates to parallel processing systems, and in particular to a protocol for interprogram communication in such systems.
In parallel computing, it is typical for communication to be provided among the processes, or "tasks", of a parallel application program, or "job". Usually, such communication requires the setting up of some information in the communication system software. For example, routing tables often are required to provide a translation between a logical task number within the job (typically numbered 0 to n-1), and a physical port or address within the parallel computer. Such information is usually partially set up when the computer is initialized, with the remainder of the information being set up when the job is initialized.
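By way of illustration only, the translation performed by such a routing table can be sketched as follows. The names `PhysicalAddress` and `build_routing_table`, the node names, and the port numbering scheme are assumptions made for this sketch and are not taken from any particular communication system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalAddress:
    node: str  # hypothetical node name within the parallel computer
    port: int  # hypothetical port on that node


def build_routing_table(nodes, base_port=5000):
    """Map logical task numbers 0..n-1 to physical addresses.

    The assignment of one port per task, counted up from base_port,
    is an arbitrary choice made for this sketch.
    """
    return {
        task: PhysicalAddress(node, base_port + task)
        for task, node in enumerate(nodes)
    }


# A job of three tasks placed on three nodes when the job is initialized.
table = build_routing_table(["node-a", "node-b", "node-c"])
```

A task that wishes to send to logical task 1 would look up `table[1]` to obtain the physical address; tasks outside the job have no such table, which is the gap addressed below.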
There is a need to provide communication, not only among the tasks of a parallel job, but also from the tasks of the parallel job to tasks outside the job. Such communication could occur, for example, between two separately initiated programs running at the same time, and wishing to exchange data between themselves. Such communication could also occur between a parallel application program (a parallel "client"), and another parallel program which provides a service within the parallel computer (a parallel "server").
Several problems arise in trying to facilitate communication between a parallel application program and another parallel program providing a service within the parallel computer. The typical approach has been to "hard code" (i.e., to set and make known a priori) the addresses of the processes of one program, and to make these addresses known system wide. Another program wishing to communicate with the first program simply sends messages to the known addresses. This approach has several drawbacks:
1) It does not provide any dynamic information about running programs. If a server program is not currently executing for some reason, there is no way for a client program to become aware of this, as it has only the hard coded address information about the server.
2) There is no control over transient information about the client and the server; for example, some of the nodes of the server may be aware of the client while others are not.
3) There is no way for a server to have complete information about a client, as it receives messages from individual tasks of the client without first receiving notification that such messages are to be sent, or what the full set of client nodes of a particular application is. If a server task receives a message from a task of a previously unknown client program, that server task will not be aware of which other tasks are also members of the client program, the number of such tasks, or what their physical addresses are. Also, the other tasks of the server will not know anything about any of the client tasks. Such information is usually required for proper management of server resources.
The problem is even more difficult in the case where two application programs wish to communicate. In this case, neither program is run on a predetermined or fixed set of nodes. The set of nodes on which a program is run is usually determined when the program is initiated. There is no simple way to hard-code the addresses of one program's tasks in a way that the other program's tasks can discover them.
Furthermore, in the case of either client-server or interprogram communication, there is no way to ensure that one program is made aware of the termination of the other program. Potentially, this could leave one program blocked (i.e., unable to proceed until it receives a response) while trying to communicate with the other program.
Thus, there is a clear need for a flexible protocol for establishing communication between two parallel programs.
This invention provides a method for establishing a communication connection between two parallel programs, each running on multiple processors of a distributed memory parallel computer, such as an IBM SP2, or on multiple computers in a cluster of workstations or a set of network connected workstations. Such a method is necessary in an environment where communication among processes of separate parallel programs is desired, and where some control, such as authentication of the program with which communication is desired and authorization to connect to that program, is also desired.
One application of this method is to provide communication sessions between parallel clients and parallel servers in parallel computers, for example, to link parallel application programs to a parallel file system.
Thus, the invention is a family of protocols that constitute a method for establishing a connection or communication session between two programs, each having one or more tasks and running on a plurality of processors, wishing to communicate with each other. The protocol family includes all asymmetrical protocols for establishing client-server connections between parallel clients and parallel servers, parallel clients and serial servers, and serial clients and parallel servers. The protocol family also includes asymmetrical protocols for establishing connections or communication sessions between two peer parallel programs that wish to intercommunicate. Protocols that establish communication between one parallel and one serial program are also included. The invention includes all protocols that require one of the two programs that wish to communicate to actively initiate the communication session, while the other program passively accepts such communication session initiations.
A first element of the invention is that there is a basic asymmetry to the protocol, with one program actively requesting the connection, and the other program passively accepting or rejecting it.
A second element of the invention is that no task of the active program of the connection will attempt to send messages or otherwise communicate with the tasks of the passive program that accepts the connection until after it has been notified that all the passive program tasks are prepared to receive messages from any task of the active program, and also that all other active program tasks are prepared to receive messages from any passive program task.
A third element of the invention is that no task of the passive program will attempt to send messages to any task of the active program until it is certain that all the tasks of the active program are prepared to receive messages from any task of the passive program, and also that all other passive program tasks are prepared to receive messages from any active program task.
A fourth element of the invention is that the tasks of the passive program remain free running during the entire connection establishment. The tasks of the active program are also free to run for the duration of the connection establishment, as long as they do not attempt to communicate with the passive program until they are informed that the connection has been established.
A fifth element of the invention is that the connection protocol is mediated through a secondary, indirect communication channel between the tasks of the active program and the tasks of the passive program.
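The ordering constraints stated in the five elements above can be illustrated with a minimal single-process sketch. It is not the protocol of the invention: the class `ConnectionMediator`, its method names, and its state names are hypothetical, and the indirect communication channel is reduced to an in-memory object. The sketch shows only that one program requests, the other accepts or rejects, and no task on either side is permitted to send until every task of both programs has reported that it is prepared to receive.

```python
class ConnectionMediator:
    """Hypothetical stand-in for the indirect channel that mediates setup."""

    def __init__(self, active_tasks, passive_tasks):
        # Logical task numbers expected on each side of the connection.
        self.expected = {
            "active": set(range(active_tasks)),
            "passive": set(range(passive_tasks)),
        }
        self.ready = {"active": set(), "passive": set()}
        self.state = "idle"

    def request(self):
        """The active program asks for a connection."""
        if self.state != "idle":
            raise RuntimeError("connection already requested")
        self.state = "requested"

    def respond(self, accept):
        """The passive program accepts or rejects the pending request."""
        if self.state != "requested":
            raise RuntimeError("no pending request")
        self.state = "accepted" if accept else "rejected"

    def mark_ready(self, side, task):
        """Record that one task is prepared to receive from the other program."""
        if self.state != "accepted":
            raise RuntimeError("connection has not been accepted")
        if task not in self.expected[side]:
            raise ValueError("unknown task number")
        self.ready[side].add(task)

    def may_send(self):
        """True only once every task of both programs has reported ready."""
        return self.state == "accepted" and self.ready == self.expected


# Usage: a 2-task active program connecting to a 3-task passive program.
mediator = ConnectionMediator(active_tasks=2, passive_tasks=3)
mediator.request()
mediator.respond(accept=True)
for task in range(3):
    mediator.mark_ready("passive", task)
mediator.mark_ready("active", 0)
blocked_before_all_ready = not mediator.may_send()  # active task 1 still missing
mediator.mark_ready("active", 1)
open_after_all_ready = mediator.may_send()
```

In this sketch the tasks themselves are never suspended; they simply consult `may_send()` before communicating with the other program, which mirrors the free-running behavior described in the fourth element.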