The invention relates generally to the field of digital computer systems, and more particularly to systems and methods for facilitating inter-process communications using message passing communication methodologies. The invention specifically provides systems and methods facilitating inter-process communications among processes using message passing methodologies, in a thread-safe manner.
Computers typically execute programs in one or more processes, each of which comprises one or more threads. Generally, a process will have an associated address space, which is separate and apart from the address spaces associated with other processes. Since each process has its own address space, the likelihood that a process's program code, data and data structures will be corrupted by processing operations in connection with other processes is minimized. On the other hand, all of the threads in a respective process share the same address space, which can lead to problems. For example, since all threads in a process share the same address space, each thread can access program code, data and data structures associated with other threads, and care needs to be taken to regulate such access.
In a number of applications, threads in various processes need to communicate, either to obtain data from threads in other processes or to transmit data to threads in other processes. To accommodate such communication, various communication methodologies have been developed. In one such communication methodology, known as “message passing,” a thread in one process can, as a source thread, transmit data to a thread in another process, as a destination thread, using messages containing the data to be transferred. One popular message passing mechanism, referred to as “MPI” (“Message Passing Interface”), provides a message passing arrangement to facilitate transfer of messages among threads in respective processes. Several MPI specifications define an interface that threads can use to make use of an MPI message passing arrangement.
A number of problems arise in connection with communication among threads in respective processes using message passing mechanisms such as those defined by the MPI specifications. Generally, when a “source” thread in one process is to send a message to a “destination” thread in another process, each thread uses locking functions, such as the “mutex” (“mutual exclusion”) functions available in the Unix operating system, to protect the message passing operation and ensure that it operates in a “thread-safe” manner. In particular, the source thread uses the mutex function to ensure that no other thread in its process attempts to send a message while it is engaged in performing the MPI calls required to initiate the message passing operation. In addition, the destination thread uses the mutex function to ensure that no other thread in its process attempts to receive a message while it (that is, the destination thread) is attempting to do so, which might otherwise result in the other thread erroneously receiving the message directed to the destination thread. However, serializing receive operations using mutexes in such a manner can cause deadlock problems, since it prevents the other threads in the destination thread's process from receiving and processing incoming messages.
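The receive-side problem described above can be illustrated with a minimal Python sketch. It is not MPI code: the per-thread inboxes, the names `recv_mutex` and `serialized_receive`, and the `a_is_waiting` event are all hypothetical stand-ins chosen for illustration. The sketch shows one thread blocking in a receive while holding the process-wide mutex, so that a second thread cannot begin its own receive even though its message has already arrived.

```python
import queue
import threading

# Hypothetical stand-ins: one inbox per destination thread and one
# process-wide mutex that serializes all receive operations.
inboxes = {"A": queue.Queue(), "B": queue.Queue()}
recv_mutex = threading.Lock()
a_is_waiting = threading.Event()  # illustration-only hook


def serialized_receive(name):
    """Blocking receive performed entirely under the process-wide mutex."""
    with recv_mutex:
        if name == "A":
            a_is_waiting.set()      # signal that A now holds the mutex
        return inboxes[name].get()  # blocks while still holding the mutex


def demo_serialized_receive():
    inboxes["B"].put("message for B")  # B's message has already arrived
    result = {}
    thread_a = threading.Thread(
        target=lambda: result.update(A=serialized_receive("A")))
    thread_a.start()
    a_is_waiting.wait()
    # B cannot start its receive: A holds the mutex while it waits.
    b_got_mutex = recv_mutex.acquire(timeout=0.2)
    if b_got_mutex:
        recv_mutex.release()
    inboxes["A"].put("message for A")  # unblock A so the demo terminates
    thread_a.join()
    result["B"] = serialized_receive("B")
    return b_got_mutex, result
```

In this sketch the stall is resolved only because the main thread eventually supplies A's message. If that message depended on work that B had to perform after receiving its own message, neither thread could make progress.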
Another problem arises in connection with collective operations such as, for example, broadcast, barrier synchronization, and reduction operations described in U.S. patent application Ser. No. 09/303,465, filed Apr. 30, 1999, in the name of Rolf H. vandeVaart, et al., entitled System And Method For Facilitating Communication Among A Plurality Of Processes In A Digital Computer System (hereinafter, “the vandeVaart application”), assigned to the assignee of the present application and incorporated by reference. Generally, in collective operations such as those described in the vandeVaart application, one thread in each of a plurality of processes will be engaged in the collective operation, and may be transmitting messages to threads in other processes and/or receiving messages from threads in other processes. In a collective operation, the individual messages transmitted between threads in the respective processes are typically “point-to-point” messages, similar to the messages transmitted between threads in respective processes in a non-collective message passing operation. In a collective operation, a considerable amount of coordination is required as among the threads in the processes that are to be engaged in the collective operation. In addition, collective operations need to be given a higher priority than non-collective message passing operations; otherwise, non-collective operations may prevent collective operations from completing.
The invention provides a new and improved system and method facilitating inter-process communications among processes using message passing methodologies, in a thread-safe manner.
In brief summary, the invention in one aspect provides a collective communications coordinating arrangement for coordinating a collective communications operation among user threads in a plurality of processes, the user threads being configured to communicate using a selected message passing methodology. The collective communications coordinating arrangement comprises a master thread and, associated with each of the processes, a respective slave thread. The slave thread, in response to a collective communications request from a respective user thread in its associated process, generates a collective communications request message for transmission to the master thread. The master thread, after receiving collective communications request messages from all of the slave threads associated with processes that contain threads that are to engage in the collective communications operation, generates a collective communications grant for transmission to the slave threads of all of the processes that contain threads that are to engage in the collective communications operation. In response to a collective communications grant from the master thread, the slave threads enable the respective user threads to engage in the collective communications operation.
In one embodiment, the collective communications grant includes two messages transmitted by the master thread to the slave threads. In response to the initial communication grant message, each slave thread acquires a message transmission regulation lock that regulates transmission of messages by threads in the process. In response to the second communication grant message, each slave thread transfers control to the user thread that is to engage in the collective communications operation.
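The request and two-message grant exchange described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the embodiment itself: each process is modeled by a slave thread and a user thread inside one Python program, in-memory queues stand in for the message passing layer, and the names `GRANT_1`, `GRANT_2`, `NUM_PROCESSES`, `send_lock` and `run_collective` are hypothetical. The sketch also assumes that the user thread releases the transmission lock once its part of the collective operation is done, which the text above does not state.

```python
import queue
import threading

NUM_PROCESSES = 3  # hypothetical number of participating processes


def master(request_q, slave_qs):
    """Wait for a request from every slave, then send both grant messages."""
    for _ in slave_qs:
        request_q.get()          # one collective request message per slave
    for q in slave_qs:
        q.put("GRANT_1")         # first grant: slaves acquire their send lock
    for q in slave_qs:
        q.put("GRANT_2")         # second grant: slaves hand control to users


def slave(rank, user_request, request_q, grant_q, send_lock, go, log):
    user_request.wait()          # user thread requested a collective operation
    request_q.put(rank)          # forward the request to the master
    grant_q.get()                # blocks until the first grant message
    send_lock.acquire()          # regulate message transmission in this process
    log.append((rank, "locked"))
    grant_q.get()                # blocks until the second grant message
    go.set()                     # transfer control to the user thread


def user(rank, user_request, go, send_lock, log):
    user_request.set()           # issue the collective communications request
    go.wait()                    # wait until the slave transfers control
    log.append((rank, "collective"))  # the collective operation would run here
    send_lock.release()          # assumption: user releases the lock afterwards


def run_collective():
    request_q, log, threads, slave_qs = queue.Queue(), [], [], []
    for rank in range(NUM_PROCESSES):
        grant_q, send_lock = queue.Queue(), threading.Lock()
        user_request, go = threading.Event(), threading.Event()
        slave_qs.append(grant_q)
        threads.append(threading.Thread(
            target=slave,
            args=(rank, user_request, request_q, grant_q, send_lock, go, log)))
        threads.append(threading.Thread(
            target=user, args=(rank, user_request, go, send_lock, log)))
    threads.append(threading.Thread(target=master, args=(request_q, slave_qs)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Calling `run_collective()` returns a log in which, for every rank, the `"locked"` entry precedes the `"collective"` entry. The sketch does not make the master wait for acknowledgments between the two grant messages; whether the embodiment does so is not stated above.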
In another aspect, the invention provides a communications coordinating arrangement for coordinating collective and non-collective communications operations among user threads in a process, the user threads being configured to communicate using a selected message passing methodology. The communications coordinating arrangement comprises, associated with each thread that is to engage in a non-collective communications operation, a non-collective communication operation control module and, associated with each thread that is to engage in a collective communications operation, a collective communication operation control module. The non-collective communication operation control module, when the thread is to engage in a non-collective communication operation, initially performs a first lock operation to acquire a non-collective communication regulation lock, which regulates transmission of messages in non-collective communications operations as among threads in the process, and, after it has acquired the non-collective communication regulation lock, performs a second lock operation to acquire a general communication regulation lock. Each thread that is to engage in a non-collective communications operation is configured to not engage in the non-collective communication operation until after it has acquired both the non-collective communication regulation lock and the general communication regulation lock. The collective communication operation control module, when the thread is to engage in a collective communication operation, performs a lock operation to acquire the general communication regulation lock, and each thread that is to engage in a collective communications operation is configured to not engage in the collective communication operation until after it has acquired the general communication regulation lock.
Since the locking sequence for collective communication operations is shorter than the locking sequence for non-collective communication operations, collective operations have somewhat higher priority than non-collective operations.
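The two locking sequences can be sketched in Python as follows. The lock objects, the function names `non_collective_op` and `collective_op`, and the demonstration harness are hypothetical and serve only to illustrate the ordering described above.

```python
import threading
from functools import partial

non_collective_lock = threading.Lock()  # regulates non-collective operations
general_lock = threading.Lock()         # general communication regulation lock


def non_collective_op(action):
    """Two-step sequence: non-collective lock first, then the general lock."""
    with non_collective_lock:
        with general_lock:
            return action()


def collective_op(action):
    """One-step sequence: only the general lock is required."""
    with general_lock:
        return action()


def demo_lock_sequences(num_threads=4):
    state = {"inside": 0, "max_inside": 0, "done": []}

    def communicate(kind):
        # Runs only while general_lock is held, so it is never concurrent.
        state["inside"] += 1
        state["max_inside"] = max(state["max_inside"], state["inside"])
        state["done"].append(kind)
        state["inside"] -= 1

    threads = []
    for i in range(num_threads):
        if i % 2:
            op, kind = collective_op, "collective"
        else:
            op, kind = non_collective_op, "non-collective"
        threads.append(threading.Thread(
            target=op, args=(partial(communicate, kind),)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state
```

In this sketch, at most one non-collective thread at a time (the one holding `non_collective_lock`) can be waiting for `general_lock`, so a collective thread never competes with more than one non-collective thread for that lock. This is one way to read the "somewhat higher priority" noted above; it is not a strict priority guarantee.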
Yet another aspect provides a mechanism whereby a plurality of threads in a process can be in condition to receive messages contemporaneously while ensuring that each thread correctly receives a message intended for it. In that aspect, the invention provides a communications coordinating arrangement for coordinating message receive operations among user threads in a process, the user threads being configured to communicate using a selected message passing methodology. The communications coordinating arrangement comprises, associated with each thread, a probe control module and a message receive control module. The probe control module iteratively performs a locked message probe operation in which it initially acquires a message probe lock that regulates the locked message probe operation as among threads in the process, and thereafter determines whether a message is available for the respective thread. The message receive control module, if the probe control module determines that a message is available for the respective thread, receives the message.
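The iterative locked probe followed by a receive can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the process-wide `incoming` queue, the helper `deliver`, and the polling interval are hypothetical. The text above does not say whether the receive itself is performed while the probe lock is held; this sketch performs it under the lock for simplicity.

```python
import threading
import time
from collections import deque

probe_lock = threading.Lock()  # message probe lock shared by the threads
incoming = deque()             # hypothetical queue of (destination, payload)


def deliver(dest, payload):
    """Hypothetical stand-in for a message arriving from another process."""
    with probe_lock:
        incoming.append((dest, payload))


def probe(me):
    """Return the index of a message for `me`, or None. Caller holds the lock."""
    for index, (dest, _) in enumerate(incoming):
        if dest == me:
            return index
    return None


def receive(index):
    """Remove and return the payload at `index`. Caller holds the lock."""
    payload = incoming[index][1]
    del incoming[index]
    return payload


def probe_then_receive(me, poll_interval=0.001):
    """Repeat the locked probe until a message for `me` exists, then receive it."""
    while True:
        with probe_lock:
            index = probe(me)
            if index is not None:
                return receive(index)
        time.sleep(poll_interval)  # lock released here: other threads may probe


def demo_probe_receive():
    received = {}

    def worker(name):
        received[name] = probe_then_receive(name)

    threads = [threading.Thread(target=worker, args=(n,)) for n in ("t1", "t2")]
    for t in threads:
        t.start()
    deliver("t2", "for t2")  # messages arrive in the opposite order
    deliver("t1", "for t1")
    for t in threads:
        t.join()
    return received
```

Because the probe lock is released between iterations, no thread blocks inside a receive while holding a lock, and both threads can wait for messages at the same time. Each one receives only the message addressed to it, regardless of arrival order.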