1. Field of the Invention
The present invention, generally, relates to collective operations in data processing systems, and more specifically, to asynchronous collective operations in multimode data processing systems.
2. Background Art
Parallel computer applications often use message passing to communicate between processors. Message passing utilities such as the Message Passing Interface (MPI) support two types of communication: point-to-point and collective. In point-to-point messaging, a processor sends a message to another processor that is ready to receive it. In a collective communication operation, however, many processors participate together in the communication operation.
Collective communication operations play a very important role in high performance computing. In collective communication, data are redistributed cooperatively among a group of processes. Sometimes the redistribution is accompanied by various types of computation on the data and it is the results of the computation that are redistributed. MPI, which is the de facto message passing programming model standard, defines a set of collective communication interfaces, including MPI_BARRIER, MPI_EBCAST, MPI_REDUCE, MPI_ALLREDUCE, MPI_ALLGATHER, MPI_ALLTOALL etc. These are application level interfaces and are more generally referred to as APIs. In MPI, collective communications are carried out on communicators which define the participating processes and a unique communication context.
Functionally, each collective communication is equivalent to a sequence of point-to-point communications, for which MPI defines MPI_SEND, MPI_RECEIVE and MPI_WAIT interfaces (and variants). MPI collective communication operations are implemented with a layered approach in which the collective communication routines handle semantic requirements and translate the collective communication function call into a sequence of SFND/RECV/WAIT operations according to the algorithms used. The point-to-point communication protocol layer guarantees reliable communication.
Collective communication operations can be synchronous or asynchronous. In a synchronous collective operation all processors have to reach the collective before any data movement happens on the network. For example, all processors need to make the collective API or function call before any data movement happens on the network. Synchronous collectives also ensure that all processors are participating in one or more collective operations that can be determined locally. In an asynchronous collective operation, there are no such restrictions and processors can start sending data as soon as the processors reach the collective operation. With asynchronous collective operations, several collectives can be happening simultaneously at the same time.
Asynchronous one-sided collectives that do not involve participation of the intermediate and destination processors are critical for achieving good performance in a number of programming paradigms. For example, in an async one-sided broadcast, the root initiates the broadcast and all destination processors receive the broadcast message without any intermediate nodes forwarding the broadcast message to other nodes.