In a computer system environment, an application program interface (hereinafter an API) is a functional interface supplied by the operating system that allows an application program written in a high-level language to use specific data or functions of the operating system. In some instances an API acts as the interface through which an application program interacts with an access method. For example, in VTAM programs, an API functions as a language structure used in control blocks so that application programs can interface with the control blocks and be identified to VTAM. In addition, in a multi-tasking operating system, application program requests are made to the operating system through the API so that a request automatically starts a task or a process to be completed.
The function of an API is even more important in a parallel processing environment. In parallel processing environments, the computer architecture uses many interconnected processors to access large amounts of data in order to simultaneously process a large number of tasks at high speeds. In such environments, the multi-tasking operating system relies heavily on the API for timely task processing.
To perform the required task processing in a timely manner, many APIs define a set of collective operations for performing complex communications among groups of processes. One such API used in distributed parallel processing is the Message Passing Interface (hereinafter, MPI) standard, which defines and uses many collective operations. Examples of such collective operations include "broadcast", "reduce", "scatter", "gather", "all-to-all", and "barrier". MPI is currently being adopted by many manufacturers of parallel processors.
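The data movement performed by several of the collective operations named above can be illustrated with a minimal sketch. The sketch assumes that the parallel tasks are modeled as positions in an ordinary Python list rather than as real processes, and the function names are hypothetical. It does not use MPI and is not the MPI API.

```python
from functools import reduce as fold
import operator


def broadcast(root_value, n_tasks):
    """Every task ends up with a copy of the root task's value."""
    return [root_value for _ in range(n_tasks)]


def reduce_to_root(per_task_values, op=operator.add):
    """Combine one value from every task into a single result at the root."""
    return fold(op, per_task_values)


def scatter(root_data, n_tasks):
    """Split the root task's data into equal chunks, one chunk per task.

    Assumes len(root_data) is divisible by n_tasks.
    """
    chunk = len(root_data) // n_tasks
    return [root_data[i * chunk:(i + 1) * chunk] for i in range(n_tasks)]


def gather(per_task_chunks):
    """Concatenate one chunk from every task into a single list at the root."""
    return [item for chunk in per_task_chunks for item in chunk]


def all_to_all(send):
    """send[i][j] is what task i sends to task j; result[j][i] is what j receives from i."""
    return [list(column) for column in zip(*send)]
```

In this model, scattering a list and then gathering the resulting chunks returns the original list, and "all-to-all" is a transpose of the send matrix.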
There are several advantages associated with collective operations, among which are ease of use and performance. Collective operations make creating complex communication patterns easy by encapsulating multi-stage communication algorithms in a single subroutine call, and they allow optimization for specific hardware platforms by leaving the choice of implementation of algorithms to the API developer.
There is, however, one major disadvantage with the use of collective operations. In most instances these collective operations are synchronous: they "block" the calling task, and thus its processor, until the operation can be completed. In other words, if a collective operation is invoked by one parallel task, that task must wait until all other tasks have invoked the operation before continuing. Synchronous collective operations therefore force faster tasks to waste time waiting for slower ones, which can create serious performance problems for some applications.
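The blocking behavior described above can be observed in a minimal sketch that assumes Python threads stand in for parallel tasks and `threading.Barrier` stands in for a synchronous collective operation. The `run_tasks` helper and the `work_seconds` parameter are hypothetical names chosen for illustration.

```python
import threading
import time


def run_tasks(work_seconds):
    """Run one thread per entry; each sleeps, then enters a blocking barrier.

    Returns the ordered list of ("arrive" | "leave", rank) events.
    """
    n_tasks = len(work_seconds)
    barrier = threading.Barrier(n_tasks)
    events = []
    lock = threading.Lock()

    def task(rank):
        time.sleep(work_seconds[rank])      # simulated local work
        with lock:
            events.append(("arrive", rank))
        barrier.wait()                      # blocks until all tasks have arrived
        with lock:
            events.append(("leave", rank))

    threads = [threading.Thread(target=task, args=(r,)) for r in range(n_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events
```

Because no task can pass the barrier until every task has reached it, every "arrive" event is recorded before any "leave" event, however quickly the fastest task finishes its own work.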
The solution for applications whose tasks are not inherently synchronized is to use "non-blocking" or asynchronous collective operations. Non-blocking operations allow each task to proceed at its own pace, and can be tested periodically for completion or "waited" for if necessary. However, MPI and other well-known message passing APIs do not define such a set of asynchronous collective operations. The main reason that so many message passing APIs do not define asynchronous collective operations is that many collective operations use multi-stage algorithms in which the output from one stage is the input to the next stage of operation. Therefore, the sends and receives of the next stage (the (N+1)th stage) cannot be posted until the current stage (the Nth stage) has been completed.
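The stage dependency described above can be made concrete with a minimal sketch of one well-known multi-stage pattern, a recursive-doubling sum across all tasks. This pattern is chosen here only for illustration and is not asserted to be the algorithm used by any particular API. The sketch assumes a power-of-two number of tasks modeled as list positions, and the function name is hypothetical.

```python
def allreduce_recursive_doubling(values):
    """Sum one value per task so that every task holds the total.

    Returns (final_values, stage_outputs), where stage_outputs[k] is the
    per-task data after stage k.
    """
    n_tasks = len(values)
    if n_tasks == 0 or n_tasks & (n_tasks - 1):
        raise ValueError("this sketch assumes a power-of-two number of tasks")

    current = list(values)
    stage_outputs = []
    distance = 1
    while distance < n_tasks:
        # Each task exchanges with the partner whose rank differs in one bit.
        # The data sent in this stage is the output of the previous stage, so
        # the exchange cannot be posted before the previous stage completes.
        current = [current[rank] + current[rank ^ distance]
                   for rank in range(n_tasks)]
        stage_outputs.append(current)
        distance *= 2
    return current, stage_outputs
```

With four tasks there are two stages, and the second stage consumes the partial sums produced by the first, which is why all of the sends and receives cannot simply be posted at the start of the operation.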
Therefore, it is highly desirable to design a method of asynchronous distributed collective operations that allows mixing of blocking and non-blocking operations among tasks so as to improve the performance and efficiency of the environment while still maintaining the capability of handling multi-stage operations.