Message processing systems, for example, the multiprocessor data processing system 10 depicted in FIG. 1, require reliable message communication paths between respective ones of the processors 12.sub.1 . . . 12.sub.j. The exemplary system 10 of FIG. 1 employs an exemplary communication medium or switch network 20 commonly coupled to the processors 12. The processors may require respective communication adapters 14.sub.1 . . . 14.sub.j to control communications between each processor 12 and the medium 20 via respective connections 16.sub.1 . . . 16.sub.j. Communication between, for example, software application(s) executing on the processors 12 of system 10 can thus be provided via medium 20. Storage medium 22 may be employed in the system to hold the applications, associated data, etc.
Because respective processors may be supporting different but related application software partitions, messaging must be used as a form of communication between the processors. For example, messages may require transmission from a "source" node (e.g., processor 12.sub.1) to a "destination" node (e.g., processor 12.sub.j).
The asynchronous nature of the application software partitions on the source and destination nodes often results in a condition where the number of messages sent from a source node exceeds the destination node's ability to handle them. Normally, the destination node is expected to post buffers to hold incoming messages. The messages can then be retrieved from the buffers and appropriately processed by the application software. This is illustrated in FIG. 2, which is a hybrid hardware/software diagram of a message processing system like that of FIG. 1 and which depicts a message source node 18.sub.1 and a message destination node 18.sub.j. (The term "node" is used broadly herein to connote any identifiable combination of hardware and/or software to or from which messages are passed.) Source node 18.sub.1 has allocated therein send message buffers 30 within which are placed messages M(1), M(2) and M(3) which, for application reasons, are required to be sent through send message processing 32, across medium 20, to destination node 18.sub.j.
Destination node 18.sub.j, in anticipation of the arrival of messages from various sources in the system, can allocate or post receive buffers 40. In the example of FIG. 2, buffer B1 holds the first arriving message M(1), buffer B2 holds the second arriving message M(2) and buffer B3 holds the third arriving message M(3). Received message processing 42 then removes messages from their buffers and can then pass the messages to receive processing 44 (e.g., the application software partition executing at the destination node).
Those skilled in the art will understand that message ordering in a system can be imposed by using a particular protocol, e.g., messages sent from a particular source to a particular destination may be sequentially identified and the sequential indicia can be transmitted as control information along with the data portions of the messages.
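By way of illustration only, the sequential identification described above might be sketched as follows. The names `Packet`, `Sequencer`, `stamp` and `reorder` are hypothetical and are not part of any described embodiment; the sketch merely assumes one counter per source/destination pair, with the sequence number carried as control information alongside the data portion.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet: control information (source, seq) plus data."""
    source: str
    seq: int
    data: bytes


class Sequencer:
    """Assigns consecutive sequence numbers per (source, destination) pair."""

    def __init__(self) -> None:
        self._next: Dict[Tuple[str, str], int] = {}

    def stamp(self, source: str, destination: str, data: bytes) -> Packet:
        key = (source, destination)
        seq = self._next.get(key, 0)
        self._next[key] = seq + 1
        return Packet(source, seq, data)


def reorder(packets: List[Packet]) -> List[Packet]:
    """Restore sending order at the destination for packets from one source."""
    return sorted(packets, key=lambda p: p.seq)


if __name__ == "__main__":
    s = Sequencer()
    first = s.stamp("18.1", "18.j", b"M(1)")
    second = s.stamp("18.1", "18.j", b"M(2)")
    # Packets arriving out of order are put back in sequence.
    print([p.data for p in reorder([second, first])])
```

A per-pair counter is only one possible convention; the text above requires only that the indicia be sequential and transmitted with the message.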
The process of allocating or posting receive buffers 40 in destination node 18.sub.j is often a dynamic one, and if more messages arrive than there are buffers posted, buffer overrun can occur. Traditional solutions to avoid buffer overrun at the destination node include (1) data buffering with a pre-reservation protocol, or (2) adopting a convention wherein the destination node automatically discards packets, assuming that the source node will retransmit them after a time-out. The first solution assumes a destination node that is frequently unprepared to accommodate data, and the second solution assumes a destination that is rarely unprepared to accommodate data.
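The overrun condition and the discard convention can be illustrated with a minimal sketch. The class `DestinationNode` and its methods are hypothetical names introduced for illustration only; the sketch assumes each posted buffer holds exactly one message and that a message arriving with no free buffer is silently dropped.

```python
from collections import deque
from typing import Deque


class DestinationNode:
    """Hypothetical destination that discards messages on buffer overrun."""

    def __init__(self, posted_buffers: int) -> None:
        self.free = posted_buffers          # buffers posted and still empty
        self.held: Deque[str] = deque()     # messages waiting in buffers
        self.discarded = 0                  # messages dropped on overrun

    def post_buffer(self, count: int = 1) -> None:
        """Post additional receive buffers."""
        self.free += count

    def receive(self, message: str) -> bool:
        """Store the message if a buffer is free; otherwise discard it."""
        if self.free == 0:
            self.discarded += 1
            return False
        self.free -= 1
        self.held.append(message)
        return True

    def process_one(self) -> str:
        """Remove the oldest held message for receive processing."""
        return self.held.popleft()


if __name__ == "__main__":
    node = DestinationNode(posted_buffers=2)
    for m in ["M(1)", "M(2)", "M(3)"]:
        print(m, "accepted" if node.receive(m) else "discarded")
```

With two buffers posted and three messages arriving, the third message is discarded and would have to be retransmitted by the source after its time-out.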
A problem with the first solution occurs when message size is practically unbounded, or if the number of message sources is large. Large messages can be decomposed into smaller pieces and flow controlled into the buffers, if the overhead to do so is manageable. However, many sources present problems with buffer fragmentation or starvation. Distributed fairness protocols can be introduced to solve these problems, but at a price in complexity and additional overhead.
A problem with the time-out/retransmit solution is that, should the destination be unable to accommodate the data for an extended period of time, many needless retransmissions will occur, occupying otherwise useful bandwidth on the medium.
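The bandwidth cost can be made concrete with a small illustrative calculation. The function `transmissions_until_accepted` and its parameters are hypothetical; the sketch assumes a fixed time-out, a first transmission at time zero, and a destination that accepts the message on the first transmission at or after the time its buffer becomes available.

```python
def transmissions_until_accepted(ready_at: float, timeout: float) -> int:
    """Count transmissions of one message under time-out/retransmit.

    The source sends at t = 0, timeout, 2 * timeout, ... and the
    destination accepts the first copy sent at or after `ready_at`.
    """
    if timeout <= 0:
        raise ValueError("timeout must be positive")
    sends = 0
    t = 0.0
    while True:
        sends += 1
        if t >= ready_at:
            return sends
        t += timeout


if __name__ == "__main__":
    total = transmissions_until_accepted(ready_at=10, timeout=3)
    print("total transmissions:", total, "needless:", total - 1)
```

Under these assumptions, every transmission but the last is wasted, and the count grows linearly with the period during which the destination is unprepared.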
A third conventional solution to this problem is a rendezvous protocol. A rendezvous protocol involves the transmission from the source node of a control information packet relating to a message to be sent from the source node to the destination node. The control information may include an indication of the length of the entire data portion of the message to be sent, as well as indicia which identify the message and/or its sequence. When a buffer of adequate length is allocated or posted at the destination node, an acknowledgment packet is sent from the destination node to the source node, and the source node can thereafter reliably send the entire message to the destination node. This technique also makes conservative assumptions about the preparedness of the destination node to accommodate the data portion of the message. In conventional rendezvous protocols, the initial exchange of the control information and acknowledgment packets results in a loss of performance because two packets must be exchanged between the source and destination nodes before any actual message data can be exchanged.
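The conventional rendezvous exchange described above can be sketched as follows. All names (`Control`, `Ack`, `RendezvousDestination`, `rendezvous_send`) are hypothetical and chosen for illustration; the medium is modeled as direct method calls, and the sketch assumes the destination acknowledges only when a posted buffer is at least as long as the announced message.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Control:
    """Control information packet: message identity and data length."""
    msg_id: int
    length: int


@dataclass
class Ack:
    """Acknowledgment that an adequate buffer has been posted."""
    msg_id: int


class RendezvousDestination:
    def __init__(self) -> None:
        self.announced: Dict[int, int] = {}   # msg_id -> announced length
        self.buffers: Dict[int, int] = {}     # msg_id -> posted buffer size
        self.received: Dict[int, bytes] = {}  # msg_id -> delivered data

    def on_control(self, ctl: Control) -> None:
        self.announced[ctl.msg_id] = ctl.length

    def post_buffer(self, msg_id: int, size: int) -> Optional[Ack]:
        """Acknowledge only if the buffer can hold the whole message."""
        needed = self.announced.get(msg_id)
        if needed is None or size < needed:
            return None
        self.buffers[msg_id] = size
        return Ack(msg_id)

    def on_data(self, msg_id: int, data: bytes) -> None:
        if len(data) > self.buffers.get(msg_id, -1):
            raise RuntimeError("data sent without an adequate posted buffer")
        self.received[msg_id] = data


def rendezvous_send(dest: RendezvousDestination, msg_id: int,
                    data: bytes, buffer_size: int) -> List[str]:
    """Run one exchange and return the packets that crossed the medium."""
    log = ["control"]
    dest.on_control(Control(msg_id, len(data)))
    ack = dest.post_buffer(msg_id, buffer_size)
    if ack is None:
        return log                 # no acknowledgment, so no data is sent
    log.append("ack")
    dest.on_data(msg_id, data)
    log.append("data")
    return log


if __name__ == "__main__":
    print(rendezvous_send(RendezvousDestination(), 1, b"hello", 8))
```

The returned log makes the performance cost visible: the control and acknowledgment packets both precede the first packet carrying message data.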
What is required, therefore, is a method, system, and associated program code and data structures, which prevent the performance degradation associated with packet retransmission after time-outs, or with standard rendezvous protocols in which an exchange of packets between source and destination nodes occurs before any actual message data is exchanged.