In a multiprocessor communication system, messages are sent over data links between processors, and certain protocols have been established to specify the procedure for exchanging information between the two ends of a single communication link. While these protocols provide sufficient protection against transmission errors for a single link connection, problems still exist in a multilink environment.
For example, along a communication path made up of a plurality of serially connected links, messages can get lost in places such as the message buffers and queues associated with the intermediate and end processors.
Additionally, when duplicate parallel links are provided for reliability, portions of a sequence of messages associated with a single process may be sent via different paths to the same destination. In many instances, these messages must be acted upon in the same sequence in which they were transmitted; however, due to the multiple paths available, the messages may not arrive in the same order in which they were transmitted and some messages may even be lost.
Thus, a need exists for detecting lost or out-of-sequence messages in a multiprocessor environment.
The problem of lost or out-of-sequence messages has been considered in the past. In one known arrangement, for example, each message is assigned a sequence number by the sending processor. The receiving processor then keeps track of the sequence numbers. It may acknowledge the receipt of each message and, in some instances, request retransmission of any lost or garbled messages.
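The receiver side of such a sequence-number arrangement can be illustrated with a minimal sketch. It is not taken from any particular prior system; the class name `SequenceTracker`, its fields, and the returned labels are hypothetical, and the sketch assumes that sequence numbers start at zero, increase by one per message, and do not wrap around.

```python
from dataclasses import dataclass, field


@dataclass
class SequenceTracker:
    """Hypothetical receiver-side bookkeeping for numbered messages."""
    expected: int = 0                            # next sequence number anticipated
    missing: list = field(default_factory=list)  # numbers skipped and not yet seen

    def receive(self, seq: int) -> str:
        """Classify one arriving sequence number."""
        if seq == self.expected:
            self.expected += 1
            return "in_order"
        if seq > self.expected:
            # A gap: every number from expected up to seq - 1 is outstanding.
            self.missing.extend(range(self.expected, seq))
            self.expected = seq + 1
            return "gap"
        if seq in self.missing:
            # An earlier number arriving late, i.e. out of sequence.
            self.missing.remove(seq)
            return "late"
        return "duplicate"


tracker = SequenceTracker()
results = [tracker.receive(s) for s in [0, 1, 3, 2, 5, 5]]
```

A number that remains in `missing` may be either lost or merely delayed on another path; the tracker alone cannot tell the two cases apart.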
While the above arrangement is suitable for its intended purpose, whenever the retransmission of lost messages is provided, additional storage is required, since the sending processor must keep a record of every message sent until that message is acknowledged by the receiver. Also, in a large multiprocessor system, the average message interarrival time between two end processors may be lengthy during busy periods. The receiving processor may erroneously interpret a long interval between messages as an indication of a lost message and prematurely ask for retransmission, thus unnecessarily increasing the message flow between processors.
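The storage cost noted above can be illustrated with a minimal sketch of the sender side. The class `RetransmitBuffer` and its methods are hypothetical names chosen for illustration, and the sketch assumes per-message acknowledgments; it shows only that every unacknowledged message must remain in memory.

```python
class RetransmitBuffer:
    """Hypothetical sender-side record of messages awaiting acknowledgment."""

    def __init__(self):
        self.next_seq = 0
        self.unacked = {}  # sequence number -> payload retained for retransmission

    def send(self, payload: bytes) -> int:
        """Assign a sequence number and retain a copy of the payload."""
        seq = self.next_seq
        self.unacked[seq] = payload
        self.next_seq += 1
        return seq

    def acknowledge(self, seq: int) -> None:
        """Release the stored copy once the receiver confirms receipt."""
        self.unacked.pop(seq, None)

    def retransmit(self, seq: int) -> bytes:
        """Return the stored payload; raises KeyError if already acknowledged."""
        return self.unacked[seq]


buf = RetransmitBuffer()
for payload in (b"a", b"b", b"c"):
    buf.send(payload)
buf.acknowledge(1)
pending = len(buf.unacked)  # messages still held in storage
```

The size of `unacked` grows with the number of messages in flight, so slow acknowledgments during busy periods translate directly into more storage at the sender.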