The present invention is generally directed to systems and methods for transferring messages from one autonomous data processing unit (node) to another such unit across a network. More particularly, the present invention is directed to systems and methods for transferring messages through a switched network efficiently and reliably, without creating extraneous message copies, and in a manner that effectively handles bad paths and the problems associated with message packet ordering and synchronization. Even more particularly, the present invention is directed to a communications adapter that is provided between an autonomous data processing unit and a switched network. In accordance with another aspect of the present invention, a system and method are provided in which various hardware tasks associated with a specific channel are provided with a mechanism for communicating with one another in a direct memory-to-memory fashion. In yet another aspect of the present invention, the communications adapters are provided with mechanisms for time-of-day synchronization and with related mechanisms that establish backup and master/slave relationships amongst a plurality of adapters, so that designated backup adapter units are permitted to take over the communications operations of a failed adapter unit.
It is desirable first to place the present invention in its proper context and to indicate that it is not directed to the transfer of information within a single data processing unit, which can be likened to talking to someone in the same room. Instead, the present invention is directed to the transfer of information in the form of messages or message packets through a switched network having a plurality of possible information flow paths, which can be likened to a lengthy conversation between individuals on different continents.
When information is transmitted through a switched network in the form of message packets, many problems can arise. First, it is possible that one of many message packets fails to arrive. Or, if it does arrive, an “acknowledgment of receipt” message may not make its way back to the sender; a return signal acknowledging receipt is therefore a very desirable part of the message passing protocol. Secondly, even if the message packet does arrive, it may not arrive in the desired sequence with respect to other related packets. Thirdly, there are typically many paths that a message packet may take through a switched network, and the reliability of these paths is subject to change over time. Accordingly, systems for message packet transfer should take bad paths into account by identifying and tracking them as they arise.
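By way of illustration only, the three problems noted above (lost packets or acknowledgments, out-of-order arrival, and paths that become unreliable) can be sketched in software as follows. This is a minimal sketch and not the mechanism of the present invention; the names `Receiver`, `Sender`, `bad_threshold`, `on_timeout`, and the path labels are hypothetical and are introduced solely for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Delivers packets in sequence order, buffering any that arrive early."""
    next_seq: int = 0
    pending: dict = field(default_factory=dict)
    delivered: list = field(default_factory=list)

    def receive(self, seq, payload):
        # A packet older than next_seq is a duplicate and is not redelivered.
        if seq >= self.next_seq:
            self.pending[seq] = payload
        # Release every packet that is now contiguous with what was delivered.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return seq  # acknowledgment of receipt returned to the sender


@dataclass
class Sender:
    """Tracks unacknowledged packets and counts failures per path."""
    paths: list
    bad_threshold: int = 2  # hypothetical: failures before a path is "bad"
    failures: dict = field(default_factory=dict)
    unacked: dict = field(default_factory=dict)  # seq -> path last used

    def good_paths(self):
        return [p for p in self.paths
                if self.failures.get(p, 0) < self.bad_threshold]

    def choose_path(self):
        # Prefer paths not marked bad; fall back to all paths if none remain.
        candidates = self.good_paths() or self.paths
        return min(candidates, key=lambda p: self.failures.get(p, 0))

    def send(self, seq):
        path = self.choose_path()
        self.unacked[seq] = path
        return path

    def on_ack(self, seq):
        self.unacked.pop(seq, None)

    def on_timeout(self, seq):
        # No acknowledgment arrived: charge the path used and retransmit.
        path = self.unacked.get(seq)
        if path is None:
            return None  # already acknowledged, nothing to resend
        self.failures[path] = self.failures.get(path, 0) + 1
        return self.send(seq)
```

In this sketch a missing acknowledgment is treated as evidence against the path that carried the packet, so that a retransmission tends to take a different path, while the receiver holds early packets until the gap in the sequence is filled.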
One of the very desirable attributes of a message passing system is the ability of the various hardware tasks associated with a specific channel to communicate with one another. A specific problem arises in message passing systems, such as those employing communication adapters, when several tasks are associated with a specific channel and one of these tasks is copying a key control block from external memory into local memory. In this circumstance, the other tasks need to be told to wait until this control block has reached the local memory.
One way of solving this problem is to create a semaphore for every potential action for every channel that is supported by the adapter. When a task wants to perform an action for a specific channel, it locks the corresponding semaphore, blocking all other tasks from performing that action on that channel. When the action has completed, the task can then leave a specific indicator (an “encode”) in the semaphore, indicating to all other interested tasks that this particular action has completed. There are, however, several problems with this approach. For example, an adapter may support thousands of channels, or may have a large number of actions that it wants to perform on each channel (such as copying a key control block into local memory), so that the number of semaphores required becomes large. In this regard it is also noted that locking and unlocking semaphores is usually a slow process because of the communication, coordination, and overhead required.
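The semaphore approach described above can be sketched in software as follows, purely for illustration. The class names `ActionSemaphore` and `SemaphoreTable`, the `"DONE"` encode value, and the action names are hypothetical; an adapter would implement the equivalent in hardware, and this sketch is offered only to make the cost of the approach visible.

```python
import threading


class ActionSemaphore:
    """One semaphore guarding one action on one channel."""

    def __init__(self):
        self.lock = threading.Lock()
        self.encode = None  # indicator left behind when the action completes


class SemaphoreTable:
    """Holds one semaphore per (channel, action) pair."""

    def __init__(self, num_channels, actions):
        # The table size is num_channels * len(actions), which is the
        # scaling problem noted in the text.
        self.table = {
            (channel, action): ActionSemaphore()
            for channel in range(num_channels)
            for action in actions
        }

    def perform(self, channel, action, work):
        """Run work() once for this channel and action; others wait, then skip."""
        sem = self.table[(channel, action)]
        with sem.lock:  # blocks every other task wanting this action
            if sem.encode == "DONE":
                return False  # another task already completed the action
            work()
            sem.encode = "DONE"  # tell later tasks the action has completed
            return True


if __name__ == "__main__":
    copies = []
    table = SemaphoreTable(1000, ["copy_control_block", "other_action"])
    tasks = [
        threading.Thread(
            target=table.perform,
            args=(7, "copy_control_block", lambda: copies.append("copied")),
        )
        for _ in range(4)
    ]
    for t in tasks:
        t.start()
    for t in tasks:
        t.join()
    print(len(table.table), len(copies))
```

Even in this small sketch, a thousand channels with only two actions each require two thousand semaphores, and every task must acquire a lock merely to discover that the control block has already been copied.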