The present invention is directed to a system for supporting packet transmission in processor-based networks that use serial interconnects. In particular, the system of the invention provides dynamic ordering support for packets in response to errors in such networks, especially in ringlet topologies. It accommodates both busy conditions at nodes on the network and nodes that fail to respond, or that respond with busy acknowledgments for long periods of time, due either to overload or to node failure.
Serial interconnects in computer systems are subject to a number of different types of service interruptions. For example, when a node on a network encounters a CRC (cyclic redundancy check) error in a packet, that packet cannot be accepted. The node that sent the packet learns of the error, generally indirectly (such as by a timeout or through the use of idle packets), and eventually must resend the packet.
Resending the packet may not be a simple matter, especially if the network implements an ordering scheme, such as relaxed memory ordering (RMO), strong sequential ordering (SSO), or orderings of other or intermediate stringency. In a packet-switched network with such an ordering scheme, packets preceding and following a packet giving rise to a CRC error may need to be resent by the producer node to the target node.
A particular problem arises when such a packet contains a nonidempotent command, i.e. a command which, once it is executed at the target node, changes the state of that node such that reexecution of the command at that node would yield different results from the first execution. In this case, if the command is resent to the node and executed again, undesired or unforeseen results are likely to take place.
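The distinction can be illustrated with a minimal sketch. The `TargetNode` class and its `write` and `increment` commands are hypothetical and are used only to contrast an idempotent command with a nonidempotent one; they are not taken from the system described here.

```python
class TargetNode:
    """Hypothetical target node whose entire state is one counter."""

    def __init__(self):
        self.counter = 0

    def write(self, value):
        # Idempotent: executing this twice leaves the same state as once.
        self.counter = value
        return self.counter

    def increment(self, amount):
        # Nonidempotent: every execution changes the state again.
        self.counter += amount
        return self.counter


node = TargetNode()

# A resent write is harmless: the state is the same after the retry.
node.write(5)
node.write(5)
after_write_retry = node.counter

# A resent increment is applied a second time, which corrupts the state.
node.increment(1)
node.increment(1)
after_increment_retry = node.counter
```

After the resent write the counter still holds 5, whereas the resent increment leaves it at 7 rather than the intended 6.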
Thus, a system is needed wherein errors in packets can be accommodated by resending the packets to the target node, while maintaining support for idempotent commands. In particular, such a system is needed that also supports various levels of ordering schemes.
A particular need is present for such a system that can accommodate ill-behaved nodes, i.e. nodes that fail or take unacceptably long to reply. In a topology where a single receiver is poorly behaved, this node can impede forward progress of the new packets by forcing the repeated sending of the same busy packet to complete before new packets are introduced. Thus, a system is needed that deals with this potential pitfall.
A system is presented which provides for resending packets that have resulted in CRC errors while at the same time dealing with busy acknowledgments, by maintaining at each receive node a state of all known good packets. When packets need to be resent, the local nodes know whether they have processed the resent packets before and know which packet was the last known good packet, and in this way are able to avoid reprocessing already executed commands, including nonidempotent commands. A busy loop can be effectively suspended while the error loop is executed, and once a reordering of the packets takes place, the busy loop can be completed. In addition, the system accommodates an ill-behaved node, e.g. a nonresponding node that does not send any response back to the producer node. Such an ill-behaved node can be effectively neutralized by a queued retry policy, wherein busy packets are accumulated in a retry packet queue until some predetermined threshold is reached, after which the continually busy node is effectively removed from the system by the producer node ending its attempts to retry the packets to that node. In this way, the system can proceed with other outstanding packets.
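The two mechanisms summarized above can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation of the invention: the sequence numbers, the `ReceiveNode` and `Producer` classes, the acknowledgment strings, and the `retry_limit` parameter are all hypothetical names introduced for the example, and the receiver is assumed to enforce strict in-order acceptance.

```python
from collections import deque


class ReceiveNode:
    """Hypothetical receive node that remembers its last known good packet."""

    def __init__(self):
        # Assumption: packets carry consecutive sequence numbers from 0.
        self.last_good_seq = -1
        self.executed = []

    def receive(self, seq, command, crc_ok=True):
        if not crc_ok:
            return "error"  # rejected; the producer must resend
        if seq <= self.last_good_seq:
            return "done"   # already executed; acknowledge, do not re-execute
        if seq != self.last_good_seq + 1:
            return "error"  # arrived ahead of a missing packet; preserve order
        self.executed.append(command)
        self.last_good_seq = seq
        return "done"


class Producer:
    """Hypothetical producer with a queued retry policy for busy nodes."""

    def __init__(self, retry_limit):
        self.retry_limit = retry_limit  # the predetermined threshold
        self.retry_queue = deque()      # (node_id, packet) pairs awaiting retry
        self.removed = set()            # nodes the producer has given up on

    def handle_busy(self, node_id, packet):
        """Queue a busy packet; return False once the node is given up on."""
        if node_id in self.removed:
            return False
        self.retry_queue.append((node_id, packet))
        pending = sum(1 for n, _ in self.retry_queue if n == node_id)
        if pending >= self.retry_limit:
            # Threshold reached: stop retrying this node and drop its packets
            # so that other outstanding packets can make progress.
            self.removed.add(node_id)
            self.retry_queue = deque(
                (n, p) for n, p in self.retry_queue if n != node_id
            )
            return False
        return True


# Error retry: packet 1 arrives corrupted, so the producer resends 0 and 1.
node = ReceiveNode()
node.receive(0, "inc")
node.receive(1, "inc", crc_ok=False)
node.receive(0, "inc")  # recognized as already processed and skipped
node.receive(1, "inc")  # executed for the first time
```

In this sketch each command is executed exactly once at the receive node even though packet 0 was delivered twice, which is the property that makes resending safe for nonidempotent commands. On the producer side, a node that keeps answering busy is dropped once `retry_limit` of its packets have accumulated in the retry queue.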
This application thus relates to a complex system achieving the resolution of three simultaneously occurring problems on a ringlet network: error conditions, busy conditions at one or more nodes, and the failure or overloading of a node. A suitable error retry mechanism is described both herein and in applicant's copending patent application filed Jul. 1, 1996, entitled "System for Dynamic Ordering Support in a Ringlet Serial Interconnect" by van Loo et al. A system based upon such an error retry mechanism and at the same time able to handle busy retry operations is described in applicant's copending patent application filed Jul. 1, 1996, entitled "System for Preserving Sequential Ordering and Supporting Idempotent Commands in a Ring Network with Busy Nodes" by van Loo et al. Each of these patent applications is incorporated herein by reference.