Electronic systems typically rely on buses to transfer data between components. A bus is a signal route to which system components are connected in parallel so that signals can be passed between them. Although buses are relatively convenient from an implementation standpoint, the bus paradigm has a number of drawbacks. First, because a bus connects multiple components in parallel, considerable time must be spent arbitrating among the components that wish to access the bus. Second, traditional bus systems typically do not allow a user to add a component to, or remove a component from, the system while the system is operating, because all of the components on the bus are electrically connected to each other in parallel.
A recent industry trend has been to move away from the traditional bus method of intra-system communication and interconnection. Fabric-based interconnects have begun to replace the traditional bus system. In a fabric-based interconnect, components communicate through a packet-switched network (fabric) of dedicated point-to-point connections, rather than through a shared bus. Advantages of this approach are that it obviates the need for bus arbitration protocols, which are costly in terms of performance, and that it makes it possible to “hot-swap” components (i.e., connect or disconnect components while the system is operating).
INFINIBAND® and RAPIDIO™ are two examples of industry-standard fabric-based interconnects. INFINIBAND® is designed primarily to replace backplane buses, such as PCI (Peripheral Component Interconnect) buses, which connect computer systems to external peripherals such as disk drives or other storage devices (a network of this kind is generally referred to as a system area network or, if used for storage, a storage area network, and abbreviated as SAN). RAPIDIO™, on the other hand, is intended for use as an “on-board” or “in-box” interconnect for connecting integrated circuits (such as microprocessors) or other closely-related system components, so as to replace system buses and other intermediate-level interconnects.
The RAPIDIO™ standard is a three-layer protocol (compared with the seven-layer OSI [open systems interconnection] model for networking). The layers of the RAPIDIO™ model, from bottom to top, consist of a physical layer, a transport layer, and a logical layer. The logical layer provides an interface with higher-level processes, including system- and user-level software, where applicable. The transport layer handles the task of routing packets from a source to a destination. The physical layer has the ultimate responsibility of moving packets between physical devices. In order to achieve a high level of transparency to higher-level processes, RAPIDIO™ utilizes a “reliable” physical layer protocol. In other words, the RAPIDIO™ physical layer is responsible for ensuring that packets are received at their destination without error.
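The idea of a “reliable” physical layer can be illustrated with a minimal sketch. It is not the RAPIDIO™ wire format: the function names (frame, accept, send_reliably), the use of CRC-32 from Python's zlib module as the error check, and the link callable standing in for the physical connection are all assumptions made for illustration. The sketch shows only the general pattern, in which the link layer detects a damaged packet and retransmits it so that higher layers never see the error.

```python
import zlib

def frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 so the receiver can detect corruption (illustrative check)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def accept(framed: bytes):
    """Return the payload if the checksum matches, else None (packet not accepted)."""
    payload, received = framed[:-4], framed[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") == received:
        return payload
    return None

def send_reliably(payload: bytes, link, max_tries: int = 3):
    """Retransmit over `link` until the receiver accepts the packet or tries run out.

    `link` is a hypothetical callable that models the physical connection: it
    takes the framed bytes and returns what the receiver sees.
    Returns (accepted_payload_or_None, number_of_attempts_made).
    """
    for attempt in range(1, max_tries + 1):
        result = accept(link(frame(payload)))
        if result is not None:
            return result, attempt
    return None, max_tries
```

With a link that corrupts only the first transmission, send_reliably delivers the payload on the second attempt. With a link that corrupts every transmission, it gives up after max_tries, which is the situation the following paragraph is concerned with.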
One of the peculiarities of the RAPIDIO™ standard is that when an error occurs at the physical layer and a packet is not accepted by the receiver, the transmitter retries both the unaccepted packet and all packets of equal or lesser priority transmitted after the unaccepted packet. This can cause a problem, because a packet is sometimes rejected repeatedly, due either to corruption of the packet itself or to an unexpected change in operating conditions (e.g., if a RAPIDIO™ device starts rejecting specific classes of packets based on a configuration bit). In these instances, the rejected data packet is perpetually rejected, and all subsequent packets of equal or lesser priority are held up by a potentially infinite loop of packet retries. Conventionally, a higher-level process must detect the problem through the expiration of a timeout period and then attempt to correct it through software. This process can have a devastating effect on system performance, because entire classes of packets are stalled within the system until the timeout expires.
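The stall described above can be reproduced with a small simulation. This is a simplified model rather than the RAPIDIO™ protocol itself: the Packet class, the poisoned flag (marking a packet the receiver will never accept), the convention that a larger number means higher priority, and the max_attempts cutoff standing in for the higher-level timeout are all assumptions introduced for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    seq: int                # transmission order
    priority: int           # assumption: larger value = higher priority
    poisoned: bool = False  # hypothetical flag: receiver always rejects this packet

def simulate(packets, max_attempts=5):
    """Model the retry rule: a rejected packet and every later packet of
    equal or lesser priority must be retried; higher-priority packets pass.

    max_attempts stands in for the higher-level timeout.
    Returns (delivered, stalled) as lists of packets.
    """
    delivered = []
    pending = list(packets)
    attempts = 0
    while pending and attempts < max_attempts:
        attempts += 1
        blocked = None  # highest priority rejected so far in this pass
        retry = []
        for pkt in pending:
            held = blocked is not None and pkt.priority <= blocked
            if held or pkt.poisoned:
                retry.append(pkt)  # queued behind the rejected packet
                if pkt.poisoned:
                    blocked = pkt.priority if blocked is None else max(blocked, pkt.priority)
            else:
                delivered.append(pkt)
        pending = retry
    return delivered, pending

traffic = [
    Packet(0, 1),
    Packet(1, 1, poisoned=True),  # perpetually rejected
    Packet(2, 1),                 # equal priority: stalled
    Packet(3, 0),                 # lesser priority: stalled
    Packet(4, 2),                 # higher priority: passes
]
delivered, stalled = simulate(traffic)
print([p.seq for p in delivered], [p.seq for p in stalled])
```

In this model, packets 0 and 4 are delivered, while packets 1, 2, and 3 remain stalled no matter how many attempts are allowed. Only the cutoff ends the loop, which mirrors the reliance on a higher-level timeout.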
What is needed, therefore, is a method of detecting these potentially corrupted packets at the physical layer, so as to reduce the inefficiency associated with relying on a timeout at a higher logical layer to initiate error recovery. The present invention provides a solution to these and other problems, and offers other advantages over previous solutions.