The evolution and popularity of computing devices and networking place an ever-increasing burden on data servers, application processors, and enterprise computers to reliably move greater amounts of data between processing nodes as well as between a processor node and input/output (I/O) devices. These trends require higher bandwidth and lower latencies across data paths and place a greater functional burden on I/O devices, while simultaneously demanding increased data protection, higher isolation, deterministic behavior, and a higher quality of service than has been available until recently.
The InfiniBand™ architecture specification describes a first-order interconnect technology for interconnecting processor nodes and I/O nodes in a system-area network. The architecture is independent of the host operating system and processor platform. The InfiniBand™ architecture (IBA) is designed around a point-to-point switched I/O fabric, where end-node devices, which can range from inexpensive I/O devices such as single integrated-circuit small-computer-system interface (SCSI) or Ethernet adapters to complex host computers, are interconnected by cascaded switch devices. The IBA defines a switched communications fabric allowing multiple devices to concurrently communicate with high bandwidth and low latency in a protected and remotely managed environment. The physical properties of the IBA interconnect support module-to-module connectivity, as typified by computer systems that support I/O module slots, as well as chassis-to-chassis connectivity, as typified by interconnecting computers, external data storage systems, and local-area network (LAN) and wide-area network (WAN) access devices such as switches, hubs, and routers in a data center environment.
The IBA switched fabric provides a reliable transport mechanism where messages are queued for delivery between end nodes. Message content is left to the designers of end-node devices. The IBA defines hardware-transport protocols sufficient to support both reliable messaging (e.g., send/receive) and memory-manipulation semantics (e.g., remote direct memory access (RDMA)) without software intervention in the data movement path. The IBA defines protection and error-detection mechanisms that permit IBA-based transactions to originate from and terminate in either privileged kernel mode, to support legacy I/O and communication needs, or user space, to support emerging interprocess communication demands.
Concerning error-detection and recovery mechanisms, the IBA requires implementation of two port-level counters for reporting packet-switching errors, as well as a port-state-change error that initiates a switch interrupt. Each counter receives numerous separate error-signal inputs that the IBA specification treats as a single error. This error-reporting methodology lacks the resolution to identify which condition in the switch actually caused a port-error counter to increment, and it does not provide a mechanism for communication path management.
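The loss of resolution described above can be illustrated with a minimal sketch. It is not taken from the IBA specification: the signal names (`ROUTING_FAULT`, `LENGTH_VIOLATION`, `BUFFER_OVERRUN`), the class `AggregateErrorCounter`, and the helper `run` are all hypothetical, chosen only to show how several distinct error-signal inputs collapse into one counter value.

```python
from enum import Enum, auto


class ErrorSignal(Enum):
    # Hypothetical error-signal inputs; the source does not enumerate
    # the actual signals that feed a port-level counter.
    ROUTING_FAULT = auto()
    LENGTH_VIOLATION = auto()
    BUFFER_OVERRUN = auto()


class AggregateErrorCounter:
    """Models one port-level counter fed by many separate error signals."""

    def __init__(self) -> None:
        self.value = 0

    def report(self, signal: ErrorSignal) -> None:
        # The identity of the signal is discarded; every input is
        # counted as the same generic error.
        self.value += 1


def run(signals) -> int:
    """Feed a sequence of error signals to a fresh counter and read it back."""
    counter = AggregateErrorCounter()
    for signal in signals:
        counter.report(signal)
    return counter.value


# Two different fault histories on the same (hypothetical) port.
trace_a = [ErrorSignal.ROUTING_FAULT] * 3
trace_b = [
    ErrorSignal.LENGTH_VIOLATION,
    ErrorSignal.BUFFER_OVERRUN,
    ErrorSignal.BUFFER_OVERRUN,
]

# Both histories leave the counter at the same value, so a reader of the
# counter alone cannot tell which condition caused the increments.
print(run(trace_a), run(trace_b))
```

Under these assumptions, management software that reads only the counter sees identical values for both traces, which is the ambiguity the paragraph above describes.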
Consequently, there is a need for solutions that address these and/or other shortcomings of the prior art, while providing a manufacturable working device compliant with the IBA error reporting standard.