High Performance Computing (HPC) systems generally employ relatively large numbers of processing nodes that are interconnected by a network fabric of switches. Some HPC systems may include hundreds of thousands of processors and tens of thousands of switches, or more, in a network. The networks may be arranged in any of a wide variety of known topologies including, for example, Dragonfly, Fat-Tree, Flattened Butterfly, and Torus topologies, which may be configured in two, three, or more dimensions. Data may be transmitted between processors, for example in packets, which are routed through the switches by dynamically connecting one of a number of switch input ports to one of a number of switch output ports, for example based on an address associated with the packet. Different networks may be configured to support different routing and addressing modes.
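As a purely illustrative sketch of the address-based port selection described above, the following Python fragment models a switch that maps a packet's destination address to an output port through a lookup table. The names `Packet`, `Switch`, `add_route`, and `forward`, as well as the flat integer addressing, are assumptions introduced for this example only and do not represent any particular switch design or addressing mode.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Packet:
    dest: int              # hypothetical flat destination address
    payload: bytes = b""


@dataclass
class Switch:
    num_ports: int
    # Hypothetical routing table: destination address -> output port index.
    routes: Dict[int, int] = field(default_factory=dict)

    def add_route(self, dest: int, out_port: int) -> None:
        """Record which output port leads toward a destination address."""
        if not 0 <= out_port < self.num_ports:
            raise ValueError(f"output port {out_port} out of range")
        self.routes[dest] = out_port

    def forward(self, in_port: int, packet: Packet) -> int:
        """Return the output port to connect to in_port for this packet."""
        if not 0 <= in_port < self.num_ports:
            raise ValueError(f"input port {in_port} out of range")
        try:
            return self.routes[packet.dest]
        except KeyError:
            # No table entry for this address: the packet cannot be routed.
            raise LookupError(f"no route for destination {packet.dest}") from None


if __name__ == "__main__":
    sw = Switch(num_ports=4)
    sw.add_route(dest=7, out_port=2)
    # A packet arriving on input port 0 and addressed to node 7
    # is connected to output port 2.
    print(sw.forward(0, Packet(dest=7)))
```

A single static table of this kind is tied to one addressing scheme and has no notion of link health, which is one way to picture the limitations discussed next.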
Existing switches are typically designed to target a specific network topology or a small subset of topologies. These switches may not be well suited to handle other types of network configurations and addressing modes. Additionally, existing switches generally lack the capacity to deal efficiently with network faults (e.g., failed links and/or switches). As a result, network management software must overcompensate for these faults by disabling relatively large portions of the network to prevent a packet from taking a path that leads to a dead end.
Although the following Detailed Description proceeds with reference to illustrative embodiments, many alternatives, modifications, and variations thereof will be apparent to those skilled in the art.