High performance computing systems, such as server systems or server racks, are typically implemented using processing nodes connected together by one or more fabric interconnects. These processing nodes are often implemented as field replaceable units (FRUs), which typically take the form of a card, a blade, or another pluggable form that may be quickly swapped in and out of corresponding sockets of the interconnect. This arrangement makes the FRUs easy to replace and upgrade and facilitates adaptation and scaling of the high performance computing system.
In order to maintain a high level of flexibility, the FRUs can share a common configuration, such that an FRU can be installed into any socket at any location in the interconnect and can operate there without having to be specially configured for that socket. However, many network topologies, such as ring networks and torus networks, do not have any fixed absolute reference point. Because the FRUs can be installed in any socket in the interconnect, it is therefore often difficult to identify the position of a particular FRU within the network, or its physical location in the interconnect. It may be desirable to determine an absolute reference point of an FRU within a ring in order to determine a physical location of the FRU in a system that implements a ring network or a torus network. For example, it may be desirable to provide an indication of a location for a particular FRU, such as an FRU that has experienced a fault condition, in order to quickly identify the FRU for replacement or service, to replace a disk drive on the FRU, or to change a cabled connector to the FRU.
One conventional approach for signaling the position of an FRU within the network includes providing dedicated position identification pins in the sockets and corresponding traces in the hardwiring of the interconnect. However, this approach becomes unduly complex and expensive to implement as the number of FRUs increases. For example, a server of 64 processing nodes would require 6 dedicated position identification pins when using a binary encoding scheme (2^6 = 64) and a corresponding number of traces in the interconnect to encode the particular position of each socket. Another conventional approach includes providing logic and a storage element at each socket that stores a corresponding position identifier, such that when an FRU is inserted into a given socket, the logic at the socket provides the corresponding position identifier to the FRU. Such logic at the socket is sometimes implemented with a centralized controller that has individual point-to-point connectivity to each socket and is able to provide a dedicated position identifier to each FRU. However, the need to implement logic and a storage element at each socket of the interconnect increases the cost and complexity of the interconnect.
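The pin-count arithmetic in the dedicated-pin approach can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: sockets are assumed to be numbered from zero, the encoding is assumed to be plain binary, and the function names `position_pins` and `socket_pin_pattern` are hypothetical and do not come from any particular system.

```python
def position_pins(num_sockets: int) -> int:
    """Minimum number of binary-encoded position pins per socket.

    Assumes sockets are numbered 0 .. num_sockets - 1 (an assumption
    made for this sketch, not a requirement stated in the text).
    """
    if num_sockets < 1:
        raise ValueError("num_sockets must be at least 1")
    # The bit length of the largest socket index equals ceil(log2(num_sockets)).
    return (num_sockets - 1).bit_length()


def socket_pin_pattern(socket_index: int, num_sockets: int) -> str:
    """Hypothetical pin levels ('0' or '1') hardwired at a given socket."""
    if not 0 <= socket_index < num_sockets:
        raise ValueError("socket_index out of range")
    width = position_pins(num_sockets)
    # Binary digits of the index, left-padded with zeros to the pin count.
    return bin(socket_index)[2:].zfill(width)


print(position_pins(64))          # 6
print(socket_pin_pattern(5, 64))  # 000101
```

Adding a single socket beyond a power of two, for example 65 sockets instead of 64, raises the per-socket pin count from 6 to 7, and each additional pin implies corresponding traces in the interconnect.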
The use of the same reference symbols in different drawings indicates similar or identical items.