The use of a large number of multi-core processors combined with centralization techniques continues to increase in popularity for applications that feature computationally intensive tasks. For example, systems implemented with a large number of compute nodes disposed in proximity to each other, and coupled via high-speed interconnects, are particularly well suited for applications such as quantum mechanics, weather forecasting, climate research, oil and gas exploration, and molecular modeling, just to name a few. These multi-node systems may provide processing capacity many orders of magnitude greater than that of a single computer. This gap continues to widen each year. For example, some multi-node systems have processing capacity, generally rated in floating-point operations per second (FLOPS), in the petaflops range.
This pursuit of increased performance has led to approaches including massively parallel systems featuring a large number of compute nodes, with each node providing one or more processors, memory, and an interface circuit connecting the node to a multi-node network. The processing capacity of a given multi-node network can be scaled by adding nodes. However, as multi-node systems approach exascale, or a billion billion (10^18) calculations per second, the complexity of addressing large numbers of nodes raises numerous non-trivial challenges.
These and other features of the present embodiments will be better understood by reading the following detailed description, taken together with the figures described herein. The accompanying drawings are not intended to be drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in various figures is represented by a like numeral. For purposes of clarity, not every component may be labeled in every drawing.