As high-performance computing systems grow in size, the probability of events that require network reconfiguration increases. Reconfiguring an interconnection network, such as an InfiniBand (IB) network, often requires computing and distributing a new set of routes in order to maintain connectivity and sustain performance. This is the general area that embodiments of the invention are intended to address.
In large systems, the probability of component failure is high. At the same time, as the number of network components grows, reconfiguration is often needed to ensure high utilization of the available communication resources. For example, exascale computing imposes unprecedented technical challenges, including the realization of massive parallelism, resiliency, and sustained performance. Efficient management of such large systems is challenging and requires frequent reconfiguration.