Field
The present application concerns the cabling of complex computer systems such as clusters and more particularly a method and device for managing cabling in a cluster.
Description of the Related Art
High performance computing (HPC) is being developed for university research and industry alike, in particular in technical fields such as aeronautics, energy, climatology and life sciences. Modeling and simulation make it possible in particular to reduce development costs and to accelerate the placing on the market of innovative products that are more reliable and consume less energy. For researchers, high performance computing has become an indispensable means of investigation.
This computing is generally conducted on data processing systems called clusters. A cluster typically comprises a set of interconnected nodes. Certain nodes are used to perform computing tasks (compute nodes), others to store data (storage nodes) and one or more others manage the cluster (administration nodes). Each node is for example a server implementing an operating system such as LINUX (LINUX is a trademark). The connection between the nodes is, for example, made using ETHERNET or INFINIBAND communication links (ETHERNET and INFINIBAND are trademarks). Each node generally comprises one or more microprocessors, local memories and a communication interface.
FIG. 1 is a diagrammatic illustration of an example topology 100 for a cluster of the fat-tree type. The cluster comprises a set of nodes of general reference 105. The nodes belonging to the set 110 are compute nodes here, whereas the nodes of the set 115 are service nodes (storage nodes and administration nodes). The compute nodes may be grouped together in subsets 120 called compute islands, the set 115 being called a service island.
The nodes are linked together by switches, for example hierarchically. In the example illustrated in FIG. 1, the nodes are connected to first level switches 125 which are themselves linked to second level switches 130 which in turn are linked to third level switches 135.
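By way of illustration only, the hierarchical arrangement described above may be sketched as a list of links between components. The function name `build_hierarchy`, its parameters and the component names (`node-…`, `sw1-…`, `sw2-…`, `sw3-…`) are hypothetical and do not come from FIG. 1. The sketch also assumes, for simplicity, that each component has a single uplink, whereas a real fat-tree typically provides several uplinks per switch.

```python
def build_hierarchy(n_l3, l2_per_l3, l1_per_l2, nodes_per_l1):
    """Return (lower, upper) links for a simplified three-level switch tree.

    Assumption: every component has exactly one uplink. This is a
    simplification and not a full fat-tree.
    """
    links = []
    for k in range(n_l3):
        l3 = f"sw3-{k}"  # third level switch
        for j in range(l2_per_l3):
            l2 = f"sw2-{k}.{j}"  # second level switch
            links.append((l2, l3))
            for i in range(l1_per_l2):
                l1 = f"sw1-{k}.{j}.{i}"  # first level switch
                links.append((l1, l2))
                for n in range(nodes_per_l1):
                    # each node is connected to a first level switch
                    links.append((f"node-{k}.{j}.{i}.{n}", l1))
    return links


# Hypothetical sizes chosen only to keep the example small.
links = build_hierarchy(n_l3=1, l2_per_l3=2, l1_per_l2=2, nodes_per_l1=3)
print(len(links), "links")
```

Even in this reduced form, the number of links grows multiplicatively with the size of each level, which indicates why the cabling of a full cluster becomes difficult to manage by hand.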
There is thus a large set of physical links between the different components of a cluster to enable data exchanges between the nodes. Such physical links are, for example, copper wire conductors or optical fibers. Depending, in particular, on the topologies employed and the required performance, different types of links may be used within the same cluster.
Furthermore, other types of links are necessary for the implementation of a cluster, for example electrical power supplies.
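As a minimal sketch of how links of different types within the same cluster might be recorded, each cable may be represented by its type and its two end points. The names `LinkKind`, `Cable` and `count_by_kind`, as well as the end point identifiers such as `pdu-0`, are hypothetical and serve only as an illustration.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class LinkKind(Enum):
    # Link types mentioned in the description.
    COPPER = "copper wire"
    OPTICAL_FIBER = "optical fiber"
    POWER = "electrical power supply"


@dataclass(frozen=True)
class Cable:
    kind: LinkKind
    end_a: str  # identifier of the first connected component
    end_b: str  # identifier of the second connected component


def count_by_kind(cables):
    """Count how many cables of each type are present."""
    return Counter(cable.kind for cable in cables)


# Hypothetical cabling of a very small installation.
cabling = [
    Cable(LinkKind.COPPER, "node-0", "sw1-0"),
    Cable(LinkKind.COPPER, "node-1", "sw1-0"),
    Cable(LinkKind.OPTICAL_FIBER, "sw1-0", "sw2-0"),
    Cable(LinkKind.POWER, "node-0", "pdu-0"),
]

for kind, count in count_by_kind(cabling).items():
    print(kind.value, count)
```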
Collectively, these links are generally referred to as the cabling of the cluster. They are put in place by technicians on installation of the cluster. This operation is also referred to as the cabling of the cluster.
It is noted here that the nodes of a cluster are often grouped together in racks, which may themselves be grouped together into islands. By way of illustration, a cluster comprising 30 racks, each comprising 48 nodes, requires several tens of thousands of cables, the total length of which may reach several tens of kilometers.
Generally, the cabling is carried out on the basis of the technicians' know-how. However, such a method has numerous drawbacks. In particular, it requires qualified technicians who must be available at the time the cluster is installed. Furthermore, those technicians must be trained and must be able to pass on their knowledge. Moreover, there is usually no overall cabling diagram, which poses problems when the performance of the cluster is not as expected.
The present application provides a solution to at least one of the problems set forth above.