High performance computing (“HPC”) or “supercomputer” systems are used to perform computations that require large quantities of computing resources. HPC systems may be used, for example, in weather forecasting and aerodynamic modeling, cryptography and code breaking, simulation of nuclear weapons testing or molecular dynamics, and ‘big data’ analytics. These applications may require large amounts of memory or data storage, and large numbers of (or extremely fast) memory accesses or computational operations. Often, these large amounts of memory or data storage are provided by networking many computers together. Some clustered HPC systems provide federated memory using non-uniform memory access (“NUMA”), which allows each node to access the memory of some or all of the other nodes.
There are two main paradigms used to design HPC systems: scale-out and scale-up, which roughly correspond to the ideas of ‘bigger’ and ‘better’. Scale-out systems are ‘bigger’, in the sense that they network many commodity computing devices (such as retail server computers) in a cluster. By contrast, scale-up systems are ‘better’, in the sense that they embody better, often cutting-edge technology: faster processors, faster memory, larger memory capacity, and so on.