Computer systems are formed by hardware and software architectures. Hardware architectures deal with how different resources, such as processing power, memory, networking interfaces and the like, are interconnected with each other, e.g. in terms of physical formats and number of wires. Software architectures deal with how different programs, such as operating systems, applications, applets, virtual machines and more, are executed on the hardware architecture.
Traditional hardware architectures, used for e.g. a data center or a cloud computing system, are typically built up of a plurality of racks, such as cabinets, that are networked together. Each rack comprises one or more fully functional computers, e.g. embodied as one or more server blades. Hence, each server blade is self-contained with resources, such as processors, memory, storage, networking interface and Input/Output (I/O) peripherals. An issue with a server blade is its static nature with regard to composition of resources. This implies that once the server blade has been assembled, processing capacity, memory capacity, network interface capacity etc. cannot be upgraded without physical intervention on the server blade; e.g. memory capacity can only be upgraded by manually inserting more memory into the server blade.
In order to solve this issue, and other issues, disaggregated hardware architectures have emerged. A disaggregated hardware architecture, such as Intel Rack Scale architecture and HyperScale Datacenter Systems, separates the resources that make up a hardware machine, such as a server, and that in the traditional hardware architecture would have been confined within one blade. The separated resources are organized into pools. A host machine is then allocated by picking resources from the pools.
An exemplifying known disaggregated hardware system 1 is shown in FIG. 1. The known disaggregated hardware system 1 comprises an interconnect 2, such as superfast optical fiber connectivity. The interconnect 2 interconnects a Central Processing Unit (CPU) pool 3, a memory pool 4 and a storage pool 5. The memory pool 4 may refer to short-term memories, such as cache memory or the like, whereas the storage pool 5 may refer to long-term storage, such as hard drives, magnetic tape, etc. Here, long-term and short-term shall be considered in relation to each other. Typically, each pool comprises one blade. With this setup, e.g. the CPU pool 3 and the storage pool 5 remain available during replacement of the memory pool 4, while it may be assumed that other memory pools (not shown) support, at least during the replacement, any need for memory that the CPU pool 3 and the storage pool 5 may have. The CPU pool 3 comprises CPUs, the memory pool 4 comprises memory units, and the storage pool 5 comprises disk units, all shown as rectangles in their respective pool. A Host Machine Manager 6 handles allocation of host machines. In this example, three host machines 10, 20 and 30 are illustrated in the Figure.
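As an illustration only, allocation of a host machine by picking resources from the pools can be sketched as follows. The names `Pool`, `take` and `allocate_host`, as well as the unit identifiers, are hypothetical and are not taken from any known Host Machine Manager; the sketch assumes that units are simply picked in the order in which they are listed.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """A pool of interchangeable units (hypothetical model of pools 3, 4, 5)."""
    name: str
    free: list  # identifiers of units not yet allocated to a host machine

    def take(self, count):
        """Remove and return `count` free units from the pool."""
        if count > len(self.free):
            raise ValueError(f"{self.name} pool has only {len(self.free)} free units")
        taken, self.free = self.free[:count], self.free[count:]
        return taken


def allocate_host(cpu_pool, memory_pool, storage_pool, cpus, mems, disks):
    """Compose a host machine from the three pools (assumed first-fit picking)."""
    # Check every pool first, so that a failed request leaves all pools untouched.
    for pool, count in ((cpu_pool, cpus), (memory_pool, mems), (storage_pool, disks)):
        if count > len(pool.free):
            raise ValueError(f"not enough free units in {pool.name} pool")
    return {
        "cpu": cpu_pool.take(cpus),
        "memory": memory_pool.take(mems),
        "storage": storage_pool.take(disks),
    }


cpu_pool = Pool("CPU", ["cpu0", "cpu1", "cpu2", "cpu3"])
memory_pool = Pool("memory", ["mem0", "mem1", "mem2", "mem3"])
storage_pool = Pool("storage", ["disk0", "disk1", "disk2", "disk3"])

host = allocate_host(cpu_pool, memory_pool, storage_pool, cpus=2, mems=1, disks=1)
print(host)
```

Note that this naive picking takes no account of where in a pool a unit is located.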
Since the resources of the disaggregated hardware system are grouped into pools, access latency between resources of the pools is not uniform. For example, the access latency from a CPU in the CPU pool will differ between different memory units in a memory pool. Therefore, when the Host Machine Manager 6 has picked a first host machine 10 and a second host machine 30 as shown in FIG. 1, varying latencies of memory access for the CPUs of the first and second host machines 10, 30 can cause the two host machines 10, 30, which otherwise have e.g. identical numbers of CPUs and memory units, to have different performance.
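The effect can be illustrated with a minimal sketch. The latency figures and unit names below are invented purely for illustration and do not describe any real system; the function `mean_latency` is likewise a hypothetical helper.

```python
from statistics import mean

# Hypothetical access latencies in nanoseconds from each CPU to each memory unit.
# The values are invented for illustration only.
LATENCY_NS = {
    ("cpu0", "mem0"): 100, ("cpu0", "mem1"): 250,
    ("cpu1", "mem0"): 250, ("cpu1", "mem1"): 400,
}


def mean_latency(cpus, mems):
    """Average CPU-to-memory access latency over all pairs in a host machine."""
    return mean(LATENCY_NS[(c, m)] for c in cpus for m in mems)


# Two host machines with identical resource counts: one CPU and one memory unit each.
first_host = (["cpu0"], ["mem0"])
second_host = (["cpu1"], ["mem1"])

print(mean_latency(*first_host))   # 100
print(mean_latency(*second_host))  # 400
```

Although both host machines are composed of the same number of units, the second one sees four times the memory access latency under these assumed figures.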
A problem is hence how to ensure predictable and/or desired performance of host machines in a disaggregated hardware system of the above-mentioned kind.
Now returning to the traditional hardware architecture, in case of a Non-Uniform Memory Access (NUMA) architecture, a known server blade has a CPU which is connected to local memory units and remote memory units. Since the latency for the CPU to access the local memory units is less than that of accessing the remote memory units, memory is allocated to local memory units as far as possible. In this manner, performance is kept as high as possible.
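The local-first principle can be sketched as below. This is a simplified illustration, not the allocation policy of any particular operating system; the function name `allocate_local_first` and the megabyte figures are assumptions made for the example.

```python
def allocate_local_first(request_mb, local_free_mb, remote_free_mb):
    """Split a memory request into (local_mb, remote_mb), filling local memory first."""
    if request_mb > local_free_mb + remote_free_mb:
        raise MemoryError("request exceeds the total free memory")
    # Use as much low-latency local memory as possible...
    local_mb = min(request_mb, local_free_mb)
    # ...and fall back to higher-latency remote memory only for the remainder.
    return local_mb, request_mb - local_mb


print(allocate_local_first(512, local_free_mb=1024, remote_free_mb=1024))   # (512, 0)
print(allocate_local_first(1536, local_free_mb=1024, remote_free_mb=1024))  # (1024, 512)
```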