Modern computer systems are becoming increasingly large and complex. One example of a large computer system is the multi-processor, multi-node system based on the symmetric multiprocessing (SMP) architecture. Prior Art FIG. 1 illustrates an example of an SMP based system. As shown in Prior Art FIG. 1, the typical SMP system 100 may include multiple CPUs (e.g. CPU 0, CPU 1, CPU 2, CPU 3) sharing the same bus 102 for access to a memory 104. In the present example, the CPUs also share an L3 cache 106 and an Input/Output (I/O) module 108. SMP based systems work well for a relatively small number of CPUs. However, contention for the shared bus 102 becomes a problem when the system includes a large number (e.g. dozens) of CPUs.
An alternative architecture designed to overcome the limitations of SMP based systems is the Non-Uniform Memory Access (NUMA) architecture.
Prior Art FIG. 2 shows an example of a NUMA based system architecture. As shown in FIG. 2, each node in the system 200 is simply a 4-processor SMP system (e.g. CPUs 202-208 and CPUs 210-216). Each CPU in a node contains an L1 and an L2 cache (not shown here), and shares an L3 cache 218 or 220 and a memory 222 or 224. Additionally, the CPUs within each node 226, 228 may share an I/O device 230 or 232 and a remote cache 234 or 236.
A node may be defined as a region of memory in which every byte has the same distance from each CPU. A more common definition of a node is a block of memory, CPUs, devices, etc., physically located on the same bus.
By definition, in a NUMA based system, some regions of the memory require longer access time than others. This may be because, with respect to the currently running process, some of the data or devices used by the process reside on other nodes. Those parts of the system residing on other nodes or buses are referred to as remote. Correspondingly, parts of the system residing on the same bus are referred to as local. The distance between system components may be expressed in terms of various metrics, including hops, latency and bandwidth. The parameter “distance” may also be referred to as system locality information in this document.
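The local/remote distinction and the notion of distance can be illustrated with a minimal sketch. The two-node topology, the CPU numbering, the distance values, and the function names below are all hypothetical and chosen only for illustration; the distance values are unitless and could stand for any of the metrics mentioned above.

```python
# Hypothetical two-node topology; every name and value here is an
# illustrative assumption, not taken from any specification.
CPU_TO_NODE = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 1}

# DISTANCE[i][j] is the relative distance from node i to node j.
# Smaller means closer; the diagonal holds the local distance.
DISTANCE = [
    [10, 20],
    [20, 10],
]

def is_local(cpu, memory_node):
    """True if the memory resides on the same node as the CPU."""
    return CPU_TO_NODE[cpu] == memory_node

def distance(cpu, memory_node):
    """Relative distance from the CPU's node to the memory's node."""
    return DISTANCE[CPU_TO_NODE[cpu]][memory_node]

def nearest_node(cpu, candidate_nodes):
    """Pick the candidate node with the smallest distance from the CPU."""
    return min(candidate_nodes, key=lambda node: distance(cpu, node))
```

In this sketch, a process running on CPU 5 would see node 1 as local and node 0 as remote, and `nearest_node(5, [0, 1])` would select node 1.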
The Advanced Configuration and Power Interface (ACPI) specification was developed to establish industry common interfaces enabling robust operating system (OS)-directed motherboard device configuration and power management of both devices and entire systems. ACPI is the key element in Operating System-directed configuration and Power Management (OSPM).
ACPI specification version 1.0b assumes that the system is based on an SMP architecture and therefore does not provide the OS with locality information about the system it runs on. Thus, the OS would have to assume an SMP architecture, even on a NUMA based system.
With the introduction of ACPI version 2.0b, some additional proximity indications were provided through the _PXM control method. However, the _PXM method only indicates to the OSPM that certain device modules are “close”. It does not describe the relative distances (e.g. memory latencies) among the device modules.
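The limitation can be sketched as follows, under the assumption that a proximity indication amounts to a grouping tag per device. The device names, tag values, and functions are hypothetical; this is not ACPI code and does not model the actual _PXM interface.

```python
# Hypothetical proximity tags: devices with the same tag are "close".
# Names and values are illustrative assumptions only.
PROXIMITY_DOMAIN = {"CPU0": 0, "MEM0": 0, "CPU4": 1, "MEM1": 1, "MEM2": 2}

def is_close(a, b):
    """True if both devices carry the same proximity tag."""
    return PROXIMITY_DOMAIN[a] == PROXIMITY_DOMAIN[b]

def rank_by_closeness(device, candidates):
    """Order candidates using only the close/not-close information.

    Only a two-level ranking is possible: devices in the same domain
    come first, and all others are tied (kept in their input order).
    """
    return sorted(candidates, key=lambda c: 0 if is_close(device, c) else 1)
```

With tags alone, `rank_by_closeness("CPU0", ["MEM1", "MEM2", "MEM0"])` can place MEM0 first, but it has no basis for preferring MEM1 over MEM2 even if one of them is in fact much farther away.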
Microsoft™ designed a static data structure called SRAT (System Resources Affinity Table). It provides a snapshot of the proximity, i.e., whether a device is close, at the system firmware's handoff to the OS. However, as with the _PXM control method, no relative distance information is conveyed.
Thus, with new system architectures being built that stretch the limits of current interfaces (e.g. Plug and Play interfaces), ACPI based mechanisms are needed that can handle newer system architectures such as NUMA in a more robust and potentially more efficient manner, allowing the OS to optimize system performance.