Given the high processing speeds that can be achieved by today's computer processing units, memory access speed has become a limiting factor in many computer systems. In order to reduce memory latency and avoid “hot spots” in which certain memory resources are overly taxed, many computer systems employ a shared memory system in which the memory is divided into multiple blocks, and where multiple processing units are allowed to access the same blocks of the memory at different or even substantially the same times. In some such computer systems, each block of memory is controlled by a respective memory controller that is capable of communicating with multiple processing units of the computer system.
Some computer systems employ sockets that each have multiple processing units and that typically also have their own memory controllers, which manage blocks of memory accessible by one or more of the processing units of the respective sockets. To reduce memory latency in some such systems, processing units located on a given socket may be able to access memory blocks controlled by memory controllers located on other sockets. Such operation, in which one socket directly accesses the memory resources of another socket, is commonly referred to as “memory interleaving”, and systems employing such interleaving capability are commonly referred to as non-uniform memory access (NUMA) systems.
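By way of illustration only, the spreading of memory accesses across multiple memory controllers can be sketched as follows. The sketch assumes a simple round-robin mapping in which consecutive fixed-size units of the address space are assigned to successive controllers. The 64-byte unit size, the controller count, and the function names `controller_for` and `load_per_controller` are hypothetical choices made for this example and do not describe any particular system.

```python
from collections import Counter

LINE_SIZE = 64  # bytes per interleave unit (assumed value for illustration)


def controller_for(address, num_controllers, line_size=LINE_SIZE):
    """Return the index of the memory controller that owns `address`.

    Consecutive line-sized units are assigned round-robin to controllers.
    """
    if num_controllers <= 0:
        raise ValueError("num_controllers must be positive")
    return (address // line_size) % num_controllers


def load_per_controller(addresses, num_controllers):
    """Count how many of the given accesses land on each controller."""
    counts = Counter(controller_for(a, num_controllers) for a in addresses)
    return [counts.get(i, 0) for i in range(num_controllers)]


# A sequential scan of 1 KiB touches 16 units; with four controllers,
# each controller serves four of them.
sequential_scan = range(0, 1024, LINE_SIZE)
print(load_per_controller(sequential_scan, 4))  # [4, 4, 4, 4]
```

Under these assumptions, a sequential scan is shared evenly among the controllers rather than concentrated on one of them, which is the sense in which such a mapping may help avoid “hot spots”.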
The degree to which memory interleaving can be effectively implemented in conventional computer systems is limited, however. Memory interleaving as described above is typically restricted to small numbers of sockets, for example, four sockets or fewer. In systems having larger numbers of sockets that are capable of accessing each other's memory resources, the memory controllers of the sockets cannot be directly connected to the processing units of other sockets but rather typically need to be connected by way of processor agents. The implementation of such systems tends to be complicated and inefficient, both in terms of the operation of the processor agents and in terms of the extra burdens placed upon the operating system and applications running on such systems. For example, in such systems it is desirable that the operating system and applications be capable of adapting to changes in the memory architecture to avoid inefficient operation, which is often difficult to achieve.
Additionally, it is increasingly desired that computer systems be scalable and otherwise adjustable in terms of their sockets (e.g., in terms of processing power and memory). For example, it may be desirable in one circumstance that a computer system utilize only a small number of sockets, but desirable or necessary in another circumstance that the computer system be modified to utilize a larger number of sockets. As the number of sockets in the computer system is increased or decreased, a given manner of interleaving suited to either smaller or larger numbers of sockets may become more or less effective. For example, supposing that such a computer system employs the manner of interleaving described above, which involves direct connection among four or fewer sockets (without processor agents, an arrangement sometimes referred to as glueless), the computer system's memory access performance may vary significantly as the computer system is modified between utilizing four or fewer sockets and utilizing more than four sockets.
For at least these reasons, it would be advantageous if an improved system and method for achieving enhanced memory access capabilities in computer systems could be developed. More particularly, it would be advantageous if, in at least some embodiments, such a system and method enabled enhanced memory interleave capabilities in computer systems having large numbers of sockets with multiple processors and memory controllers, such that the processors of the various sockets could access different memory blocks controlled by memory controllers of other sockets in a manner that, in comparison with conventional systems, reduced memory latency and/or the occurrence of “hot spots”. Additionally, it would be advantageous if, in at least some embodiments, such a system and method were capable of achieving satisfactory levels of memory interleave capabilities even where the number and/or type of system resources being utilized by the system, such as processors and memory devices, varied during system operation.