1. Field of the Invention
The present invention relates generally to allocation of memory in a computer system with distributed memory, and more particularly to replication of memory pages in a distributed, non-uniform memory access (NUMA) computer system.
2. Related Art
A distributed memory computer system typically includes a plurality of physically distinct and separated processing nodes. Each node has one or more processors, input/output (I/O) devices, and main memory that can be accessed by any of the processors. The main memory is physically distributed among the processing nodes. In other words, each processing node includes a portion of the main memory. Thus, each processor has access to "local" main memory (i.e., the portion of main memory that resides in the same processing node as the processor) and "remote" main memory (i.e., the portion of main memory that resides in other processing nodes).
For each processor, the latency associated with accessing local main memory is significantly less than the latency associated with accessing remote main memory. Further, in many such systems, the latency associated with accessing remote memory increases as the topological distance between the node making a memory request and the node servicing the memory request increases. Accordingly, distributed memory computer systems as just described are said to represent non-uniform memory access (NUMA) computer systems.
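The relationship between topological distance and access latency described above can be illustrated with a minimal sketch. The sketch assumes a hypothetical hypercube interconnect, in which the hop count between two nodes equals the number of differing bits in their node identifiers, and a linear latency model. The constants and function names are illustrative assumptions only and are not taken from any particular machine.

```python
# Illustrative constants only; real latencies depend on the hardware.
LOCAL_LATENCY_NS = 100    # assumed latency of a local memory access
PER_HOP_PENALTY_NS = 150  # assumed added latency per interconnect hop


def hop_distance(node_a: int, node_b: int) -> int:
    """Hops between two nodes in an assumed hypercube topology.

    In a hypercube, the distance is the number of bit positions in
    which the two node identifiers differ.
    """
    return bin(node_a ^ node_b).count("1")


def access_latency_ns(requesting_node: int, home_node: int) -> int:
    """Estimated latency for requesting_node to access memory on home_node."""
    hops = hop_distance(requesting_node, home_node)
    return LOCAL_LATENCY_NS + PER_HOP_PENALTY_NS * hops


if __name__ == "__main__":
    # Latency as seen from node 0 in an 8-node system.
    for home in range(8):
        print(home, hop_distance(0, home), access_latency_ns(0, home))
```

Under these assumptions, a local access (zero hops) incurs only the base latency, and each additional hop adds a fixed penalty, so memory on a more distant node is always slower to reach than memory on a nearer one.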
In NUMA computer systems, it is desirable to store data in the portion of main memory that exists in the same processing node as the processor that most frequently accesses the data (or as close as possible to the processor that most frequently accesses the data). Accordingly, it may be desirable to replicate data that resides in remote memory locations to the local memory of the processing nodes requesting the data. Doing so reduces memory access latency and increases overall system performance.
In addition, nodes in NUMA systems are interconnected via node links having a finite communication bandwidth. As such, large performance penalties are imposed when these links become overly congested with data traffic, which reduces the available link bandwidth. This can occur, for example, when multiple processors attempt to access the same section of memory at the same time. This phenomenon is referred to as contention. A memory section that is simultaneously accessed by a large number of processors is referred to as a hotspot.
Memory replication can be used to alleviate the latency and link bandwidth problems described above. With replication, data is copied to portions of main memory that are closer to each of the processing nodes requesting the data. In this fashion, overall system performance is improved because subsequent data accesses will be local rather than remote.
However, data replication, in and of itself, can cause system problems and should not be performed indiscriminately. For example, replicating large amounts of data can deplete the limited memory resources of the overall NUMA system. Accordingly, the benefits gained by replicating data need to be weighed against the possible detrimental effects of such replication. That is, the degree of replication needs to be monitored based on the current system state.
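The trade-off just described can be illustrated with a minimal sketch of a threshold-based decision. The sketch is a generic heuristic offered only as an example of weighing the benefit of replication against memory pressure; it is not the method of any particular system. The function name, the parameters, and the default thresholds are all hypothetical.

```python
def should_replicate(remote_accesses: int,
                     free_pages: int,
                     total_pages: int,
                     access_threshold: int = 64,
                     min_free_fraction: float = 0.10) -> bool:
    """Decide whether to replicate a page onto the requesting node.

    remote_accesses:   remote accesses to the page observed from the
                       requesting node (hypothetical counter).
    free_pages:        free pages currently available on that node.
    total_pages:       total pages of main memory on that node.
    access_threshold:  assumed minimum access count that justifies a copy.
    min_free_fraction: assumed fraction of memory that must remain free.
    """
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")

    # Detrimental effect: a copy consumes a page of limited local memory,
    # so refuse to replicate when the node is already short on memory.
    if free_pages / total_pages < min_free_fraction:
        return False

    # Benefit: only frequently accessed remote pages repay the cost of a copy.
    return remote_accesses >= access_threshold


if __name__ == "__main__":
    print(should_replicate(100, free_pages=500, total_pages=1000))
    print(should_replicate(100, free_pages=50, total_pages=1000))
    print(should_replicate(10, free_pages=500, total_pages=1000))
```

In this sketch, a frequently accessed page is replicated only while the requesting node has memory to spare. The same page is left remote once free memory on that node falls below the assumed minimum fraction.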
Thus, what is needed is an intelligent system and method for determining which pages to replicate, when to replicate them, and the degree of replication, given the current system resource state.