1. Field of the Invention
The present invention relates generally to allocation of memory in a computer system with distributed memory, and more particularly to a method for representing the locality of memory for a multi-processor non-uniform memory access (NUMA) computer system.
2. Related Art
A distributed memory computer system typically includes a plurality of physically distinct and separated processing nodes. Each node has one or more processors, input/output (I/O) devices, and main memory that can be accessed by any of the processors. The main memory is physically distributed among the processing nodes. In other words, each processing node includes a portion of the main memory. Thus, each processor has access to "local" main memory (i.e., the portion of main memory that resides in the same processing node as the processor) and "remote" main memory (i.e., the portion of main memory that resides in other processing nodes).
For each processor, the latency associated with accessing local main memory is significantly less than the latency associated with accessing remote main memory. Further, in many such systems, the latency associated with accessing remote memory increases as the topological distance between the node making a memory request (requesting node) and the node servicing the memory request (servicing node) increases. Accordingly, distributed memory computer systems as just described are referred to as non-uniform memory access (NUMA) computer systems.
In NUMA computer systems, it is desirable to store data in the portion of main memory that exists in the same processing node as the processor that most frequently accesses the data (or as close as possible to the processor that most frequently accesses the data). Accordingly, it is desirable to allocate memory as close as possible to the processing node that will be accessing the memory. By doing this, memory access latency is reduced and overall system performance is increased.
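By way of illustration only, the placement preference described above can be sketched as a search for the nearest node that has enough free memory. The hop-distance matrix, the free-page counts, the four-node ring topology, and the function name pick_allocation_node are all hypothetical and are not drawn from any particular system.

```python
from typing import Optional, Sequence


def pick_allocation_node(
    requesting_node: int,
    hop_distance: Sequence[Sequence[int]],
    free_pages: Sequence[int],
    pages_needed: int = 1,
) -> Optional[int]:
    """Return the node closest to requesting_node that can hold the request.

    hop_distance[a][b] is an assumed topological distance between nodes a
    and b (0 when a == b). Ties are broken in favor of the lower node number.
    """
    candidates = [
        node for node in range(len(free_pages)) if free_pages[node] >= pages_needed
    ]
    if not candidates:
        return None  # no node has enough free memory
    return min(
        candidates,
        key=lambda node: (hop_distance[requesting_node][node], node),
    )


# Hypothetical four-node ring: 0 - 1 - 2 - 3 - 0.
HOPS = [
    [0, 1, 2, 1],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [1, 2, 1, 0],
]
```

In this sketch, a request from node 0 is served locally whenever node 0 has free pages, and otherwise falls back to a neighboring node before a node two hops away.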
Therefore, control over memory management is essential in multi-processor systems employing NUMA architectures. In conventional systems, the operating system typically controls memory management functions on behalf of application programs. This is typically accomplished through the use of predetermined memory management procedures designed to produce a certain level of locality. For example, such procedures include program code to accomplish page migration and page replication. In this fashion, data is dynamically moved and/or replicated to different nodes depending on the current system state. However, such predetermined operating system procedures may not be optimal for all types of application programs.
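The kind of predetermined procedure referred to above can be illustrated, under stated assumptions, by a counter-based page migration policy. The Page class, the maybe_migrate function, and the threshold of 8 accesses are hypothetical choices made for this sketch; they do not describe the procedure of any particular operating system.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Page:
    home_node: int  # node whose local memory currently holds the page
    accesses: Counter = field(default_factory=Counter)  # per-node access counts

    def record_access(self, node: int) -> None:
        self.accesses[node] += 1


def maybe_migrate(page: Page, threshold: int = 8) -> bool:
    """Move the page to the node that accesses it most, provided that node
    leads the current home node by at least `threshold` accesses.

    Returns True if the page was migrated. The threshold is an assumed
    tuning parameter, not a value taken from any real system.
    """
    if not page.accesses:
        return False
    # Lowest node number wins ties so the result is deterministic.
    busiest = min(page.accesses, key=lambda n: (-page.accesses[n], n))
    lead = page.accesses[busiest] - page.accesses[page.home_node]
    if busiest != page.home_node and lead >= threshold:
        page.home_node = busiest
        page.accesses.clear()  # start a fresh observation window
        return True
    return False
```

A fixed threshold of this kind suits some access patterns and not others, which is the limitation of predetermined procedures noted above.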
Thus, what is needed is a system and method for producing a high degree of locality in a NUMA system that works well with a variety of different types of application programs.