Multiprocessor computers by definition contain multiple processors that can execute multiple parts of a computer program and/or multiple distinct programs simultaneously, in a manner known as parallel computing. In general, multiprocessor computers execute multithreaded programs and/or multiple single-threaded programs faster than conventional single-processor computers, such as personal computers (PCs), that must execute programs sequentially. The actual performance advantage is a function of a number of factors, including the degree to which parts of a multithreaded program and/or multiple distinct programs can be executed in parallel and the architecture of the particular multiprocessor computer at hand.
Multiprocessor computers may be classified by how they share information among the processors. Shared-memory multiprocessor computers offer a common physical memory address space that all processors can access. Multiple processes and/or multiple threads within the same process can communicate through shared variables in memory, which allow them to read from or write to the same memory location in the computer. Message-passing multiprocessor computers, in contrast, have a separate memory space for each processor, so processes in such a system must communicate by sending explicit messages to each other.
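By way of illustration only, the two communication styles described above can be sketched in Python. This is a minimal sketch, not a model of any particular machine: the function names `shared_memory_sum` and `message_passing_sum` are hypothetical, and because Python threads within one process always share an address space, the message-passing variant only imitates separate memory spaces by having each worker report its result solely through a queue.

```python
import queue
import threading


def shared_memory_sum(values, num_threads=4):
    """Threads communicate through one shared variable guarded by a lock."""
    total = {"value": 0}              # shared location visible to every thread
    lock = threading.Lock()

    def worker(chunk):
        partial = sum(chunk)
        with lock:                    # serialize updates to the shared location
            total["value"] += partial

    threads = [
        threading.Thread(target=worker, args=(values[i::num_threads],))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total["value"]


def message_passing_sum(values, num_workers=4):
    """Workers keep their state private and report results as explicit messages."""
    mailbox = queue.Queue()

    def worker(chunk):
        mailbox.put(sum(chunk))       # the only communication is this message

    threads = [
        threading.Thread(target=worker, args=(values[i::num_workers],))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Receive exactly one message per worker and combine them.
    return sum(mailbox.get() for _ in range(num_workers))
```

Both functions compute the same result; they differ only in whether the workers cooperate by writing to a common location or by sending messages.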
Shared-memory multiprocessor computers may further be classified by how the memory is physically organized. In distributed shared-memory computers, the memory is divided into modules physically placed near each processor. Although all of the memory modules are globally accessible, a processor can access memory placed nearby faster than memory placed remotely. Because the memory access time differs based on memory location, distributed shared-memory systems are often called non-uniform memory access (NUMA) machines. By contrast, in centralized shared-memory computers, the memory is physically in one location. Centralized shared-memory computers are called uniform memory access (UMA) machines because the memory is equidistant in time from each of the processors. Both forms of memory organization typically use high-speed cache memory in conjunction with main memory to reduce execution time.
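The effect of memory location on access time in a NUMA machine can be illustrated with a simple weighted average. The following is a sketch under stated assumptions: the function name `mean_access_time_ns` is hypothetical, the latencies used in the usage note are invented for illustration, and cache effects are ignored.

```python
def mean_access_time_ns(local_accesses, remote_accesses, local_ns, remote_ns):
    """Average memory access time when some accesses go to a remote node.

    All four parameters are illustrative assumptions, not measured values.
    """
    total = local_accesses + remote_accesses
    if total == 0:
        raise ValueError("at least one access is required")
    return (local_accesses * local_ns + remote_accesses * remote_ns) / total
```

For example, with an assumed local latency of 100 ns and remote latency of 400 ns, a process that makes 90 local and 10 remote accesses sees an average of 130 ns per access, while an even split raises the average to 250 ns. In a UMA machine the two latencies would be equal, so the average would not depend on where the memory is placed.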
Multiprocessor computers with distributed shared memory are often organized into multiple nodes with one or more processors per node. The nodes interface with each other through a memory-interconnect network using a protocol such as the one described in the Scalable Coherent Interface (SCI) standard (IEEE 1596). UMA machines, by contrast, typically use a bus to interconnect all of the processors.
Further information on multiprocessor computer systems in general and NUMA machines in particular can be found in a number of works including Computer Architecture: A Quantitative Approach (2nd Ed. 1996), by D. Patterson and J. Hennessy, which is hereby incorporated by reference.
While NUMA machines offer significant advantages over UMA machines in terms of bandwidth, they face the prospect of increased delay in some instances if their operating systems do not take into account the physical division of memory. For example, in responding to a system call by a process (a part of a computer program in execution) for allocating physical memory, conventional operating systems do not consider the node location of the process, the amount of free memory on each node, or a possible preference by the process for memory on a specific node. The operating system simply allocates the requested memory, such as memory for a shared memory object, from its global free list of memory. This can result in the process making multiple accesses to remote nodes if the memory is not allocated on the process's node. It can also result in continual process faults, such as page faults, and in movement of processes into and out of memory ("swapping") if the memory is allocated on a node that has little free memory.
An objective of the invention, therefore, is to provide a method for allocating memory in a multinode multiprocessor system which responds to the communicated physical placement needs of the application program requesting the memory. The program is created by a user such as a computer programmer, and it is believed that the user in many situations knows best how the program should run in the system, and where the physical memory used by the program should be placed.
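The contrast drawn above between allocation from a global free list and allocation that considers physical placement can be sketched as follows. This is a hypothetical illustration only and is not the method of the invention: the `Node` type, the functions `allocate_global` and `allocate_node_aware`, and the fallback order (preferred node, then the process's home node, then the node with the most free memory) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    node_id: int
    free_pages: int


def allocate_global(nodes: List[Node], pages: int) -> int:
    """Placement-blind allocation: take the first node that can satisfy the request."""
    for node in nodes:
        if node.free_pages >= pages:
            node.free_pages -= pages
            return node.node_id
    raise MemoryError("no node has enough free pages")


def allocate_node_aware(nodes: List[Node], pages: int, home_node: int,
                        preferred_node: Optional[int] = None) -> int:
    """Hypothetical policy: preferred node, then home node, then most free memory."""
    by_id = {node.node_id: node for node in nodes}
    candidates = []
    if preferred_node is not None:
        candidates.append(preferred_node)   # preference stated by the process
    candidates.append(home_node)            # node on which the process runs
    for node_id in candidates:
        node = by_id[node_id]
        if node.free_pages >= pages:
            node.free_pages -= pages
            return node_id
    # Neither candidate fits: fall back to the node with the most free memory.
    fallback = max(nodes, key=lambda node: node.free_pages)
    if fallback.free_pages < pages:
        raise MemoryError("no node has enough free pages")
    fallback.free_pages -= pages
    return fallback.node_id
```

In this sketch, a process running on node 1 that calls `allocate_global` may receive memory on node 0 simply because node 0 appears first in the list, whereas `allocate_node_aware` returns node 1 whenever node 1 has enough free pages and no other preference is stated.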