1. Field of the Invention
The present invention relates to multiprocessor systems.
2. Related Art
Centralized shared-memory multiprocessor systems, such as the CHALLENGE™ and POWER CHALLENGE™ systems manufactured by Silicon Graphics, Inc., use a common bus to link multiple processors and a single shared memory. Contention for bus bandwidth and memory access can limit the number of processors (also called the CPU count) that can effectively share a common bus. The size of a single shared memory also limits the ability to scale a centralized shared-memory multiprocessor system to higher CPU counts.
A distributed shared memory (DSM) architecture, such as a scalable shared-memory system or a non-uniform memory access (NUMA) system, typically includes a plurality of physically distinct and separated processing nodes, each having one or more processors, input/output devices, and main memory that can be accessed by any of the processors. The main memory is physically distributed among the processing nodes. In other words, each processing node includes a portion of the main memory. Thus, each processor has access to "local" main memory (i.e., the portion of main memory that resides in the same processing node as the processor) and "remote" main memory (i.e., the portion of main memory that resides in other processing nodes). For each processor in a distributed shared memory system, accessing local main memory is significantly faster than accessing remote main memory: local accesses have lower latency and/or higher available bandwidth than remote accesses. See D. Lenoski and W. Weber, Scalable Shared-Memory Multi-Processing, Morgan-Kaufmann Publ., U.S.A. (1995), pp. 1-40, 87-95, 143-203, and 311-316, and Hennessy and Patterson, Computer Architecture: A Quantitative Approach, Second Edition, Morgan-Kaufmann Publ., U.S.A. (1996), at Chapter 8, "Multiprocessors," pp. 634-760.
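The local/remote distinction described above can be illustrated with a toy model. The following is a minimal sketch only: the latency figures, the function names, and the four-node layout are hypothetical values chosen for illustration and do not describe any particular machine.

```python
# Toy model of non-uniform memory access. All numbers are hypothetical,
# chosen only to illustrate the local/remote distinction.
LOCAL_LATENCY_NS = 300    # assumed latency of a local main-memory access
REMOTE_LATENCY_NS = 900   # assumed latency of a remote main-memory access


def access_latency_ns(cpu_node, page_home_node):
    """Modeled latency of one access from a processor on cpu_node to a page
    whose memory resides on page_home_node."""
    if cpu_node == page_home_node:
        return LOCAL_LATENCY_NS
    return REMOTE_LATENCY_NS


def average_latency_ns(cpu_node, page_home_nodes):
    """Average modeled latency over one access to each listed page."""
    total = sum(access_latency_ns(cpu_node, home) for home in page_home_nodes)
    return total / len(page_home_nodes)


# A processor on node 0 touches eight pages under two placements.
all_local = [0] * 8                  # every page resides on node 0
spread = [0, 1, 2, 3, 0, 1, 2, 3]    # pages spread round-robin over 4 nodes

print(average_latency_ns(0, all_local))  # 300.0
print(average_latency_ns(0, spread))     # 750.0
```

Under these assumed figures, the same eight accesses cost 2.5 times as much on average when six of the eight pages reside on other nodes, even though the program itself is unchanged.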
On a centralized shared memory system, an application's performance is typically not affected by the physical location of the memory pages that the application uses. On a distributed shared memory system having non-uniform memory access times, e.g., a NUMA machine, performance does depend on where those pages reside relative to the processors that access them. Only the user really understands his or her application's needs and how data should optimally be distributed to minimize communication costs and maximize performance.