Multiprocessor computers by definition contain multiple processors that can execute multiple parts of a computer program or multiple programs simultaneously. In general, this parallel computing executes computer programs faster than conventional single-processor computers, such as personal computers (PCs), that execute the parts of a program sequentially. The actual performance advantage is a function of a number of factors, including the degree to which parts of a program can be executed in parallel and the architecture of the particular multiprocessor computer at hand.
Multiprocessor computers may be classified by how they share information among the processors. Shared-memory multiprocessor computers offer a common memory address space that all processors can access. Processes within a program communicate through shared variables in memory, which allow them to read from and write to the same memory locations in the computer. Message-passing multiprocessor computers, on the other hand, have a separate memory space for each processor, and processes communicate by passing messages to one another.
Shared-memory multiprocessor computers may also be classified by how the memory is physically organized. In distributed shared-memory computers, the memory is divided into modules physically placed near each processor. Although all of the memory modules are globally accessible, a processor can access memory placed nearby faster than memory placed remotely. Because the memory access time differs based on memory location, distributed shared-memory systems are often called non-uniform memory access (NUMA) machines. By contrast, in centralized shared-memory computers, the memory is physically in just one location. Such centralized shared-memory computers are called uniform memory access (UMA) machines because the memory is equidistant in time and space from each of the processors. Both forms of memory organization typically use high-speed cache memory in conjunction with main memory to reduce execution time.
Multiprocessor computers with distributed shared memory are often organized into nodes with one or more processors per node. The nodes interface with each other through a network by using a protocol, such as the protocol described in the Scalable Coherent Interface (SCI) (IEEE Std 1596). An operating system runs on the computer system. The operating system is a program that performs a number of tasks central to the computer's operation, including managing memory, files, and peripheral devices, launching application programs, and allocating system resources.
The operating system typically implements a process model. A user process (i.e., a process from an application program) provides an execution environment for a program and allows the program to make requests (also called system calls) to a kernel (which is the heart of the operating system) through an application programming interface (API). The system calls allow the user process to control the multiprocessor computer so that user "jobs" are carried out. For example, a user process might desire access to system resources, such as an I/O device (e.g., disk drive, tape drive, CD-ROM, etc.), a shared memory segment, a file, a processor, another process, etc. A user process has several components, including the program itself (i.e., executable instructions, also called "text"), private data (e.g., local variables), a stack, and page tables.
A problem arises when running an operating system in a multinode environment. That is, the user has knowledge of what system resources the process needs, but does not know which nodes those resources are located on. The operating system, on the other hand, knows where the resources are located, but does not know what resources a process needs until a system call is made to access the resource. Consequently, the operating system may create and move processes somewhat randomly and independent of future process needs. For example, when a process is first created by the operating system, the process may be stored on a different node from a resource that it frequently accesses. The memory associated with the process also may be located on a different node from the processor that is executing the process. Additionally, components of the process may be split and distributed among different nodes in the computer system. For example, the stack may be located on a different node from the program, and the page tables and private data may be located on yet another node. Such random placement of process components leads to inefficient program execution, requiring a large number of internode memory accesses.
An objective of the invention, therefore, is to provide a distributed shared-memory multiprocessor computer system that maximizes efficiency by storing a user process near a system resource the process frequently accesses. A further objective of the invention is to allow a user to control or advise the operating system as to where to store processes or what resources a process frequently accesses. Yet a further objective of the invention is to provide such a system in which components of the process (e.g., the stack and page tables) are stored on one node.