Most general-purpose computer systems are built around a general-purpose processor, which is typically an integrated circuit operable to perform a wide variety of operations useful for executing a wide variety of software. The processor is able to perform a fixed set of instructions, which collectively are known as the instruction set for the processor. A typical instruction set includes a variety of types of instructions, including arithmetic, logic, and data instructions.
In more sophisticated computer systems, multiple processors are used, and one or more processors run software that is operable to assign tasks to other processors or to split up a task so that it can be worked on by multiple processors at the same time. In such systems, the data being worked on is typically stored in memory that is either centralized or split up among the different processors working on a task.
Instructions from the instruction set of the computer's processor or processors that are chosen to perform a certain task form a software program that can be executed on the computer system. Typically, the software program is first written in a high-level language such as “C” that is easier for a programmer to understand than the processor's instruction set, and a program called a compiler converts the high-level language program code to processor-specific instructions.
In multiprocessor systems, the programmer or the compiler will usually look for tasks that can be performed in parallel, such as calculations where the data used to perform a first calculation do not depend on the results of certain other calculations, so that the first calculation and the other calculations can be performed at the same time. The calculations performed at the same time are said to be performed in parallel, and can result in significantly faster execution of the program. Although some programs such as web browsers and word processors don't consume a high percentage of even a single processor's resources and don't have many operations that can be performed in parallel, other programs such as scientific simulations can often run hundreds or thousands of times faster in computers with thousands of parallel processing nodes available.
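The distinction between independent and dependent calculations can be illustrated with a minimal Python sketch. The function names (`square`, `parallel_squares`, `running_total`) are hypothetical and chosen only for illustration. A thread pool stands in for multiple processors here; in CPython, threads generally do not speed up CPU-bound work, so this shows only the dependency structure, not the performance of a real multiprocessor system.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x: int) -> int:
    # Depends only on its own input, so calls on different inputs
    # are independent and may run at the same time.
    return x * x


def parallel_squares(values, workers=4):
    # Hand the independent calculations to a pool of workers.
    # pool.map returns the results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(square, values))


def running_total(values):
    # Counter-example: each step needs the result of the previous step,
    # so this loop, as written, has to run sequentially.
    total = 0
    totals = []
    for v in values:
        total += v
        totals.append(total)
    return totals


if __name__ == "__main__":
    print(parallel_squares([1, 2, 3, 4]))
    print(running_total([1, 2, 3, 4]))
```

The squares can be split across workers because no call reads another call's result, whereas each running total reads the one before it.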
The processors share data by passing messages back and forth, or by sharing memory between processors. In one shared memory system, each memory address identifies a unique memory location within the computer system, while in other systems some or all memory addresses identify memory local to a processor, and so refer to different memory locations that hold different data in different processors.
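The two addressing schemes can be modeled with a short Python sketch. The classes `GlobalMemory` and `LocalMemory` and their methods are hypothetical names used only for illustration, with dictionaries standing in for physical memory. This is a toy model of how addresses map to locations, not a description of any particular hardware.

```python
class GlobalMemory:
    """Single address space: an address names the same location for every processor."""

    def __init__(self):
        self._cells = {}

    def write(self, processor_id, address, value):
        # The processor id is ignored: the address alone selects the location.
        self._cells[address] = value

    def read(self, processor_id, address):
        return self._cells.get(address)


class LocalMemory:
    """Per-processor address spaces: the same address names a different location in each processor."""

    def __init__(self):
        self._cells = {}

    def write(self, processor_id, address, value):
        # The location is selected by the (processor, address) pair.
        self._cells[(processor_id, address)] = value

    def read(self, processor_id, address):
        return self._cells.get((processor_id, address))


if __name__ == "__main__":
    shared = GlobalMemory()
    shared.write(0, 0x100, "from processor 0")
    print(shared.read(1, 0x100))  # processor 1 sees processor 0's data

    local = LocalMemory()
    local.write(0, 0x100, "from processor 0")
    print(local.read(1, 0x100))   # processor 1 has its own location at 0x100
```

In the first model a value written by one processor is visible to every other processor at the same address; in the second, data must be moved explicitly, for example by message passing.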
The word size of the processor, such as 32-bit or 64-bit words or operands, often also defines the amount of memory that can be directly addressed in the computer system. For example, a 32-bit word can address only 2^32 bytes, or four GigaBytes, of memory, while a 64-bit computer can directly address 2^64 bytes, or 16 ExaBytes, of memory. Computers sometimes use address spaces that are larger or smaller than the word size, such as a 16-bit 8086 processor that uses 20-bit addressing to provide access to one MegaByte of data, or a 64-bit AMD64 processor that supports only 48-bit addressing, recognizing that 256 TeraBytes of memory is likely sufficient and that limiting addressable memory to 48 bits rather than 64 can save complexity and time in memory operations such as address translation and memory page lookup.
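The address-space sizes mentioned above follow from simple arithmetic, which the following Python sketch reproduces. It assumes byte-addressable memory (one address per byte), and the helper names `addressable_bytes` and `human_readable` are hypothetical. The binary unit labels (MiB, GiB, TiB, EiB) correspond to the MegaByte, GigaByte, TeraByte, and ExaByte figures in the text, which use powers of 1024.

```python
UNITS = ["bytes", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def addressable_bytes(address_bits: int) -> int:
    # Assumes one address per byte, so n address bits reach 2**n bytes.
    return 2 ** address_bits


def human_readable(n: int) -> str:
    # Divide by 1024 until the value drops below 1024 or units run out.
    unit = 0
    while n >= 1024 and unit < len(UNITS) - 1:
        n //= 1024
        unit += 1
    return f"{n} {UNITS[unit]}"


if __name__ == "__main__":
    for bits in (20, 32, 48, 64):
        print(f"{bits}-bit addressing: {human_readable(addressable_bytes(bits))}")
```

Each additional address bit doubles the reachable memory, which is why going from 48 to 64 address bits multiplies the address space by 65,536.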
It is desirable to manage memory architecture in computer systems for these and other reasons.