A conventional multiprocessor computer system has several central processing units that are interconnected and connected to a common memory, such as random-access memory (RAM), through a storage controller. Such a computer system may have many additional components, such as additional memory and various I/O devices, for example serial and parallel ports for connection to modems or printers.
In a multiprocessor computer system, all of the central processing units are generally identical; that is, they all use a common set or subset of instructions and protocols to operate and generally have the same architecture. A central processing unit includes a processor core having a plurality of registers, an instruction unit that fetches, decodes, and issues program instructions, and an execution unit that carries out program instructions in order to operate the computer. The central processing unit may also have one or more caches, such as an instruction cache and a data cache, which are typically implemented using high-speed memory devices. Caches are commonly used to temporarily store values that might be repeatedly accessed by the execution unit or the instruction unit, in order to speed up processing by avoiding the longer step of loading the values from memory (not shown). These caches are referred to as “on-board” or level 1 (L1) caches when they are integrally packaged with the processor core on a single integrated chip.
A central processing unit in a multiprocessor system may also include additional caches, such as a cache that is referred to as a level 2 (L2) cache since it supports the on-board (L1) caches. In other words, an L2 cache acts as an intermediary between memory and the on-board caches, and can usually store a much larger amount of information (instructions and data) than the on-board caches can, but at the cost of a longer access time. For example, an L2 cache may be a chip having a storage capacity of 256 or 512 kilobytes, while the central processing unit may have on-board caches with 64 kilobytes of total storage. Although only a two-level cache hierarchy is discussed, multilevel cache hierarchies can be provided where there are many levels (L3, L4, etc.) of serially connected caches.
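The lookup path through such a hierarchy can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the names `CacheLevel` and `read`, the latency figures, the toy capacities counted in cache lines, and the least-recently-used replacement policy are hypothetical, and none of them is specified by the description above. The sketch only shows the general idea that a read tries the small, fast level first, falls back to the larger, slower level, and reaches memory last.

```python
from collections import OrderedDict


class CacheLevel:
    """One level of a serially connected cache hierarchy (hypothetical model)."""

    def __init__(self, name, capacity, latency):
        self.name = name          # label such as "L1" or "L2"
        self.capacity = capacity  # maximum number of lines held (toy value)
        self.latency = latency    # assumed access cost in arbitrary cycles
        self.lines = OrderedDict()  # address -> value, oldest entry first

    def lookup(self, address):
        """Return True on a hit and mark the line as most recently used."""
        if address in self.lines:
            self.lines.move_to_end(address)
            return True
        return False

    def fill(self, address, value):
        """Install a line, evicting the least recently used one if full."""
        self.lines[address] = value
        self.lines.move_to_end(address)
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)


def read(address, levels, memory, memory_latency=100):
    """Read through the hierarchy; return (value, total cost, where it hit)."""
    cost = 0
    for i, level in enumerate(levels):
        cost += level.latency
        if level.lookup(address):
            value = level.lines[address]
            # Copy the line into the faster levels that missed.
            for upper in levels[:i]:
                upper.fill(address, value)
            return value, cost, level.name
    # Missed in every cache level: pay the memory latency and fill all levels.
    cost += memory_latency
    value = memory[address]
    for level in levels:
        level.fill(address, value)
    return value, cost, "memory"
```

With a two-line L1 at latency 1 and a four-line L2 at latency 10, a first read of an address pays for both misses plus the memory access, a repeated read is served by L1, and a line that has been evicted from L1 but is still held in L2 is served at the intermediate cost.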
In a multiprocessor computer system, it is important to provide a coherent memory system, that is, to cause writes to each individual memory location to be serialized in some order for all central processing units. For example, assume a location in memory is modified by a sequence of write operations to take on the values 1, 2, 3, 4. In a cache-coherent system, all central processing units will observe the writes to a given location to take place in the order shown. However, it is possible for a central processing unit to miss observing a write to the memory location. A given central processing unit reading the memory location could see the sequence 1, 3, 4, missing the update to the value 2. What no central processing unit may observe is a sequence that contradicts the write order, such as 1, 3, 2, 4. A multiprocessor system that implements these properties is said to be “coherent.”
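The ordering property in this example can be expressed as a small check. The following is a minimal sketch, not a description of any hardware mechanism: the function name `is_coherent_observation` is hypothetical, and the sketch assumes that each value written to the location is distinct, as in the example, so that a value identifies its position in the write order. A reader may skip values or read the same value more than once, but may never step backward in the serialized order.

```python
def is_coherent_observation(write_order, observed):
    """Check one reader's observations of a single memory location.

    write_order: values in the order the writes were serialized
                 (assumed distinct for this illustration).
    observed:    values one processing unit saw on successive reads.
    """
    position = {value: i for i, value in enumerate(write_order)}
    last = -1
    for value in observed:
        if value not in position:
            return False  # saw a value that was never written
        if position[value] < last:
            return False  # saw an older write after a newer one
        last = position[value]
    return True
```

Under these assumptions, the observation 1, 3, 4 against the write order 1, 2, 3, 4 passes the check even though the value 2 was missed, while the observation 1, 3, 2, 4 fails it because the value 2 appears after the value 3.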