Field of the Invention
The field of the invention is data processing, or, more specifically, methods, apparatus, and products for main memory operations in a symmetric multiprocessing (‘SMP’) computer.
Description of Related Art
Contemporary high performance computer systems are typically implemented as SMP computers. SMP is a multiprocessor computer hardware architecture where two or more, often many more, identical processors are connected to a single shared main memory and controlled by a single operating system. Most multiprocessor systems today use an SMP architecture. In the case of multi-core processors, the SMP architecture applies to the cores, treating them as separate processors. Processors may be interconnected using buses, crossbar switches, mesh networks, and the like. In addition to shared main memory access, each processor also accelerates memory access with cache memory. Cache architectures are typically multi-level. Caches can be local to each processor, shared across more than one processor, or even shared across compute nodes in a multi-node architecture.
Traditional multi-level cache architectures are configured so that requests are forwarded from one level of cache to the next, busying system resources as they traverse the hierarchy for the duration of any memory operation. This configuration exists for two main reasons: (1) simplification of system interlocks and protocols and (2) simplification of hardware design and implementation. The traditional approach to request handling has been acceptable for normal processor fetch type operations, because request completion follows the data movement and the number of fetches initiated by all cores is bounded by the number of L1/L2 fetch state machines. For high bandwidth fetch operations that ultimately require main memory access, however, this extra delay in interlock response times and resource availability reduces the overall throughput capability of the system. Moreover, as the latency from processor to main storage has increased from generation to generation, while the number of intervening cache levels and the number of resources within each cache level have remained relatively constant (on a per processor basis), the cascading effects of request response time on resource availability start to become problematic for memory operations that require main memory access.
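The relationship between resource hold time and throughput described above can be illustrated with a minimal sketch based on Little's law. The function names, the state machine count, the cache line size, and the latencies below are hypothetical values chosen for illustration only; none of them is taken from any particular system.

```python
def max_fetch_throughput(num_state_machines: int, hold_time_ns: float) -> float:
    """Upper bound on completed fetches per nanosecond.

    By Little's law, if each fetch occupies one state machine for
    hold_time_ns, at most num_state_machines fetches can be in flight,
    so the completion rate cannot exceed num_state_machines / hold_time_ns.
    """
    if num_state_machines <= 0 or hold_time_ns <= 0:
        raise ValueError("state machine count and hold time must be positive")
    return num_state_machines / hold_time_ns


def max_bandwidth_gb_per_s(num_state_machines: int,
                           hold_time_ns: float,
                           line_bytes: int) -> float:
    """Upper bound on fetch bandwidth; bytes per nanosecond equals GB/s."""
    return max_fetch_throughput(num_state_machines, hold_time_ns) * line_bytes


if __name__ == "__main__":
    # Hypothetical configuration: 8 fetch state machines, 256-byte lines.
    for latency_ns in (100.0, 200.0):
        bound = max_bandwidth_gb_per_s(8, latency_ns, 256)
        print(f"hold time {latency_ns:.0f} ns -> at most {bound:.2f} GB/s")
```

Under these assumptions, doubling the time that each state machine is held halves the achievable bandwidth when the number of state machines stays fixed, which is the cascading effect the preceding paragraph describes.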