The present invention relates generally to processing within a computing environment, and more specifically, to a computing system with a multilevel cache hierarchy.
A cache is generally a memory that stores copies of data from the most frequently used system memory locations so that future requests for that data may be served faster. A multiprocessor computing system includes multiple processing units that are coupled to one another and share a system memory. In order to reduce access latency to data and instructions residing in the system memory, each processing unit may be supplied with a multilevel cache hierarchy. For example, a level one (L1) cache may have a lower access latency than a level two (L2) cache, the L2 cache may have a lower access latency than a level three (L3) cache, and the L3 cache may have a lower access latency than a level four (L4) cache. Cache operations in a multilevel cache hierarchy are controlled by a cache controller. Within a cache, data are organized and tracked on a cache line basis, where a typical cache line contains a fixed number of bytes, for example, 256 bytes. Each level of cache has an associated directory that keeps track of which cache lines are stored in that cache.
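For illustration only, the directory lookup described above can be sketched as follows. This is a minimal, hypothetical model (the set count, the direct-mapped organization, and all names are assumptions, not taken from the specification); it shows how a 256-byte-line address is split into a set index and a tag, and how the directory decides hit versus miss.

```python
# Hypothetical sketch of a per-level cache directory (not from the
# specification): direct-mapped, 256-byte lines, assumed set count.

LINE_SIZE = 256        # bytes per cache line, as in the example above
NUM_SETS = 1024        # hypothetical number of directory entries

def line_address(addr):
    """Strip the byte offset within the 256-byte line."""
    return addr // LINE_SIZE

class CacheDirectory:
    def __init__(self):
        self.tags = {}     # set index -> tag of the line currently held

    def lookup(self, addr):
        """Return True on a hit: the directory holds this line's tag."""
        line = line_address(addr)
        idx, tag = line % NUM_SETS, line // NUM_SETS
        return self.tags.get(idx) == tag

    def install(self, addr):
        """Record that this address's cache line is now resident."""
        line = line_address(addr)
        self.tags[line % NUM_SETS] = line // NUM_SETS

d = CacheDirectory()
d.install(0x12345)
assert d.lookup(0x12300)       # same 256-byte line -> hit
assert not d.lookup(0x99999)   # line not in directory -> miss
```

Two addresses within the same 256-byte line map to the same directory entry, so the second lookup above hits; an address whose line was never installed misses, which is the condition that triggers the fetch operations discussed next.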
In the event a cache miss occurs in a multiprocessor system, the cache controller initiates a fetch operation to acquire the requested cache line. A cache miss occurs when a request for a particular line of data triggers a search of the associated directory and the requested cache line is not present. In one approach to obtaining the requested cache line, a fetch operation for the missing cache line may be launched simultaneously to other caches or nodes as well as to the system memory. The latency of a fetch to another cache is generally considerably less than that of a fetch to the system memory. Thus, launching fetches to both the system memory and other caches improves latency, but the launch utilizes both the inter-nodal busses and the system memory access busses.
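The broadcast approach above can be sketched with a hypothetical latency model. The cycle counts and names below are illustrative assumptions, not figures from the specification; the point is that the request completes at the earliest responder, but both bus classes are occupied on every miss.

```python
# Hypothetical latency model for the simultaneous ("broadcast") fetch:
# both fetches launch at once, so completion time is the earlier of the
# two, but both busses are engaged. Cycle counts are assumed.

CACHE_FETCH_LATENCY = 50    # cycles, fetch from a remote cache (assumed)
MEMORY_FETCH_LATENCY = 300  # cycles, fetch from system memory (assumed)

def broadcast_fetch(line_in_remote_cache):
    """Launch both fetches at once; take whichever returns first."""
    busses_used = {"inter_nodal", "memory_access"}   # both occupied
    if line_in_remote_cache:
        return CACHE_FETCH_LATENCY, busses_used
    return MEMORY_FETCH_LATENCY, busses_used

latency, busses = broadcast_fetch(line_in_remote_cache=True)
assert latency == 50                                # early cache response
assert busses == {"inter_nodal", "memory_access"}   # yet both busses used
```

Even when a remote cache supplies the line quickly, the memory access busses and their associated buffers were committed, which is the resource cost the next approach tries to avoid.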
In another approach to obtaining the requested cache line, the cache controller may first initiate a fetch request only to the other caches, which reduces unnecessary usage of the buffers and control logic needed for system memory fetches. However, the cache controller is unable to determine ahead of time whether a fetch to the caches or nodes will be successful. Thus, the cache controller must wait to determine whether the fetch is successful before initiating a fetch operation to the system memory, which increases latency.
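The cache-first approach can be sketched with the same hypothetical latency model (the cycle counts and names remain illustrative assumptions). It shows the trade-off stated above: a hit in a peer cache never engages the memory busses, but a miss everywhere pays the two latencies back to back because the memory fetch launches only after the cache probe fails.

```python
# Hypothetical model of the cache-first ("sequential") fetch: the
# controller probes the other caches first and launches to memory only
# after that probe fails. Cycle counts are assumed for illustration.

CACHE_FETCH_LATENCY = 50    # cycles, remote-cache fetch (assumed)
MEMORY_FETCH_LATENCY = 300  # cycles, system-memory fetch (assumed)

def sequential_fetch(line_in_remote_cache):
    if line_in_remote_cache:
        # Hit in a peer cache: memory busses/buffers were never engaged.
        return CACHE_FETCH_LATENCY, {"inter_nodal"}
    # Probe failed; only now is the memory fetch launched, so the
    # latencies add rather than overlap.
    return (CACHE_FETCH_LATENCY + MEMORY_FETCH_LATENCY,
            {"inter_nodal", "memory_access"})

assert sequential_fetch(True) == (50, {"inter_nodal"})
assert sequential_fetch(False) == (350, {"inter_nodal", "memory_access"})
```

Compared with a simultaneous launch, the hit case here saves the memory-side resources entirely, while the miss case is slower by the full cache-probe latency, which is the added delay the paragraph above describes.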