Prior art cache line replacement algorithms typically do not take into account the effect of an eviction of a cache line in one level of cache upon a corresponding cache line in another level of cache in a cache hierarchy. In inclusive cache systems containing multiple levels of cache within a cohesive cache hierarchy, however, a cache line evicted in an upper level cache, for example, can cause the corresponding cache line within a lower level cache to become invalidated or evicted, thereby causing a processor or processors using the evicted lower level cache line to incur performance penalties.
Inclusive cache hierarchies typically contain at least two levels of cache memory, wherein one of the cache memories (i.e., the “lower level” cache memory) holds a subset of the data contained in another cache memory (i.e., the “upper level” cache memory). Inclusive cache hierarchies are useful in microprocessor and computer system architectures, as they allow a smaller cache having a relatively fast access speed to contain frequently used data and a larger cache having a relatively slower access speed than the smaller cache to store less frequently used data. Inclusive cache hierarchies thereby attempt to balance the competing constraints of performance, power, and die size.
Because inclusive cache hierarchies store at least some common data, evictions of cache lines in one level of cache may necessitate the corresponding eviction of the line in another level of cache in order to maintain cache coherency between the upper level and lower level caches. Furthermore, typical caching techniques use state data to indicate the accessibility and/or validity of cache lines. One such set of state data includes information to indicate whether the data in a particular cache line is modified (“M”), exclusively owned (“E”), able to be shared among various agents (“S”), and/or invalid (“I”) (“MESI” states).
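By way of illustration only, the following minimal sketch shows how an eviction in an upper level cache may force the corresponding line in a lower level cache into the invalid state. The names `Mesi` and `back_invalidate`, and the modeling of a lower level cache as a dictionary from address to state, are assumptions made for this example and do not describe any particular implementation.

```python
from enum import Enum


class Mesi(Enum):
    """The four MESI states described above."""
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def back_invalidate(lower_cache, address):
    """Invalidate `address` in a lower level cache (hypothetical helper).

    `lower_cache` maps an address to its Mesi state. Returns True when the
    line was MODIFIED, meaning its data would need to be written back
    before the line is dropped.
    """
    state = lower_cache.get(address, Mesi.INVALID)
    needs_writeback = state is Mesi.MODIFIED
    if address in lower_cache:
        lower_cache[address] = Mesi.INVALID
    return needs_writeback


# The upper level cache evicts lines 0x40 and 0x80; the lower level cache
# must give up its copies to preserve inclusion.
l1 = {0x40: Mesi.MODIFIED, 0x80: Mesi.SHARED}
dirty_40 = back_invalidate(l1, 0x40)  # modified line: write-back required
dirty_80 = back_invalidate(l1, 0x80)  # shared line: no write-back required
```

In this sketch a modified line costs a write-back in addition to the invalidation itself, which is one reason the choice of victim in the upper level cache can matter.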
Efficient cache operation utilizes cache management techniques for replacing cache locations in the event of a cache miss. In a typical cache miss, the address and data fetched from the system or main memory are stored in cache memory. However, the cache needs to determine which cache location is to be replaced by the new address and data from system memory. One technique for replacing cache locations is implementing a protocol with least recently used (LRU) bits. Least recently used bits are stored for each cache location and are updated when the cache location is accessed or replaced. Valid bits indicate the coherency status of the respective cache location. Therefore, based on the values of the least recently used bits and the valid bits, the cache replaces a cache location that is not valid or, if all locations are valid, the location that the least recently used bits indicate is the least recently used. A variety of replacement protocols are utilized by cache memory, such as pseudo-LRU, random, and not recently used (NRU) protocols. However, the present replacement protocols may result in increased inter-cache traffic. For example, replacing a line from an inclusive last level cache requires the same line to be evicted from all the lower level caches, which increases inter-cache traffic.
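The victim selection described above may be sketched as follows, as an illustration only. The `Way` record, the age-counter encoding of the LRU bits (a larger `lru_age` meaning less recently used), and the helper names `choose_victim` and `touch` are assumptions made for this example; actual hardware commonly uses more compact encodings.

```python
from dataclasses import dataclass


@dataclass
class Way:
    tag: int = 0
    valid: bool = False
    lru_age: int = 0  # hypothetical encoding: larger means less recently used


def choose_victim(ways):
    """Return the index of the way to replace within one cache set."""
    # An invalid way is replaced first, since nothing useful is lost.
    for index, way in enumerate(ways):
        if not way.valid:
            return index
    # Otherwise replace the way whose LRU state marks it as oldest.
    return max(range(len(ways)), key=lambda i: ways[i].lru_age)


def touch(ways, index):
    """Update the LRU state when way `index` is accessed or replaced."""
    for way in ways:
        way.lru_age += 1
    ways[index].lru_age = 0


# A four-way set whose ways are accessed in the order 0, 1, 2, 3.
ways = [Way(tag=t, valid=True) for t in (1, 2, 3, 4)]
for i in range(4):
    touch(ways, i)
victim_all_valid = choose_victim(ways)     # way 0 is least recently used

ways[2].valid = False
victim_with_invalid = choose_victim(ways)  # invalid way 2 takes priority
```

Note that this selection considers only the state held in the cache performing the replacement; it has no knowledge of whether the chosen line is also present in another level of the hierarchy.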
Inter-cache traffic due to cache line evictions in upper level caches can be exacerbated in multi-core processors or multi-processor computer systems, in which multiple processing elements (cores or processors) share the same inclusive cache. FIG. 1 illustrates a typical prior art 2-level cache hierarchy, in which two lower level caches, such as level-1 (“L1”) caches corresponding to two processor cores, respectively, contain a subset of data stored in an upper level cache, such as a level-2 (“L2”) cache. Each line of each L1 cache of FIG. 1 typically contains MESI state data to indicate to requesting agents the availability/validity of data within a cache line. Cache data and MESI state information are maintained between the L1 caches and L2 cache via coherency information between the cache levels. FIG. 1 further illustrates an LRU replacement hierarchy to determine which cache way is to be replaced. In the prior art example of FIG. 1, the LRU chooses which way of the L2 cache to evict without regard to the coherency traffic the eviction may cause between the L2 cache and lower level caches, such as the L1 caches.
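The effect described above can be illustrated with a small simulation, offered as a simplified sketch rather than a model of FIG. 1. The class `InclusiveL2`, its counter `back_invalidations`, the unbounded L1 caches modeled as sets, and the assumption that L1 hits do not update the L2 LRU state are all hypothetical choices made for this example.

```python
from collections import OrderedDict


class InclusiveL2:
    """Shared inclusive L2 with plain LRU replacement (illustrative only)."""

    def __init__(self, capacity, l1_caches):
        self.capacity = capacity
        self.lines = OrderedDict()   # addresses, least recently used first
        self.l1_caches = l1_caches   # one set of addresses per core
        self.back_invalidations = 0  # invalidation messages sent to L1 caches

    def access(self, core, address):
        l1 = self.l1_caches[core]
        if address in l1:
            return  # L1 hit: assumed invisible to the L2 LRU state
        if address in self.lines:
            self.lines.move_to_end(address)  # L2 hit: mark most recently used
        else:
            if len(self.lines) >= self.capacity:
                self._evict_lru()
            self.lines[address] = None
        l1.add(address)

    def _evict_lru(self):
        victim, _ = self.lines.popitem(last=False)
        # Inclusion: every L1 copy of the victim must also be removed.
        for l1 in self.l1_caches:
            if victim in l1:
                l1.discard(victim)
                self.back_invalidations += 1
        return victim


A, B, C = 0x100, 0x140, 0x180
l1_core0, l1_core1 = set(), set()
l2 = InclusiveL2(capacity=2, l1_caches=[l1_core0, l1_core1])

l2.access(0, A)  # both cores bring line A into their L1 caches
l2.access(1, A)
l2.access(0, B)
l2.access(0, A)  # core 0 keeps using A, but only its L1 sees the hit
l2.access(0, C)  # L2 is full; LRU evicts A, which both L1 caches hold
```

In this run the LRU policy selects line A as the victim even though both cores hold it, so a single L2 eviction generates two invalidations, whereas evicting line B would have generated one.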
Accordingly, cache line eviction techniques that do not take into account the effect of a cache line eviction on traffic among lower level cache structures within the cache hierarchy can cause a processor or processors having access to the lower level cache to incur performance penalties.