1. Field of the Invention
This invention relates generally to multiple levels of cache. More particularly, this invention relates to identifying a cache line in a higher level cache that is a candidate for eviction in the higher level cache when a new cache line must be loaded into the higher level cache.
2. Description of the Related Art
Modern computer systems typically use multiple levels of cache. For example, a very fast, but relatively small, first level cache is typically implemented on a same semiconductor chip as a processor, and provides data to the processor within one or two processor cycles. The first level cache (L1 cache) is usually implemented using static random access memory (SRAM) that is very fast, but not as compact as larger, slower, memory. The first level cache must also be relatively small to limit the length of control, address, and signal interconnect. A second level cache (L2 cache) is often also implemented on the same semiconductor chip as the processor in modern computer systems. The second level cache is often also built using SRAM memory. The second level cache is typically larger than the first level cache both in physical area and in amount of data that is stored. The second level cache is typically slower to access (read or write) than the first level cache. Many modern computer systems also comprise a third level cache (L3 cache) that holds even more data than the second level cache and takes even longer to access. Often, the third level cache is implemented with dynamic random access memory (DRAM), although SRAM memory is also sometimes used for third level cache designs.
A cache stores data in blocks called cache lines. For example, in various computer systems, a cache line might be 64 bytes, 128 bytes, 256 bytes, and so on. A cache line is stored in a cache line location in a cache based upon an address of the cache line and replacement logic coupled to the cache. A cache directory coupled to the cache maintains state information and tag information for each cache line stored in the cache.
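The mapping of an address to a cache line location can be sketched as follows. The line size (128 bytes) and number of sets (1024) are illustrative assumptions chosen for the example, not parameters of any particular design; a real directory stores the tag portion to identify which line occupies a location.

```python
# Sketch: deriving the tag, set index, and byte offset from an address.
# LINE_SIZE and NUM_SETS are illustrative assumptions, not values from
# any particular cache design.

LINE_SIZE = 128   # bytes per cache line (assumed)
NUM_SETS = 1024   # number of sets in the cache (assumed)

def decompose(address: int) -> tuple[int, int, int]:
    """Split an address into (tag, set index, byte offset within the line)."""
    offset = address % LINE_SIZE          # which byte within the cache line
    line_number = address // LINE_SIZE    # which cache line the address belongs to
    set_index = line_number % NUM_SETS    # which set the line maps to
    tag = line_number // NUM_SETS         # tag kept in the cache directory
    return tag, set_index, offset
```

Two addresses within the same 128-byte block yield the same tag and set index, so they hit the same cache line.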
When a processor requests a piece of data at a particular address, the computer system checks if the data is stored in the first level cache. The particular address is presented to a first level cache directory which determines if the data is stored in the first level cache. If a cache line containing the piece of data exists in the first level cache, the data will be fetched from the first level cache for use by the processor; this is known as a cache hit in the first level cache. If the cache line containing the piece of data is not held in the first level cache, a cache miss is reported by the first level cache. A request is then made to a second level cache. If the second level cache holds the particular piece of data, the cache line containing the particular piece of data is fetched from the second level cache and stored in the first level cache. In many implementations, the particular piece of data is made available to the processor while the cache line containing the particular piece of data is being written into the first level cache. If the particular piece of data is not held in the second level cache, a request is made to a third level cache. If the particular piece of data is held in the third level cache, the cache line including the particular piece of data is fetched from the third level cache and stored in the second level cache and the first level cache and made available to the processor. If a cache miss occurs in the third level cache, a further request is made to a fourth level cache, if a fourth level cache exists, or to main memory.
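The lookup sequence described above can be modeled in a few lines. Each cache level is represented here as a simple mapping from line address to data; this dict-based model and the 128-byte line size are illustrative assumptions, not a description of real hardware.

```python
# Sketch of the multi-level lookup: check L1, then L2, then L3, then
# main memory, filling the lower levels on the way back. The structure
# is an illustrative model, not a hardware design.

LINE_SIZE = 128  # assumed line size in bytes

def load(address, l1, l2, l3, memory):
    """Return the data for address, filling lower-level caches on a miss."""
    line_addr = address - (address % LINE_SIZE)
    if line_addr in l1:                  # cache hit in the first level cache
        return l1[line_addr]
    if line_addr in l2:                  # L1 miss, L2 hit: fill L1
        l1[line_addr] = l2[line_addr]
        return l1[line_addr]
    if line_addr in l3:                  # L2 miss, L3 hit: fill L2 and L1
        l2[line_addr] = l3[line_addr]
        l1[line_addr] = l3[line_addr]
        return l1[line_addr]
    data = memory[line_addr]             # miss at all levels: go to main memory
    l3[line_addr] = data
    l2[line_addr] = data
    l1[line_addr] = data
    return data
```

After a miss is serviced, a subsequent request to any address in the same line hits in the first level cache.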
Since a lower level of cache holds less data than a higher level of cache, a number of cache line positions in the higher level of cache map to fewer cache line positions in the lower level of cache. In modern computer systems, a cache is typically designed with associativity. Associativity means that a particular cache line maps to a particular set (row) in a cache, but replacement logic supporting the cache can place the particular cache line in any of a number of classes (cache line locations) in the set. A particular class in a particular set is a cache line position. For example, for a four-way associative second level cache, the replacement logic chooses into which of four classes to store a particular cache line that maps to a particular set.
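Set-associative placement can be sketched as follows, assuming an illustrative four-way cache with 256 sets: a line maps to exactly one set, but the replacement logic may place it in any of the four classes (ways) of that set.

```python
# Sketch of four-way set associativity. A line's address determines its
# set; any free class (way) within the set may hold it. Parameters are
# illustrative assumptions.

NUM_SETS = 256
NUM_WAYS = 4   # four-way associative (assumed)

# Each set is a list of up to NUM_WAYS (tag, data) entries.
cache = [[] for _ in range(NUM_SETS)]

def place(line_number, data):
    """Place a line in its set if a free class exists; else signal a full set."""
    set_index = line_number % NUM_SETS
    tag = line_number // NUM_SETS
    ways = cache[set_index]
    if len(ways) < NUM_WAYS:
        ways.append((tag, data))  # a free class (way) is available
        return True
    return False                  # set is full: replacement logic must evict
```

Once four lines mapping to the same set have been placed, a fifth such line cannot be stored until the replacement logic evicts one of the four.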
When a new cache line is written into a cache from a higher level cache, an existing cache line must be evicted to make room: it is written back to a higher level cache if its data has been modified, or simply overwritten if it has not.
In a cache with associativity, a replacement algorithm chooses which cache line in a set is replaced. For example, if a cache is eight-way associative, that is, has eight classes per set, one cache line out of eight must be evicted to make room for a new cache line that has an address that maps to the set.
A number of replacement algorithms have been implemented in various computer systems. Least Recently Used (LRU) algorithms have had wide usage, based on the notion that a cache line used more recently is more likely to be needed again than a cache line used less recently. A problem with the LRU algorithm is that a particular cache line can appear to be unused for a relatively long period of time for either of two reasons. A first reason is that a processor no longer needs data in the particular cache line and has loaded another cache line, overwriting the particular cache line. A second reason is that a processor is frequently using data in the particular cache line but has not updated the higher level cache for some time. If the particular cache line appears to the higher level cache to be a candidate for eviction based on an LRU algorithm, but data in the particular cache line is being frequently used as explained in the second reason, inefficiencies will occur when the higher level cache evicts the particular cache line, since eviction from the higher level cache also entails eviction from the lower level cache. Because data in the particular cache line is being frequently used, the processor will simply have to request the cache line again and wait until the cache line is retrieved from a level higher than the higher level cache.
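LRU victim selection within one set can be sketched as below. The list-based recency order is an illustrative model (real hardware typically keeps compact LRU state bits); the sketch also makes the pitfall above concrete: if the lower level cache services hits without notifying the higher level cache, `touch` is never called for a hot line, so that line drifts toward the least recently used position and becomes the eviction victim.

```python
# Sketch of LRU victim selection for one set. The ordered list is an
# illustrative model of LRU state, not a hardware implementation.

class LRUSet:
    def __init__(self, num_ways):
        self.num_ways = num_ways
        self.order = []   # way indices, least recently used first

    def touch(self, way):
        """Record a use of the given way, moving it to most recently used."""
        if way in self.order:
            self.order.remove(way)
        self.order.append(way)

    def victim(self):
        """Choose the least recently used way as the eviction candidate."""
        return self.order[0]
```

A way that is never "touched" again at this cache level, even if the processor is using it constantly through a lower level cache, is the first to be evicted.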
Because of the problems with LRU explained above, many computer systems having multiple levels of cache have implemented a pseudo random eviction algorithm in the higher level cache, in effect admitting that the higher level cache does not know which cache line in a set is a preferred candidate for eviction, and simply picking one cache line at random from the set. Unfortunately, the pseudo random eviction algorithm also often evicts cache lines that are being frequently used by the processor, again causing the processor to wait while evicted cache lines are fetched from memory at a level higher than the higher level cache.
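Pseudo-random victim selection is often implemented in hardware with a linear feedback shift register (LFSR). The sketch below uses a well-known 16-bit maximal-length LFSR; the width, polynomial, and seed are illustrative assumptions.

```python
# Sketch of pseudo-random way selection using a 16-bit LFSR
# (polynomial x^16 + x^14 + x^13 + x^11 + 1). Illustrative only.

class RandomReplacer:
    def __init__(self, num_ways, seed=0xACE1):
        self.num_ways = num_ways
        self.state = seed   # 16-bit LFSR state; must be nonzero

    def victim(self):
        """Advance the LFSR and derive a way index from its state."""
        bit = ((self.state >> 0) ^ (self.state >> 2) ^
               (self.state >> 3) ^ (self.state >> 5)) & 1
        self.state = ((self.state >> 1) | (bit << 15)) & 0xFFFF
        return self.state % self.num_ways
```

Because the choice ignores usage entirely, a frequently used way is exactly as likely to be evicted as a dead one, which is the drawback noted above.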
Therefore, there is a need for a method and apparatus that provides for an improved eviction scheme in a higher level cache.