1. Field of the Invention
The invention relates to processor caches and more particularly to determining which data in a cache to overwrite or write back in the event of a cache miss.
2. Discussion of Related Art
Processors have attained widespread use throughout many industries. A goal of any processor is to process information quickly. One technique which is used to increase the speed with which the processor processes information is to provide the processor with an architecture which includes a fast local memory called a cache. Another technique which is used to increase the speed with which the processor processes information is to provide a processor architecture with multiple processing units.
A cache is used by the processor to temporarily store instructions and data. A cache which stores both instructions and data is referred to as a unified cache; a cache which stores only instructions is an instruction cache, and a cache which stores only data is a data cache. Providing a processor architecture with either a unified cache or an instruction cache and a data cache is a matter of design choice.
A factor in the performance of the processor is the probability that a processor-requested data item is already in the cache. When a processor attempts to access an item of information, it is either present in the cache or not. If present, a cache "hit" occurs. If the item is not in the cache when requested by the processor, a cache "miss" occurs. It is desirable when designing a cache system to achieve a high cache hit rate, or "hit ratio".
After a cache miss occurs, the information requested by the processor must then be retrieved from memory and brought into the cache so that it may be accessed by the processor. A search for an item of information that is not stored in the cache after a cache miss usually results in an expensive and time-consuming effort to retrieve the item of information from the main memory of the system. To maximize the number of cache hits, data that is likely to be referenced in the near future operation of the processor is stored in the cache. Two common strategies for maximizing cache hits are storing the most recently referenced data and storing the most commonly referenced data.
In most existing systems, a cache is subdivided into sets of cache line slots. When each set contains only one line, then each main memory line can only be stored in one specific line slot in the cache. This is called direct mapping. In contrast, each set in most modern processors contains a number of lines. Because each set contains several lines, a main memory line mapped to a given set may be stored in any of the lines, or "ways", in the set.
When a cache miss occurs, the line of memory containing the missing item is loaded into the cache, replacing another cache line. This process is called cache replacement. In a direct mapping system, each line from main memory is restricted to be placed in a single line slot in the cache. This direct mapping approach simplifies the cache replacement process, but tends to limit the hit ratio due to the lack of flexibility with line mapping. In contrast, flexibility of line mapping, and therefore a higher hit ratio, can be achieved by increasing the level of associativity. Increased associativity means that the number of lines per set is increased so that each line in main memory can be placed in any of the line slots ("ways") within the set. During cache replacement, one of the lines in the set must be replaced. The method for deciding which line in the set is to be replaced after a cache miss is called a cache replacement policy.
Several conventional cache replacement policies for selecting a datum in the cache to overwrite include Random, Least-Recently Used (LRU), Pseudo-LRU, and Not-Most-Recently-Used (NMRU). Random is the simplest cache replacement policy to implement, since the line to be replaced in the set is chosen at random. The LRU method is more complex, as it requires a logic circuit to keep track of actual access of each line in the set by the processor. According to the LRU algorithm, if a line has not been accessed recently, chances are that it will not be accessed any more, and therefore it is a good candidate for replacement. Another replacement policy, NMRU, keeps track of the most recently accessed line. This most recently accessed line is not chosen for replacement, since the principle of spatial locality says that there is a high probability that, once an information item is accessed, other nearby items in the same line will be accessed in the near future. The NMRU method requires a logic circuit to keep track of the most recently accessed line within a set. In all cache replacement policies, the line selected for replacement may be referred to as a "candidate".
Once a candidate is selected, further processing must occur in the cache in order to ensure the preservation of memory coherency. If the contents of the candidate has been altered in the cache since it was retrieved from memory, then the candidate is "dirty" and a memory incoherency exists. Before the contents of the dirty candidate can be replaced with the new information requested by the processor, the current contents of the dirty candidate must be updated to memory. This operation is called a "write back" operation. While the implementation of such a scheme allows reduced bus traffic because multiple changes to a cache line need be loaded into memory only when the cache line is about to be replaced, a drawback to the write back operation is delay. That is, access to the cache is slowed or even halted during a write back operation.