1. Field of the Invention
The invention relates to cache memory systems used in computer systems, and more particularly to the replacement of items when new items must be added to the cache memory system.
2. Description of the Related Art
Personal computers continue to grow more powerful. System performance is already high, but still greater performance is always in demand. To this end, ever faster components are being used in computer systems. The development of the key component of the computer system, the microprocessor, has outpaced the development of the memory devices designed to work with it. Microprocessor cycle times are quite short, so either only very fast memory devices can be used or microprocessor operations must be slowed down, which decreases system performance. However, memory devices capable of operating at the required speeds are relatively small and expensive. It is therefore generally cost prohibitive to construct the entire main memory of the computer system from these fast memory devices, and performance suffers because of economics.
One approach to resolving this conflict has been the use of cache memory systems. In a cache memory system a small amount of fast memory is used in conjunction with a large amount of slower memory. The slower memory forms the main system memory, while the small, fast memory contains copies of portions of the data in the slower main memory. The cache memory generally contains recently used data, in the statistically based hope that the data will be reused soon. The data is then available directly from the fast cache memory, without the delay penalty incurred when accessing the slower main memory.
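The benefit described above can be illustrated with a minimal sketch of the average access time seen by the microprocessor. The function name, the latency figures, and the hit rates are hypothetical values chosen only for illustration, and the model assumes a hit costs the cache access time while a miss costs the main memory access time.

```python
def average_access_time(hit_rate, cache_ns, main_ns):
    """Average access time under a simple two-level model (an assumption).

    hit_rate: fraction of accesses served by the cache, from 0.0 to 1.0
    cache_ns: access time of the fast cache memory, in nanoseconds
    main_ns:  access time of the slower main memory, in nanoseconds
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0.0 and 1.0")
    return hit_rate * cache_ns + (1.0 - hit_rate) * main_ns


if __name__ == "__main__":
    # Hypothetical latencies: 10 ns cache, 100 ns main memory.
    for rate in (0.0, 0.5, 0.9, 1.0):
        print(rate, average_access_time(rate, 10.0, 100.0))
```

Under these assumed figures, the average access time approaches the cache access time as the hit rate rises, which is why keeping the data most likely to be reused in the cache matters.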
However, the cache memory is much smaller than the main memory, so some replacement policy is necessary. Some data must be removed from the cache to allow new data to be stored. The most widely preferred technique is the least recently used (LRU) technique. In that approach the least recently used of a series of locations is overwritten, thus keeping the newer data available for use. While this is a desirable goal, in practice it is quite difficult to implement in certain cases. Depending on the number of ways in a set associative cache design, the number of bits of memory required to implement true LRU can be quite high. Sufficient information must be kept to track the LRU way for each set in the cache. Additionally, the time needed to develop the LRU information must not delay any cycle, or else either performance will suffer or costs will increase.
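The storage cost mentioned above can be made concrete with a short sketch. A true LRU scheme must distinguish every possible recency ordering of the ways in a set, so at least ceil(log2(n!)) status bits per set are needed for n ways. The function and class names below are hypothetical, and the list-based set model is only an illustration of the replacement rule, not a hardware design.

```python
import math


def true_lru_min_bits(ways):
    """Minimum status bits per set needed to encode any ordering of `ways` ways."""
    orderings = math.factorial(ways)
    # Bits needed to number `orderings` distinct states: ceil(log2(orderings)).
    return (orderings - 1).bit_length()


class LRUSet:
    """One set of a set associative cache with true LRU replacement (a sketch)."""

    def __init__(self, ways):
        self.ways = ways
        self.tags = []  # least recently used tag first, most recent last

    def access(self, tag):
        """Return True on a hit, False on a miss; update the recency order."""
        if tag in self.tags:
            self.tags.remove(tag)   # a hit moves the tag to the most recent slot
            self.tags.append(tag)
            return True
        if len(self.tags) == self.ways:
            self.tags.pop(0)        # set is full: overwrite the LRU entry
        self.tags.append(tag)
        return False


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(n, "ways:", true_lru_min_bits(n), "bits per set")
```

The bit count grows faster than the number of ways, and it is multiplied by the number of sets in the cache, which is the cost that motivates simpler schemes.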
To resolve some of these problems, pseudo-LRU techniques have been developed. One example of a pseudo-LRU technique is that used in the Intel Corporation i486 microprocessor, which uses a four-way set associative cache architecture. Three bits are provided to determine, first, which half of the ways was least recently used and, second, which of the two ways in that half was least recently used. This is a pseudo-LRU technique because it does not properly reshuffle the order based on read hits to a particular way. It is possible for the least recently used way in a first half to remain unused for a longer period than both of the ways in the second half if the most recently used way in the first half is continually the subject of intervening read hits. Thus relatively stale data could remain in the cache, degrading cache system performance.
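The staleness scenario described above can be reproduced with a minimal sketch of a generic three-bit tree pseudo-LRU for one four-way set, compared against a true LRU ordering. The class and function names, the bit encoding, and the access sequence are illustrative assumptions and are not taken from the i486 design.

```python
class TreePLRU4:
    """Generic three-bit tree pseudo-LRU for one four-way set (a sketch)."""

    def __init__(self):
        self.half = 0   # 0: ways 0-1 hold the pseudo-LRU way, 1: ways 2-3 do
        self.low = 0    # LRU way within the first half: 0 -> way 0, 1 -> way 1
        self.high = 0   # LRU way within the second half: 0 -> way 2, 1 -> way 3

    def touch(self, way):
        """Record an access to `way` (0 to 3)."""
        if way < 2:
            self.half = 1            # the other half is now the older half
            self.low = 1 - way       # the sibling way is the LRU of this half
        else:
            self.half = 0
            self.high = 1 - (way - 2)

    def victim(self):
        """Return the way the three bits select for replacement."""
        return self.low if self.half == 0 else 2 + self.high


def true_lru_victim(accesses, ways=4):
    """Return the true least recently used way after the given accesses."""
    order = list(range(ways))        # least recently used first
    for way in accesses:
        order.remove(way)
        order.append(way)
    return order[0]


if __name__ == "__main__":
    # Way 1 is used once, then way 0 keeps taking hits between uses of ways 2 and 3.
    accesses = [1, 2, 0, 3, 0]
    plru = TreePLRU4()
    for way in accesses:
        plru.touch(way)
    print("pseudo-LRU victim:", plru.victim())
    print("true LRU victim:", true_lru_victim(accesses))
```

With this assumed sequence the three bits select way 2 for replacement even though way 1 has gone unused for longer, so the stale entry in the first half survives.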
The major reasons for employing pseudo-LRU techniques are the simplicity of the logic and the smaller amount of memory required for the LRU status information. The designer must make a tradeoff between performance loss and system complexity, and so pseudo-LRU techniques are often used. However, pseudo-LRU techniques fall increasingly short of true LRU as the total cache size gets smaller and the number of ways increases. In those cases true LRU techniques become more important, or major performance losses can occur.