1. Field of the Invention
The present invention relates to the design of cache memories within computer systems. More specifically, the present invention relates to a method and apparatus for mapping memory addresses to corresponding cache entries.
2. Related Art
As processor clock speeds continue to increase at an exponential rate, computer system designers are under increasing pressure to achieve shorter propagation delays through computational circuitry. Unfortunately, computational performance depends not only on processor speed, but also on other time-consuming operations. In particular, memory access delays can create a severe bottleneck as processor clock speeds reach beyond multiple gigahertz.
A common technique to alleviate the memory access bottleneck is to store frequently accessed data items in a high-speed cache memory. However, a cache memory cannot be made infinitely large because of chip area and timing constraints. Hence, a computer system's main memory typically needs to be mapped into a much smaller cache memory. When the requested data is found in the cache, a cache hit occurs and much time is saved. Otherwise, a cache miss occurs and the processor stalls to wait for a memory access.
In designing a cache memory, one challenge is to minimize cache misses and to achieve a more uniform cache-access pattern. Cache misses largely depend on an application's memory access patterns and on how memory is mapped to the cache. Conventional cache mapping schemes use a portion of the memory address referred to as the “index bits” to directly index cache entries. Hence, memory locations with addresses containing the same index bits are always mapped to the same cache entry (or set of cache entries).
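The conventional index-bit mapping described above can be illustrated with a minimal sketch. The cache geometry (64-byte lines, 1024 sets) and the names `LINE_SIZE`, `NUM_SETS`, and `split_address` are assumptions chosen for illustration only; they are not taken from any particular design.

```python
# Hypothetical cache geometry, assumed for illustration only.
LINE_SIZE = 64      # bytes per cache line (power of two)
NUM_SETS = 1024     # number of cache entries or sets (power of two)

OFFSET_BITS = LINE_SIZE.bit_length() - 1   # low bits selecting a byte in the line
INDEX_BITS = NUM_SETS.bit_length() - 1     # next bits selecting the cache entry


def split_address(addr):
    """Split an address into (tag, index, offset) as a conventional cache would."""
    offset = addr & (LINE_SIZE - 1)
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


# Two addresses that differ only above the index bits share one cache entry.
a = 0x12345
b = a + LINE_SIZE * NUM_SETS
tag_a, index_a, _ = split_address(a)
tag_b, index_b, _ = split_address(b)
```

In this sketch, `a` and `b` have identical index bits and different tags, so in a direct-mapped cache each access to one would evict the other.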
Existing cache mapping schemes are far from perfect. It has been observed that frequently accessed data items are often mapped to the same cache entries, which causes frequent cache line evictions and cache misses. One example is frequently accessed page headers that reside in the first bytes of each page, and which typically get mapped into the same cache entries. Another example is repeating data structures used by trees, hashes, and linked lists, which also tend to be mapped to the same cache entries. As a result, some cache entries suffer more misses than others do. To mitigate the non-uniform distribution of cache misses, a variety of solutions have been attempted: larger block sizes, higher associativity, victim caches, prefetching (hardware and software), and compiler optimizations. However, these solutions each have trade-offs in increased cost and complexity. Great ingenuity, expense, and effort have gone into solving the non-uniform cache miss problem; nevertheless, the results have so far been unsatisfactory.
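The page-header example above can be made concrete with a short sketch. All parameters here (8 KB pages, 64-byte lines, 512 sets, 1000 pages) and the helper `cache_index` are hypothetical values assumed for illustration; real systems may differ.

```python
from collections import Counter

# Hypothetical parameters, assumed for illustration only.
LINE_SIZE = 64       # bytes per cache line
NUM_SETS = 512       # number of cache entries (sets)
PAGE_SIZE = 8192     # bytes per page; headers sit at the start of each page
NUM_PAGES = 1000


def cache_index(addr):
    """Conventional mapping: the index bits directly select the cache entry."""
    return (addr // LINE_SIZE) % NUM_SETS


# Count how many page headers land in each cache entry.
header_sets = Counter(cache_index(page * PAGE_SIZE) for page in range(NUM_PAGES))
sets_used = len(header_sets)

print(f"{NUM_PAGES} page headers occupy {sets_used} of {NUM_SETS} cache entries")
```

Under these assumptions, consecutive page-aligned addresses advance the index by 128 entries each, so the 1000 headers cycle through only four cache entries while the remaining entries receive none of them.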
Hence, what is needed is a method and an apparatus for mapping memory addresses to cache entries in a manner that minimizes cache misses and generates a more uniform distribution of cache misses.