1. Field of the Invention
The present invention relates to the field of caches. More particularly, the present invention relates to circuitry and a corresponding method for computing cache indexes of cache memory in order to avoid conflict misses.
2. Description of Related Art
It is well known that the performance of a microprocessor system is influenced by the access time of system memory. Although the speed of semiconductor memories has improved over the last few years, the speed of main memory, comprising dynamic random access memory ("DRAM") devices, has not kept pace with the speed of processors. Consequently, when executing most applications, a processor experiences a number of "wait" states while main memory is being accessed to complete a read or write instruction.
In order to reduce the number of "wait" states, one typical solution is to employ a cache, including high-speed memory such as static random access memory ("SRAM") devices, within the microprocessor system. The cache may be implemented as a first level cache ("L1 cache"), a second level cache ("L2 cache") or any other well-known implementation. Substantially smaller in size (i.e., number of bytes) than main memory, the cache is designed to improve overall system performance by providing faster access to data predicted to be the most commonly used.
A large percentage of caches perform mapping in accordance with a well-known indexing technique called "Bit-Selection". Depending on whether the cache is a "physical cache" or a "virtual cache", Bit-Selection produces a "set" address, otherwise known as a cache index, for indexing the cache memory by masking a selected number of bits of a physical address or by translating a virtual address, respectively. Once calculated, the set address is used to directly access the cache line of cache memory corresponding to that set address.
To better illustrate Bit-Selection, consider a microprocessor system providing 32-bit address lines "A[31:0]", 64 megabytes (MBytes) of main memory divided into 32-byte blocks, and a physical cache including 1 MByte of cache memory divided into 32-byte cache lines. In such a system, 27 bits of the 32-bit virtual address are translated to 21 bits of the physical address. Of these 21 bits of the physical address, 6 bits would be masked to provide a 15-bit set address for uniquely addressing each cache line of the cache. Preferably, the masking is performed on the most significant bits which, for this example, are bits A25-A20 of the physical address (64 MBytes of main memory being spanned by bits A25-A0, of which bits A4-A0 select a byte within a 32-byte block).
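The arithmetic of the preceding example can be checked with a short sketch. The following Python fragment is illustrative only and is not part of the described circuitry; the names `bit_selection_index`, `OFFSET_BITS`, `INDEX_BITS` and `BLOCK_BITS` are hypothetical, and the sizes are those assumed in the example above.

```python
# Illustrative sketch of Bit-Selection for the example above.
# All names are hypothetical; sizes come from the example in the text.
LINE_BYTES = 32              # 32-byte cache lines / memory blocks
CACHE_BYTES = 1 << 20        # 1 MByte of cache memory
MEMORY_BYTES = 64 << 20      # 64 MBytes of main memory

OFFSET_BITS = LINE_BYTES.bit_length() - 1                    # byte-in-line bits
INDEX_BITS = (CACHE_BYTES // LINE_BYTES).bit_length() - 1    # set address width
BLOCK_BITS = (MEMORY_BYTES // LINE_BYTES).bit_length() - 1   # block address width


def bit_selection_index(physical_address: int) -> int:
    """Drop the byte offset, then mask off the upper block-address bits."""
    block_address = physical_address >> OFFSET_BITS
    return block_address & ((1 << INDEX_BITS) - 1)


if __name__ == "__main__":
    print(OFFSET_BITS, INDEX_BITS, BLOCK_BITS, BLOCK_BITS - INDEX_BITS)
    # Two addresses exactly one cache size apart share a set address.
    print(bit_selection_index(0x0000020), bit_selection_index(0x0100020))
```

Under these assumptions the block address is 21 bits wide, the set address is 15 bits wide, and the 6 masked bits are the difference between the two. Any two physical addresses that differ only in the masked bits select the same cache line.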
Although the Bit-Selection technique provides fast indexing of the cache memory due to its simplicity, it has a number of disadvantages. One disadvantage is that the Bit-Selection technique is quite susceptible to conflict misses. One such case arises when a software application program runs multiple sub-routines in succession and these sub-routines access similar parts of different pages in memory. If the program has more pages than can be concurrently stored within the cache, there exists an increased likelihood of a conflict miss.
It is contemplated that the majority of conflict misses arise from successive memory accesses separated by a common "stride". A "stride" is a measure of distance in bytes, words or any other unit between successive memory accesses. Normally, the stride is a power of two and remains generally constant. Thus, for a cache of size "Q", if the stride "R" and the cache size share a common factor (i.e., greatest common divisor) that is a large fraction of the cache size, a conflict miss causing eviction of the cache line would likely occur every "Q/R" memory accesses. For example, for a 1 MByte cache with a cache line of 32 bytes and a constant stride of 8 KBytes, the cache line would be evicted every 128 memory accesses (1 MByte divided by 8 KBytes). As another example, if the stride is equal to a multiple of the cache size, a conflict miss is guaranteed to occur on every memory access after the first memory access.
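The "Q/R" relationship can be illustrated with a small simulation. The sketch below assumes a direct-mapped cache indexed by Bit-Selection with the sizes used in the example; the function names `set_index` and `accesses_until_conflict` are hypothetical and serve only to illustrate the point.

```python
# Illustrative sketch: count the strided accesses made before a set
# address repeats, assuming a direct-mapped, Bit-Selection-indexed cache.
LINE_BYTES = 32
CACHE_BYTES = 1 << 20
NUM_SETS = CACHE_BYTES // LINE_BYTES


def set_index(address: int) -> int:
    """Bit-Selection: block address modulo the number of cache lines."""
    return (address // LINE_BYTES) % NUM_SETS


def accesses_until_conflict(stride_bytes: int, base: int = 0) -> int:
    """Return how many accesses occur before one maps to an occupied set."""
    seen = set()
    address = base
    count = 0
    while True:
        index = set_index(address)
        if index in seen:          # this access evicts an earlier line
            return count
        seen.add(index)
        address += stride_bytes
        count += 1


if __name__ == "__main__":
    print(accesses_until_conflict(8 * 1024))      # 8 KByte stride
    print(accesses_until_conflict(CACHE_BYTES))   # stride equal to cache size
```

With an 8 KByte stride the simulation reports 128 accesses before the first conflict, matching Q/R. With a stride equal to the cache size it reports 1, i.e., every access after the first conflicts with its predecessor.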
Another disadvantage is that the Bit-Selection technique does not encourage uniformity in accessing the cache lines of cache memory, i.e., accessing each cache line with similar regularity. Such non-uniform accessing increases the overall rate of cache collisions, thereby reducing system performance because the cache memory has to be reloaded more often. As a result, it would be advantageous to access each cache line with greater uniformity.
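The non-uniformity can be made concrete by tallying how often each set address is selected under a strided access pattern. The following sketch is a rough illustration under the same assumed cache geometry; `set_usage` is a hypothetical helper, not part of any described apparatus.

```python
# Illustrative sketch: tally set-address usage for a constant-stride
# access pattern under Bit-Selection (assumed sizes from the example).
from collections import Counter


def set_usage(stride_bytes: int, num_accesses: int,
              line_bytes: int = 32, cache_bytes: int = 1 << 20) -> Counter:
    """Map each selected set address to the number of times it is accessed."""
    num_sets = cache_bytes // line_bytes
    return Counter(
        (k * stride_bytes // line_bytes) % num_sets
        for k in range(num_accesses)
    )


if __name__ == "__main__":
    usage = set_usage(stride_bytes=8 * 1024, num_accesses=4096)
    total_sets = (1 << 20) // 32
    print(len(usage), "of", total_sets, "cache lines are ever selected")
    print("accesses per selected line:", set(usage.values()))
```

For an 8 KByte stride, only 128 of the 32,768 cache lines are ever selected, and each of those is accessed 32 times over 4,096 accesses, while the remaining lines are never used.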