High-performance cache memories are widely used in computer systems to couple high-speed processors to slower memory systems. Cache memories typically serve as high-speed buffers that hold a subset of the data from the computer system memories that is temporarily required by the processors. High-performance cache memories dissipate significant dynamic energy due to the charging and discharging of highly capacitive bit lines and sense amplifiers. As a result, caches account for a significant portion of the overall power consumption in an integrated circuit (IC) device employing such caches.
To achieve low miss rates when running typical applications, modern processors often employ set-associative caches rather than direct-mapped caches. In contrast to direct-mapped caches, set-associative cache implementations provide more than one location in which to temporarily store a given item of data from the system memory. While more flexible placement of data within the set-associative cache generally results in lower miss rates and improved system performance, it also increases the number of potential locations that must be searched in order to locate the requested data. Consequently, since the number of sense amplifiers that are enabled at any given time is increased, the overall power consumption of the IC device is increased accordingly.
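By way of illustration only, the following minimal sketch shows how a byte address may be divided into a tag, a set index, and a byte offset, and how the number of candidate locations grows with associativity. The class name, field names, cache size, and line size are hypothetical assumptions chosen for the example and are not drawn from any particular implementation.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CacheGeometry:
    """Hypothetical cache geometry; every field is an illustrative assumption."""

    size_bytes: int
    line_bytes: int
    ways: int  # ways == 1 models a direct-mapped cache

    @property
    def num_sets(self) -> int:
        # Each set holds one line per way.
        return self.size_bytes // (self.line_bytes * self.ways)

    def split(self, address: int) -> tuple[int, int, int]:
        """Return (tag, set_index, byte_offset) for a byte address."""
        offset = address % self.line_bytes
        line_number = address // self.line_bytes
        set_index = line_number % self.num_sets
        tag = line_number // self.num_sets
        return tag, set_index, offset

    def candidate_locations(self, address: int) -> list[tuple[int, int]]:
        """List every (set_index, way) pair that could hold the address."""
        _, set_index, _ = self.split(address)
        return [(set_index, way) for way in range(self.ways)]


# Assumed example sizes: 32 KiB cache with 64-byte lines.
direct_mapped = CacheGeometry(size_bytes=32 * 1024, line_bytes=64, ways=1)
four_way = CacheGeometry(size_bytes=32 * 1024, line_bytes=64, ways=4)
address = 0x12345

print(direct_mapped.candidate_locations(address))  # one location to search
print(four_way.candidate_locations(address))       # four locations to search
```

In this sketch the direct-mapped geometry yields a single candidate location per address, whereas the four-way geometry yields four, each of which would need to be examined on a lookup.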
Many set-associative cache implementations achieve low latency by probing all of the data ways concurrently with the tag lookup. Since the output of only one of the ways, namely, the matching way, is ultimately used, energy spent accessing the other way(s) is wasted. Eliminating the wasted energy by retrieving the data after the tag lookup substantially increases cache latency and is therefore an unacceptable approach for many high-performance cache implementations.
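The tradeoff described above may be illustrated with a deliberately simplified cost model. The function name, the scheme labels, and all delay and energy values below are hypothetical, expressed in arbitrary units, and are not measurements of any actual cache.

```python
def access_cost(ways, scheme, tag_delay=1, data_delay=1, tag_energy=1, way_energy=5):
    """Return (latency, energy) for one cache hit under a toy model.

    All default values are assumed, arbitrary units.
    "parallel": every data way is read while the tags are compared.
    "serial":   the tags are compared first, then only the matching way is read.
    """
    if scheme == "parallel":
        latency = max(tag_delay, data_delay)     # tag and data reads overlap
        energy = tag_energy + ways * way_energy  # every way is read
    elif scheme == "serial":
        latency = tag_delay + data_delay         # data read waits for the tag result
        energy = tag_energy + way_energy         # only the matching way is read
    else:
        raise ValueError(f"unknown scheme: {scheme!r}")
    return latency, energy


for scheme in ("parallel", "serial"):
    latency, energy = access_cost(ways=4, scheme=scheme)
    print(f"{scheme:8s} latency={latency} energy={energy}")
```

Under these assumed values, the parallel scheme for a four-way cache completes in one delay unit at 21 energy units, while the serial scheme takes two delay units at 6 energy units, which reflects the latency penalty and energy saving discussed above.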
Another approach, disclosed in U.S. Pat. No. 5,848,428 to Collins, reduces the power consumption of the concurrent lookup in a set-associative cache by enabling only those sense amplifiers associated with the matching data way. The other sense amplifiers in the data array, corresponding to non-matching (i.e., missed) ways, are disabled and hence consume essentially no additional power. In this manner, a partial energy saving is realized in the data array. However, the cache scheme disclosed by Collins undesirably increases cache latency for many implementations, since the tag lookup must first determine the matching way before the sense amplifiers of the data array can be enabled. Thus, instead of propagating forward (e.g., to a multiplexer associated with the way selection), the requested data undesirably stalls at the sense amplifier stage.
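The dependency that gives rise to this stall can be sketched as follows. This is an illustrative model of sense amplifier gating as characterized above, not a reproduction of the circuit disclosed by Collins; the function and parameter names are hypothetical.

```python
def sense_amp_enables(stored_tags, valid, request_tag, gate_on_match):
    """Return one enable flag per data way for a single set.

    gate_on_match=False models the conventional concurrent lookup, in which
    every way's sense amplifiers are enabled without waiting for the tags.
    gate_on_match=True models gating on the tag comparison: the enables
    cannot be produced until the comparison has finished.
    """
    if not gate_on_match:
        return [True] * len(stored_tags)
    # The tag comparison must complete before any enable is known.
    return [is_valid and tag == request_tag
            for tag, is_valid in zip(stored_tags, valid)]


# Assumed contents of one four-way set.
stored_tags = [7, 3, 9, 1]
valid = [True, True, True, True]

print(sense_amp_enables(stored_tags, valid, request_tag=9, gate_on_match=False))
print(sense_amp_enables(stored_tags, valid, request_tag=9, gate_on_match=True))
```

In the gated case only the way holding tag 9 is enabled, but the enable flags are a function of the tag comparison result, so the data array read cannot proceed past the sense amplifiers until that result is available.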
There exists a need, therefore, in the field of memory systems for a memory cache architecture that provides a flexible tradeoff between power consumption and cache latency, depending on the application in which the memory cache is employed.