Cache memories have been used to improve processor performance while maintaining reasonable system costs. A cache memory is a very fast buffer comprising an array of local storage cells used by one or more processors to hold copies of frequently requested data. A typical cache memory system comprises a hierarchy of memory structures, which usually includes a local, on-chip (L1) cache that represents the first level in the hierarchy. A secondary (L2) cache is often associated with the processor to provide an intermediate level of cache memory between the processor and main memory. Main memory, also commonly referred to as system or bulk memory, lies at the bottom (i.e., slowest and largest) level of the memory hierarchy.
In a conventional computer system, a processor is coupled to a system bus that provides access to main memory. An additional backside bus may be utilized to couple the processor to an L2 cache memory. Other system architectures may couple the L2 cache memory to the system bus via its own dedicated bus. Most often, L2 cache memory comprises a static random access memory (SRAM) that includes a data array, a cache directory, and cache management logic. The cache directory usually includes a tag array, tag status bits, and least recently used (LRU) bits. (Each directory entry is called a “tag.”) The tag RAM contains the main memory addresses of the code and data stored in the data RAM, plus additional status bits used by the cache management logic.
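The directory lookup described above can be illustrated with a minimal sketch. The line size, the number of sets, the 4-way organization in the usage note, and the names `DirectoryEntry`, `split_address`, and `lookup` are assumptions made for illustration and do not come from the text. The sketch also assumes the common arrangement in which only the upper address bits are stored as the tag, with the remaining bits selecting the set and the byte within the line.

```python
from dataclasses import dataclass

LINE_BYTES = 64    # assumed cache line size (hypothetical)
NUM_SETS = 1024    # assumed number of sets (hypothetical)

OFFSET_BITS = LINE_BYTES.bit_length() - 1   # 6 bits select a byte in the line
INDEX_BITS = NUM_SETS.bit_length() - 1      # 10 bits select a set


@dataclass
class DirectoryEntry:
    """One directory entry ("tag"): stored address bits plus status bits."""
    tag: int = 0
    valid: bool = False
    dirty: bool = False


def split_address(addr):
    """Split a physical address into (tag, set index, byte offset)."""
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


def lookup(directory, addr):
    """Return the way that hits for addr, or None on a miss."""
    tag, index, _ = split_address(addr)
    for way, entry in enumerate(directory[index]):
        if entry.valid and entry.tag == tag:
            return way
    return None
```

For example, with a directory built as `[[DirectoryEntry() for _ in range(4)] for _ in range(NUM_SETS)]` (a hypothetical 4-way organization), `lookup` reports a miss until an entry with a matching tag is marked valid in the indexed set.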
Cache sizes are growing, especially for lower-level (L2 and L3) caches in multi-level cache systems. Large on-die cache memories are typically subdivided into multiple cache memory banks, which are then coupled to a wide (e.g., 32 bytes, or 256 bits, wide) data bus. Multi-banked caches are often arranged according to a Non-Uniform Cache Architecture (NUCA), especially the caches of multi-core chips, also known as chip multiprocessors (CMPs).
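As a rough illustration of a banked organization, the sketch below maps an address to a bank and counts the bus transfers needed to move one line. Only the 32-byte bus width comes from the text. The line size, the bank count, the line-interleaved mapping, and the names `bank_of` and `bus_beats_per_line` are assumptions chosen for illustration; actual designs may select banks differently.

```python
LINE_BYTES = 64   # assumed cache line size (hypothetical)
BUS_BYTES = 32    # data bus width given in the text (256 bits)
NUM_BANKS = 8     # assumed number of cache banks (hypothetical)


def bank_of(addr):
    """Select a bank by interleaving consecutive lines across banks."""
    return (addr // LINE_BYTES) % NUM_BANKS


def bus_beats_per_line():
    """Number of bus transfers needed to move one full line (rounded up)."""
    return -(-LINE_BYTES // BUS_BYTES)
```

Under these assumptions, consecutive lines fall in consecutive banks, so accesses to neighboring lines can proceed in different banks, and each line crosses the bus in `bus_beats_per_line()` transfers.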
As cache sizes continue to grow, power consumption is becoming more problematic. Sometimes, if the workload of a system is reduced such that it does not require all of the processors or cores of the chip, the system may turn off one or more of the cores, which may in turn allow one or more cache memory banks to be turned off for additional power savings. Current cache memories may employ replacement-policy schemes, such as a pseudo-least-recently-used (pseudo-LRU) scheme. In relatively large caches, the replacement-policy bits consume a considerable amount of power, in addition to the power consumed by the data storage elements of the caches.
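One common form of pseudo-LRU is the binary-tree variant, sketched below for a single cache set. The text does not say which pseudo-LRU scheme is meant, so the tree variant, the class name `TreePLRU`, and the bit convention (0 means the next victim lies in the left half, 1 means the right half) are assumptions for illustration only.

```python
class TreePLRU:
    """Tree-based pseudo-LRU state for one set (ways must be a power of two)."""

    def __init__(self, ways):
        if ways < 2 or ways & (ways - 1):
            raise ValueError("ways must be a power of two and at least 2")
        self.ways = ways
        # One bit per internal tree node, stored in heap order: ways - 1 bits.
        self.bits = [0] * (ways - 1)

    def touch(self, way):
        """Record an access: point every node on the path away from `way`."""
        node, lo, hi = 0, 0, self.ways
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if way < mid:
                self.bits[node] = 1      # accessed left half, victim goes right
                node, hi = 2 * node + 1, mid
            else:
                self.bits[node] = 0      # accessed right half, victim goes left
                node, lo = 2 * node + 2, mid

    def victim(self):
        """Follow the bits from the root to pick the way to replace."""
        node, lo, hi = 0, 0, self.ways
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if self.bits[node] == 0:
                node, hi = 2 * node + 1, mid
            else:
                node, lo = 2 * node + 2, mid
        return lo
```

In this sketch each set carries `ways - 1` replacement bits (three bits for a 4-way set), and every access may rewrite the bits along one path of the tree. Multiplied across all the sets of a large cache, state of this kind is the replacement-policy storage whose power cost the paragraph above refers to.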