In a computer, a cache is a small, fast memory separate from a processor's main memory that holds recently accessed data. Serving subsequent requests for the same data from the cache reduces their access time. A “cache hit” occurs when the requested data or memory location is found in the cache that is being searched. When the requested data or memory location is not found, the result is a “cache miss,” and the data is typically allocated a new entry in the cache. If the cache is already full, one of many replacement strategies may be employed to evict an existing entry.
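The hit, miss, and eviction behavior described above can be illustrated with a minimal sketch. It assumes a least-recently-used replacement strategy, which is only one of the many strategies mentioned; the class name `TinyCache`, its fields, and the dictionary standing in for main memory are hypothetical and chosen for illustration.

```python
from collections import OrderedDict


class TinyCache:
    """Fixed-capacity cache in front of a slower backing store.

    Assumption: least-recently-used (LRU) eviction, chosen only for illustration.
    """

    def __init__(self, capacity, backing):
        self.capacity = capacity
        self.backing = backing        # dict standing in for main memory
        self.entries = OrderedDict()  # address -> data, least recently used first
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.entries:
            # Cache hit: the data is served from the cache.
            self.hits += 1
            self.entries.move_to_end(address)  # mark as most recently used
            return self.entries[address]
        # Cache miss: fetch from the backing store and allocate a new entry.
        self.misses += 1
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # cache full: evict the LRU entry
        data = self.backing[address]
        self.entries[address] = data
        return data


main_memory = {addr: addr * 10 for addr in range(8)}
cache = TinyCache(capacity=2, backing=main_memory)
for addr in (0, 1, 0, 2, 1):
    cache.read(addr)
print(cache.hits, cache.misses)
```

In this access sequence only the second read of address 0 hits; the read of address 2 evicts address 1, so the final read of address 1 misses again.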
A cache may include one or more tag storages and one or more data storages. A tag storage contains tags. In general, a tag may be used to uniquely identify a cached piece of data and to determine whether that cached data can be used to satisfy an incoming request. In one implementation, a tag may include an index of the main memory location of the cached data. In another implementation, such as a translation lookaside buffer (TLB) type cache, the tags may not directly index main memory locations, but may consist of virtual addresses and other request-based information that is not directly related to a specific main memory address. A data storage contains copies of the data from the main memory. The data storage may also contain data generated by the processor that has not yet been written out to the main memory (for example, with a write-back cache). Such data must be written out to memory before power can be removed from the cache.
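The split between tag storage and data storage, and the need to write dirty data out before power is removed, can be sketched as follows. This is a simplified model under stated assumptions: a direct-mapped organization, tags derived from the main memory address (the first implementation described above, not the TLB-type case), and one value per line. The names `WriteBackCache`, `flush`, and the `dirty` flags are hypothetical.

```python
class WriteBackCache:
    """Direct-mapped write-back cache with separate tag and data storage.

    Illustrative assumptions: one value per line, tag = address // num_lines.
    """

    def __init__(self, num_lines, memory):
        self.num_lines = num_lines
        self.memory = memory                # list standing in for main memory
        self.tags = [None] * num_lines      # tag storage
        self.data = [None] * num_lines      # data storage
        self.dirty = [False] * num_lines    # True if data is newer than memory

    def _split(self, address):
        # Returns (line index, tag) for an address.
        return address % self.num_lines, address // self.num_lines

    def _write_back(self, index):
        # Copy a dirty line out to main memory.
        if self.dirty[index]:
            victim_address = self.tags[index] * self.num_lines + index
            self.memory[victim_address] = self.data[index]
            self.dirty[index] = False

    def _fill(self, address):
        # A tag comparison decides whether the line can satisfy the request.
        index, tag = self._split(address)
        if self.tags[index] != tag:
            self._write_back(index)         # evict the previous occupant
            self.tags[index] = tag
            self.data[index] = self.memory[address]
        return index

    def read(self, address):
        return self.data[self._fill(address)]

    def write(self, address, value):
        index = self._fill(address)
        self.data[index] = value
        self.dirty[index] = True            # main memory is now stale

    def flush(self):
        # Must run before power is removed, or dirty data is lost.
        for index in range(self.num_lines):
            self._write_back(index)


memory = [0] * 8
cache = WriteBackCache(num_lines=4, memory=memory)
cache.write(5, 99)
stale = memory[5]   # main memory still holds the old value
cache.flush()       # after this, memory[5] holds the written value
```

The value written to address 5 lives only in the data storage until `flush` runs, which is why a write-back cache cannot simply be powered off.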
To reduce power consumption in a computer, components (including internally integrated components) may be placed in low power states or completely powered off during idle periods. Powering off cache memories built with volatile storage elements results in a loss of state. Once power is restored, cache accesses initially miss because the cache is empty, so data must be fetched from higher-latency (and possibly lower-bandwidth) persistent backing storage, which lowers performance. These accesses progressively refill the cache and, assuming that subsequent accesses hit the refilled entries, performance progressively recovers to its nominal level.
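The progressive recovery after power restoration can be illustrated with a small simulation. It is a sketch under simplifying assumptions: the cache starts empty, is large enough that nothing is evicted, and the access sequence is invented for illustration. The function name `hit_rates` and its parameters are hypothetical.

```python
def hit_rates(accesses, window):
    """Hit rate of each consecutive window of accesses, starting from an empty cache.

    Assumption: the cache never evicts, so every address stays cached once fetched.
    """
    cached = set()   # empty: state was lost when power was removed
    rates = []
    for start in range(0, len(accesses), window):
        chunk = accesses[start:start + window]
        hits = 0
        for address in chunk:
            if address in cached:
                hits += 1             # served from the refilled cache
            else:
                cached.add(address)   # miss: fetched from backing storage, refills the cache
        rates.append(hits / len(chunk))
    return rates


# Invented access pattern: addresses 0 and 1 are reused throughout, 4 and 5 appear later.
accesses = [0, 1, 2, 3, 0, 1, 4, 5, 0, 1, 4, 5]
rates = hit_rates(accesses, window=4)
print(rates)  # [0.0, 0.5, 1.0]
```

Every access in the first window misses, half of the second window hits, and the third window hits entirely, mirroring the gradual return to nominal performance.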
Some existing techniques allow a cache to power off a fraction of its contents, for example, a portion of both the tag storage and the data storage, at the same time. In one technique, power consumption may be reduced while maintaining state by providing power to only a portion of the cache. This solution still consumes power to maintain the state of the tag memories and corresponding data memories that remain powered.
Other techniques reduce the cache's clock frequency to lower dynamic power consumption. Static power consumption may be reduced by a corresponding adjustment to the operating parameters of the cache's transistors (voltage reduction, biasing change, etc.).
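The relationship behind these techniques can be written with the standard first-order CMOS power model. This model is not part of the description above and is given only as a rough guide; the symbols (activity factor \alpha, switched capacitance C, supply voltage V_{DD}, clock frequency f, leakage current I_{leak}) are introduced here for illustration.

```latex
P_{\text{dynamic}} \approx \alpha \, C \, V_{DD}^{2} \, f
\qquad
P_{\text{static}} \approx V_{DD} \, I_{\text{leak}}
```

Under this model, lowering f reduces dynamic power proportionally, while lowering V_{DD} or changing the transistor biasing (which affects I_{leak}) reduces static power.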
The existing techniques force a choice: either sacrifice performance after power restoration in order to reduce power consumption, or sacrifice optimal power consumption by keeping a subset of the data storage powered in order to avoid the performance drop after power restoration. Thus, there exists a need to power off most or all of a cache without sacrificing performance after the cache is powered on.