Memory caches are storage systems incorporated into data processing systems for performance reasons. A memory cache stores a subset of the contents of the data processing system's main memory for use by a selected subsystem, typically the system's data processor. A memory cache can supply data to the data processor faster than the main memory can for several reasons. First, the memory cache is often made of higher grade memory circuits than the main memory system is. These circuits can simply operate at a higher clock rate than the main memory can. Also, there may be an exclusive-use bus between the data processor and the memory cache that results in higher bandwidth between the data processor and the memory cache than between the data processor and the main memory. Finally, a memory cache may be physically located on the same integrated circuit as the subsystem to which it provides data. In this case, the memory cache is both constructed from faster circuits and connected to the data processor by an exclusive-use bus.
Associativity is one variable that defines memory cache designs. Associativity describes the number of memory cache locations to which each main memory subsystem location may be mapped. For instance, the contents of each main memory location may be mapped to one of four different locations in a four-way set associative memory cache. When the data processor requests the contents of a certain main memory location, the data processor compares the contents of a tag associated with each of the four possible storage locations to a portion of the address of the requested data. The tag is stored in a random access memory ("RAM") associated with each memory cache entry or "cache line." One or none of the tags will match the address portion depending upon the prior history of the data processor. If one of the tags matches, then the associated memory cache location contains the requested data, a cache "hit." In the case of a cache hit, the data processor can quickly access the cached data. If none of the tags match, then no memory cache location contains the requested data, a cache "miss." In the case of a cache miss, the data processor must access the main memory to retrieve the requested data.
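The tag comparison described above can be sketched in software. The following is a minimal illustrative model, not a description of any particular hardware: the number of sets, the line size, the valid bit, and the names `SetAssociativeCache`, `split_address`, `lookup`, and `fill` are all assumptions introduced for the example. Only the four-way associativity and the hit/miss behavior come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

NUM_WAYS = 4      # four-way set associative, as in the text
NUM_SETS = 8      # assumed for illustration
LINE_BYTES = 16   # assumed cache line size


@dataclass
class CacheLine:
    valid: bool = False          # assumed valid bit; not mentioned in the text
    tag: int = 0
    data: Optional[bytes] = None


def split_address(address: int) -> Tuple[int, int]:
    """Split an address into (tag, set index); the byte offset is dropped."""
    line_number = address // LINE_BYTES
    return line_number // NUM_SETS, line_number % NUM_SETS


class SetAssociativeCache:
    def __init__(self) -> None:
        self.sets: List[List[CacheLine]] = [
            [CacheLine() for _ in range(NUM_WAYS)] for _ in range(NUM_SETS)
        ]

    def lookup(self, address: int) -> Optional[Tuple[int, bytes]]:
        """Return (way, data) on a cache hit, or None on a cache miss."""
        tag, index = split_address(address)
        # Compare the address tag against the tag of each of the four ways.
        for way, line in enumerate(self.sets[index]):
            if line.valid and line.tag == tag:
                return way, line.data
        return None

    def fill(self, address: int, data: bytes, way: int) -> None:
        """Place a line fetched from main memory into the given way."""
        tag, index = split_address(address)
        self.sets[index][way] = CacheLine(True, tag, data)
```

In this sketch a miss returns `None`, after which the caller would fetch the line from main memory and call `fill`. Hardware performs the four comparisons simultaneously; the loop here is only a sequential stand-in for that.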
Although memory caches were created to enhance performance, they detract from the performance of data processors in another way: power consumption. Generally, known memory caches waste power in two ways. First, the sense amplifier portion of known memory caches consumes "crowbar" current. Second, known memory caches waste power by unnecessarily consuming switching current.
U.S. Pat. No. 4,804,871 entitled "Bit-line Isolated, CMOS Sense Amplifier" issued Feb. 14, 1989 describes a sense amplifier (16) which conveniently illustrates the consumption of crowbar current. FIG. 1 of the '871 patent depicts an exemplary column in one "way" of a memory cache. A signal (C3) couples the bitlines (B and B-bar) to a sense amplifier when de-asserted and enables the sense amplifier when asserted. While the bitlines are coupled to the output nodes (X and Y), they force the voltages on the output nodes to slowly separate, reflecting the output data value. The intermediate voltage level on these output nodes will partially turn on both the P device and the N device in a downstream complementary metal oxide semiconductor (CMOS) device (not shown). Consequently, the downstream CMOS device will create a conductive path between its two power supplies. The longer the separation time, the greater the crowbar current.
FIG. 1 depicts a block diagram of a known memory cache 10. Memory cache 10 conveniently illustrates the unnecessary consumption of switching current in known memory caches. Memory cache 10 is a four-way set-associative cache containing four data arrays 12, 14, 16, and 18 and four corresponding tag arrays 20, 22, 24, and 26. Each of the four data arrays 12, 14, 16, and 18 and each of the four corresponding tag arrays 20, 22, 24, and 26 receives the least significant portion of an input memory address (labeled LSB ADDRESS). Each of the four data arrays 12, 14, 16, and 18 has a corresponding sense amplifier 28, 30, 32, and 34. The outputs of sense amplifiers 28, 30, 32, and 34 are connected to a 4:1 multiplexer (labeled 4:1 MUX) 36. The output of MUX 36 is selected by the four control signals HIT0, HIT1, HIT2, and HIT3. A first comparator 38 compares the tag output of tag array 20 (WAY0) and a most significant portion of the input address (labeled MSB ADDRESS). A second comparator 40 compares the tag output of tag array 22 (WAY1) and MSB ADDRESS. A third comparator 42 compares the tag output of tag array 24 (WAY2) and MSB ADDRESS. A fourth comparator 44 compares the tag output of tag array 26 (WAY3) and MSB ADDRESS.
To increase the rate at which MUX 36 outputs DATA, memory cache 10 accesses the data stored in the four data arrays 12, 14, 16, and 18 and the four tag arrays 20, 22, 24, and 26 in parallel. Each of the four data arrays 12, 14, 16, and 18 outputs a data line through the corresponding sense amplifier 28, 30, 32, and 34 to MUX 36. Similarly, the four tag arrays 20, 22, 24, and 26 output four tags to the four comparators 38, 40, 42, and 44. (The four tag arrays 20, 22, 24, and 26 may or may not contain four sense amplifiers depending upon the number of entries in each tag array.) One or none of the four comparators 38, 40, 42, and 44 asserts its hit signal if the corresponding tag matches the most significant portion of ADDRESS.
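The parallel read path of memory cache 10 can be approximated behaviorally as follows. This is only a rough sketch under stated assumptions: the function name `read_cache_10`, the use of Python lists for the arrays, and the use of `None` for an empty tag entry are hypothetical choices made for illustration, and the sequential list operations stand in for accesses that the hardware performs at the same time.

```python
from typing import List, Optional, Tuple

NUM_WAYS = 4  # WAY0 through WAY3


def read_cache_10(
    data_arrays: List[List[bytes]],
    tag_arrays: List[List[Optional[int]]],
    lsb_address: int,
    msb_address: int,
) -> Tuple[List[bool], Optional[bytes]]:
    """Model one read: return ([HIT0..HIT3], DATA), with DATA None on a miss."""
    # Every data array is indexed by LSB ADDRESS and every corresponding
    # sense amplifier senses a line, whether or not that way will hit.
    sensed = [data_arrays[way][lsb_address] for way in range(NUM_WAYS)]

    # Every tag array is indexed by the same LSB ADDRESS.
    tags = [tag_arrays[way][lsb_address] for way in range(NUM_WAYS)]

    # Each comparator checks its tag against MSB ADDRESS.
    hits = [tag is not None and tag == msb_address for tag in tags]

    # The 4:1 MUX forwards the line from the way whose hit signal is asserted.
    for way in range(NUM_WAYS):
        if hits[way]:
            return hits, sensed[way]
    return hits, None
```

Note that `sensed` is fully populated before the hit signals are known, which mirrors the point that the data arrays and tag arrays are accessed in parallel.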
Memory cache 10 consumes an excessive amount of switching current because it enables all four sense amplifiers 28, 30, 32, and 34 during each read operation when, at most, only one data array will output a cache line as DATA. Here, switching current describes the current, and hence the power, consumed by a CMOS circuit when it changes logic state from "0" to "1" or vice versa. In general, N-1 sense amplifiers will unnecessarily sense a data value and drive the data value out to MUX 36 each cycle, where N is the associativity of the memory cache.
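The N-1 figure can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that every read hits, so exactly one sense amplifier per read does useful work; the function name `sense_amp_activity` is hypothetical, and the count is of sense amplifier activations rather than of measured current or power.

```python
from typing import Tuple


def sense_amp_activity(num_ways: int, num_reads: int) -> Tuple[int, int, float]:
    """Return (enabled, wasted, wasted fraction) of sense amplifier activations.

    Assumes every read is a cache hit, so one activation per read is useful.
    """
    enabled = num_ways * num_reads   # all N sense amplifiers fire on each read
    useful = num_reads               # only the hitting way supplies DATA
    wasted = enabled - useful        # (N - 1) activations wasted per read
    return enabled, wasted, wasted / enabled


# Four-way cache such as memory cache 10, over 1000 reads.
enabled, wasted, fraction = sense_amp_activity(4, 1000)
print(enabled, wasted, fraction)     # prints: 4000 3000 0.75
```

Under this assumption the wasted fraction is (N-1)/N, so it grows with associativity.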