The present invention relates generally to computers, and more specifically to reducing power consumption in a cache memory system.
A microprocessor can execute instructions at a very high rate, and it must be connected to a memory system. Ideally, the memory system is both large and fast, but it is practically impossible to design such a system. Instead, a composite memory system is designed that combines a small, fast cache memory with a large but slow main memory. For example, the access time of a cache may be around ten nanoseconds, while that of the main memory is around 100 nanoseconds.
A cache memory (or simply “cache”) is a relatively small and fast storage system incorporated either inside or close to a processor or between a processor and a main system memory. A cache memory stores instructions or data, which can be quickly supplied to the processor. The effectiveness of the cache is largely determined by the spatial locality and temporal locality properties of the program involved. Data from the much larger but slower main memory is automatically staged into the cache by special hardware on demand, typically in units of transfer called “lines” (ranging, for example, from 32 to 256 bytes).
When a memory read operation is requested by the processor, the cache memory is checked to determine whether or not the data is present in the cache memory. If the cache contains the referenced data, the cache provides the data to the processor. Otherwise, the data is retrieved from the main memory. As such, the cache stores frequently accessed information and improves the processor's performance by delivering the needed information faster than the main memory can. In a typical design, a cache memory uses data arrays to store data and tag arrays to store the tag addresses corresponding to the data.
A main memory address may consist of a tag field and an index field. The index field is used to index a specific tag address stored in a cache tag array. When a cache memory access is performed, the tag address stored in the cache tag array is read and it is then compared to the tag field of the main memory address. If the two tag addresses match, a cache “hit” has occurred and the corresponding data is read out from the cache to the processor. If the two tag addresses do not match, a cache “miss” has occurred and the requested data is not in the cache, and must be retrieved from other components such as the main memory. If a program running on the computer exhibits good locality of reference, most of the accesses by the processor are satisfied from the cache, and the average memory access time seen by the processor will be very close to that of the cache (e.g., on the order of one to two cycles). Only when the processor does not find the required data in the cache does it incur the “cache miss penalty”, which is the longer latency to the main memory (e.g., on the order of twenty to forty cycles in computers with short cycle times).
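By way of illustration only, the tag comparison and the resulting average access time described above may be sketched as follows. The sketch assumes a direct-mapped cache with a single tag array; the line size, the number of lines, the byte-offset bits below the index field, and all names (split_address, DirectMappedCache, average_access_cycles) are hypothetical choices made for the example and are not taken from any particular design.

```python
# Illustrative sketch only: a direct-mapped cache with one tag array.
# All sizes and names below are assumptions made for this example.

LINE_BYTES = 64    # assumed line size (the text gives a range of 32 to 256 bytes)
NUM_LINES = 256    # assumed number of lines held by the cache

OFFSET_BITS = LINE_BYTES.bit_length() - 1   # bits selecting a byte within a line
INDEX_BITS = NUM_LINES.bit_length() - 1     # bits forming the index field


def split_address(addr):
    """Split a main memory address into (tag, index), dropping the byte offset."""
    index = (addr >> OFFSET_BITS) & (NUM_LINES - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index


class DirectMappedCache:
    def __init__(self):
        self.tags = [None] * NUM_LINES   # the cache tag array
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        """Return True on a cache hit and False on a cache miss."""
        tag, index = split_address(addr)
        if self.tags[index] == tag:      # stored tag matches the tag field
            self.hits += 1
            return True
        self.misses += 1
        self.tags[index] = tag           # line is staged in from main memory
        return False


def average_access_cycles(hit_rate, hit_cycles, miss_penalty_cycles):
    """Average access time: hit time plus miss rate times miss penalty."""
    return hit_cycles + (1.0 - hit_rate) * miss_penalty_cycles
```

Under these assumptions, a first access to an address misses and a repeated access to the same address hits. With an assumed hit rate of 95 percent, a two-cycle hit time, and a forty-cycle miss penalty, average_access_cycles(0.95, 2, 40) evaluates to approximately four cycles, which is close to the hit time of the cache.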
Further, in the conventional art, a cache contains multiple tag arrays and data arrays. These arrays are usually accessed simultaneously, which is optimal for operating speed but not for power consumption, because all of the large data arrays must be read before the desired data is retrieved. This results in relatively large power consumption and is detrimental to low-power applications.
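The cost of the simultaneous access described above may be illustrated, again as a sketch only, by counting data array reads. The example assumes a cache organized as four ways, each with its own tag array and data array; the number of ways, the number of sets, and all names (ParallelAccessCache, fill, read) are hypothetical and serve only to make the counting concrete.

```python
# Illustrative sketch only: conventional simultaneous access to all arrays.
# The way count, set count, and names are assumptions made for this example.

NUM_WAYS = 4    # assumed number of tag array / data array pairs
NUM_SETS = 64   # assumed number of entries per array


class ParallelAccessCache:
    def __init__(self):
        self.tags = [[None] * NUM_SETS for _ in range(NUM_WAYS)]
        self.data = [[None] * NUM_SETS for _ in range(NUM_WAYS)]
        self.data_array_reads = 0   # stands in for data array read energy

    def fill(self, way, index, tag, value):
        """Place a line into one way, as cache hardware would on a miss."""
        self.tags[way][index] = tag
        self.data[way][index] = value

    def read(self, tag, index):
        """Read every tag array and data array, then select the matching way."""
        candidates = []
        for way in range(NUM_WAYS):
            stored_tag = self.tags[way][index]
            value = self.data[way][index]   # read whether or not the tag matches
            self.data_array_reads += 1
            candidates.append((stored_tag, value))
        for stored_tag, value in candidates:
            if stored_tag == tag:
                return value                # cache hit
        return None                         # cache miss
```

In this sketch, every call to read increments data_array_reads by NUM_WAYS, although a hit uses the output of only one data array. The remaining NUM_WAYS - 1 reads per access represent the power spent on data that is discarded.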
What is needed is an improved method and system for selectively accessing the data arrays so that the total power consumption is reduced.