The present invention relates generally to computers, and more specifically to reducing power consumption in a cache memory system.
A microprocessor can execute instructions at a very high rate, and it must be connected to a memory system. Ideally the memory system would be both large and fast, but such a system is practically impossible to design and build. Instead, a composite memory system is designed to combine a small, fast cache memory with a large but slow main memory. In some examples, the access time of a cache may be around ten nanoseconds, while that of the main memory is around 100 nanoseconds.
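The effect of combining the two memories can be illustrated with a short calculation. The following is a minimal sketch only, assuming a simple two-level model in which every hit costs the cache access time and every miss costs the main memory access time. The function name, the model, and the hit rates are illustrative assumptions; only the ten and 100 nanosecond figures come from the example above.

```python
def average_access_time_ns(hit_rate, cache_ns=10.0, main_ns=100.0):
    """Average access time under an assumed two-level model.

    Hits are charged cache_ns and misses are charged main_ns.
    The default times are the example figures from the text.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_ns + (1.0 - hit_rate) * main_ns


if __name__ == "__main__":
    # Hypothetical hit rates, chosen only to show the trend.
    for h in (0.50, 0.90, 0.99):
        print(f"hit rate {h:.2f}: {average_access_time_ns(h):.1f} ns")
```

Under this assumed model, the average access time approaches that of the cache as the hit rate approaches one.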
A cache memory (or simply “cache”) is a relatively small and fast storage system incorporated either inside or close to a processor, or between a processor and a main system memory. A cache memory stores instructions or data, which can be quickly supplied to the processor. The effectiveness of the cache is largely determined by the spatial locality and temporal locality of the program being executed. Data from the much larger but slower main memory is automatically staged into the cache by special hardware on a demand basis, typically in units of transfer called “lines” or cachelines (ranging, for example, from 32 to 256 bytes).
When a memory read operation is requested by the processor, the cache memory is checked to determine whether or not the data is present in the cache memory. If the cache contains the referenced data, the cache provides the data to the processor. Otherwise, the data is retrieved from the main memory. As such, the cache can store frequently accessed information and improve processor performance by delivering the needed information faster than the main memory can. In a typical design, a cache memory uses a data array to store data and a tag array to store the tag addresses corresponding to the data.
A main memory address may consist of a tag field and an index field. The index field is used to select a specific tag address stored in the cache tag array. When a cache memory access is performed, the tag address stored in the cache tag array is read and then compared to the tag field of the main memory address. If the two tag addresses match, a cache “hit” has occurred and the corresponding data is read out from the cache to the processor. If the two tag addresses do not match, a cache “miss” has occurred: the requested data is not in the cache and must be retrieved from other components such as the main memory. If a program running on the computer exhibits good locality of reference, most of the accesses by the processor are satisfied from the cache, and the average memory access time seen by the processor will be very close to that of the cache (e.g., on the order of one to two cycles). Only when the processor does not find the required data in the cache does it incur the “cache miss penalty”, which is the longer latency to the main memory.
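The tag look-up described above can be sketched in software. This is an illustrative model only, assuming a direct-mapped organization, a 32-byte line, eight lines, and a byte-offset field below the index field; none of these parameters, nor the names `split_address` and `DirectMappedCache`, are taken from the description, and real caches implement this logic in hardware.

```python
LINE_BYTES = 32   # assumed line size (the text cites 32 to 256 bytes)
NUM_LINES = 8     # assumed line count, kept small for illustration

OFFSET_BITS = LINE_BYTES.bit_length() - 1   # log2(32) = 5
INDEX_BITS = NUM_LINES.bit_length() - 1     # log2(8) = 3


def split_address(addr):
    """Split an address into (tag, index, offset) fields."""
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_LINES - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


class DirectMappedCache:
    """Toy direct-mapped cache with separate tag and data arrays."""

    def __init__(self, main_memory):
        self.main_memory = main_memory        # bytes-like backing store
        self.tag_array = [None] * NUM_LINES   # None marks an empty entry
        self.data_array = [None] * NUM_LINES
        self.hits = 0
        self.misses = 0

    def read(self, addr):
        tag, index, offset = split_address(addr)
        # Read the stored tag and compare it with the address tag field.
        if self.tag_array[index] == tag:
            self.hits += 1
        else:
            # Miss: stage the whole line in from main memory.
            self.misses += 1
            base = addr - offset
            self.data_array[index] = self.main_memory[base:base + LINE_BYTES]
            self.tag_array[index] = tag
        return self.data_array[index][offset]


if __name__ == "__main__":
    memory = bytes(range(256)) * 4            # 1 KiB of sample data
    cache = DirectMappedCache(memory)
    cache.read(0x40)    # miss: line not yet cached
    cache.read(0x41)    # hit: same line as the previous read
    cache.read(0x140)   # miss: same index as 0x40, different tag
    print(cache.hits, cache.misses)
```

In this sketch every read consults the tag array, including the second read, which falls in the same line as the first.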
A typical pipelined cache first performs a cache tag look-up, comparing the tag address stored in the tag array with the tag field of the memory address, and then accesses the appropriate cache data array when the cache tag compare indicates that a hit has taken place. In modern computer systems, it is highly likely that multiple accesses will be made to the same cache line as data is retrieved from the cache in a sequential manner. For example, when a processor is doing an instruction fetch, the next instruction to be fetched is often at the next incremental address.
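The observation about sequential accesses can be quantified with a small example. The sketch below merely counts how many accesses in a stream fall in the same cache line as the access immediately before them; it assumes 4-byte instructions, a 32-byte line, and a hypothetical starting address, and it is not a description of any particular method for avoiding tag reads.

```python
def count_same_line_accesses(addresses, line_bytes=32):
    """Count accesses that land in the same line as the previous access."""
    same = 0
    prev_line = None
    for addr in addresses:
        line = addr // line_bytes   # line number containing this address
        if line == prev_line:
            same += 1
        prev_line = line
    return same


if __name__ == "__main__":
    # 64 sequential fetches of assumed 4-byte instructions,
    # starting at a hypothetical line-aligned address.
    fetches = [0x1000 + 4 * i for i in range(64)]
    same = count_same_line_accesses(fetches)
    print(f"{same} of {len(fetches)} fetches stay in the previous line")
```

Under these assumptions, seven of every eight sequential fetches target the line that was just accessed.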
What is needed is an improved method and system for accessing the cache memory that reduces the number of cache tag array read operations, thereby reducing the total power consumption.