There has been rapid advancement in microprocessor design over the past few decades. Microprocessor designers continually design new microprocessors to enhance performance, reduce access time, and thereby increase the efficiency of computer systems. One approach to improving data access time includes providing several levels of cache memory in the computer system. Accessing data from cache memory is usually faster than accessing data from the main memory of the computer system. Typically, cache memories are small blocks of high-speed static RAM (SRAM), located either on-chip with the microprocessor or off-chip, or both. The cache memory usually stores the contents of memory locations that are likely to be accessed in the near future or accessed frequently. The cache memory can also include one or more memory locations that are near neighbors of a recently accessed memory location. The memory subsystem can include multiple levels of cache memory to achieve a high instruction throughput rate.
Usually, microprocessors access the cache levels in sequential fashion. The lowest level of cache memory is usually the smallest and fastest, so accessing information from the lowest level is fastest. One implementation of these cache hierarchies is based on data duplication: each level of the cache memory hierarchy can include the data stored in the next lower level of the cache memory. Lower-level caches are smaller but can be accessed faster. Certain rules are followed in order to keep the data coherent across all levels of the cache memory hierarchy. A fundamental rule used in many cache implementations is not to fetch data from higher-level caches while that data is still available in the lower level(s) of cache memory. This is especially true if the lower cache levels contain updates that have not yet propagated to the higher levels of cache memory. During execution of a program on the computer system, a processor may execute multiple processor instructions that reference memory locations. During execution, the processor first searches for the required data at a memory location in the level 1 (L1) cache. If the data or referenced memory location is not available in the L1 cache, a cache miss occurs, and the L1 cache sends a corresponding request to the level 2 (L2) cache. If the referenced memory location is also unavailable in the L2 cache, an additional memory request may be sent to higher memory levels. These rules therefore add to the latency observed when fetches from higher levels of the cache hierarchy have to be made.
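The miss-propagation behavior described above can be sketched in a simplified software model. This is an illustrative sketch only, not an implementation from the source: the level names, capacities, and latency values (including the assumed 100-cycle main-memory penalty) are hypothetical, and the inclusive back-fill shown here models the data-duplication rule in which each level holds a copy of the data held by the level below it.

```python
class CacheLevel:
    """A simplified cache level: a bounded map from address to data."""

    def __init__(self, name, capacity, access_latency):
        self.name = name
        self.capacity = capacity              # number of lines the level can hold
        self.access_latency = access_latency  # cycles charged per lookup
        self.lines = {}                       # address -> data

    def lookup(self, address):
        return self.lines.get(address)

    def fill(self, address, data):
        # Naive eviction: drop an arbitrary line when full. Real caches
        # use a replacement policy such as LRU.
        if len(self.lines) >= self.capacity:
            self.lines.pop(next(iter(self.lines)))
        self.lines[address] = data


def read(hierarchy, memory, address):
    """Search L1, then L2, and so on; return (data, total_latency in cycles)."""
    total_latency = 0
    missed = []
    for level in hierarchy:
        total_latency += level.access_latency
        data = level.lookup(address)
        if data is not None:
            break
        missed.append(level)          # remember the levels that missed
    else:
        data = memory[address]        # every level missed: go to main memory
        total_latency += 100          # assumed main-memory latency (hypothetical)
    # Inclusive back-fill: copy the line into each level that missed, so
    # every level again contains the data stored in the level below it.
    for level in missed:
        level.fill(address, data)
    return data, total_latency
```

With an L1 latency of 1 cycle and an L2 latency of 10 cycles, a first read of an address pays the full miss path (1 + 10 + 100 cycles), while a second read of the same address hits in L1 and pays only 1 cycle, illustrating the latency cost the rules above impose on higher-level fetches.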
Therefore, there exists a need for techniques to reduce latency when accessing data from cache memory.