Embodiments relate to cache accessing operations, and more specifically, to reducing penalties for cache accessing operations.
A cache of a central processing unit (CPU) is a cache used by the CPU of a computer to reduce an average time to access memory. The cache is a small, fast memory that stores copies of data from frequently used main memory locations. As long as most memory accesses are cached memory locations, an average latency of memory accesses will be closer to the cache latency than to the latency of main memory.
When the CPU needs to read from or write to a location in main memory, the CPU first checks whether a copy of that data is in the cache. If so, the CPU immediately reads from or writes to the cache, which is much faster than reading from or writing to main memory. Modern CPUs may have at least three independent caches: an instruction cache to speed up executable instruction fetch, a data cache to speed up data fetch and store operations and a translation lookaside buffer (TLB) used to speed up virtual-to-physical address translation for both executable instructions and data. The data cache is usually organized as a hierarchy of more cache levels.