1. Field of the Invention
This invention is related to caches and, more particularly, to evicting blocks of data from caches.
2. Description of the Related Art
Generally, caches are used to reduce the effective latency of memory accesses. A cache is a memory into which copies of data from an underlying memory are stored. Generally, a block of contiguous data is allocated/deallocated from the cache as a unit (i.e. a cache block is the smallest unit of allocation/deallocation of storage space in the cache). The term cache line is also frequently used as a synonym for cache block. The cache typically has a latency less than that of the underlying memory, and thus accesses for which the corresponding data is stored in the cache may occur with a lower latency than accesses to the underlying memory. Thus, the average latency of memory accesses may be less than the latency of the underlying memory.
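The latency reduction described above can be illustrated with a simple worked equation. This is a sketch under stated assumptions, not a model taken from the text: the symbols h (fraction of accesses that hit in the cache), t_cache (cache latency), and t_mem (underlying memory latency) are hypothetical, as are the example values, and a miss is assumed to cost only t_mem.

```latex
% Assumed model: a hit costs t_cache, a miss costs t_mem.
t_{\mathrm{avg}} = h \, t_{\mathrm{cache}} + (1 - h) \, t_{\mathrm{mem}}

% Illustrative values (assumptions): h = 0.9, t_cache = 2 cycles, t_mem = 100 cycles.
t_{\mathrm{avg}} = 0.9 \times 2 + 0.1 \times 100 = 11.8 \ \text{cycles}
```

Under these assumed values the average latency (11.8 cycles) is well below the latency of the underlying memory (100 cycles), and it approaches the cache latency as h approaches 1.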
Caches attempt to store the most recently accessed blocks and/or the most frequently accessed blocks. In some cases, prefetch strategies are employed to speculatively load into the cache blocks which may be accessed in the future. However, since caches are usually significantly smaller in capacity than the underlying memory, data for an access may not be stored in the cache when the access occurs (referred to as a cache miss, or simply a miss). When a cache miss occurs, the missing cache block is generally loaded into the cache. Since the cache has a finite capacity, in many cases a valid cache block in the cache is replaced by the newly loaded cache block. If the cache block being replaced (referred to as the evicted cache block or the victim cache block) is modified with respect to the copy stored in memory, the evicted cache block is read from the cache before replacement by the newly loaded cache block. The evicted cache block may then be written back to memory.
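The miss, eviction, and write-back sequence described above can be sketched in software. The following is a minimal illustrative model, not the hardware discussed in this document: it assumes a direct-mapped organization, and the names `WriteBackCache`, `Line`, `BLOCK_SIZE`, and `NUM_SETS` are hypothetical. The text itself does not specify a cache organization or replacement policy.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # words per cache block (assumed value)
NUM_SETS = 2    # number of cache block storage locations (assumed value)


@dataclass
class Line:
    valid: bool = False
    dirty: bool = False  # True if modified with respect to the copy in memory
    tag: int = 0
    data: list = field(default_factory=lambda: [0] * BLOCK_SIZE)


class WriteBackCache:
    """Direct-mapped write-back cache over a list of words (illustrative only)."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = [Line() for _ in range(NUM_SETS)]
        self.misses = 0
        self.writebacks = 0

    def _base(self, tag, index):
        # Address of the first word of the block identified by (tag, index).
        return (tag * NUM_SETS + index) * BLOCK_SIZE

    def _lookup(self, addr):
        block = addr // BLOCK_SIZE
        index, tag = block % NUM_SETS, block // NUM_SETS
        line = self.lines[index]
        if not (line.valid and line.tag == tag):
            self.misses += 1
            self._fill(line, index, tag)
        return line, addr % BLOCK_SIZE

    def _fill(self, line, index, tag):
        if line.valid and line.dirty:
            # The victim is modified: read it from the cache and write it
            # back to memory before it is replaced.
            base = self._base(line.tag, index)
            self.memory[base:base + BLOCK_SIZE] = line.data
            self.writebacks += 1
        # Load the missing block from memory into the freed location.
        base = self._base(tag, index)
        line.data = self.memory[base:base + BLOCK_SIZE]
        line.tag, line.valid, line.dirty = tag, True, False

    def read(self, addr):
        line, offset = self._lookup(addr)
        return line.data[offset]

    def write(self, addr, value):
        line, offset = self._lookup(addr)
        line.data[offset] = value
        line.dirty = True
```

In this sketch, a write to address 0 followed by a read of address 8 maps both blocks to the same location, so the second access evicts the modified block and writes it back to memory before loading the new one. A clean victim is simply overwritten, with no write-back.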
Unfortunately, the hardware for reading the evicted cache block from the cache for writing back to memory may increase the amount of time required to perform cache accesses. Typically, such hardware must be integrated into the hardware for performing cache accesses. Since cache accesses are often the critical timing path in a semiconductor device, increasing the path length may negatively impact the overall operating frequency of the device. Alternatively, the critical path may have to be pipelined, which may reduce the performance of the device.