On-chip caches are used in various microprocessor designs to improve performance by storing frequently used information in fast on-chip memories. Performance improves because the information can be retrieved quickly during program execution. Various types of cache architectures exist. A “write-back” cache, for example, allows data to be modified in the cache without immediately writing to the main memory to reflect the same modification. The main memory is updated only when the data is eventually evicted from the cache, for example, to make room for new data in the cache. A write-back cache generally reduces bus traffic to main memory, and generally delivers better performance and consumes less power than various other types of cache architectures. Reducing power consumption is particularly beneficial in battery-operated devices such as cellular telephones.
In a write-back cache, the content of the cache may be more up-to-date than that of main system memory. The write-back cache maintains status bits to indicate whether the cache content has been modified. The status bits are referred to as “dirty” bits.
Caches are organized as a plurality of cache “lines” (also referred to as “blocks”). A cache line may comprise, for example, 64 bytes, but the actual size of a cache line is determined by the cache architecture. Typically, there is one dirty bit per cache line. The dirty bit is set to a value of “1” when any byte within the cache line is modified. When the dirty bit is set to 1, the entire cache line must be written back to the main memory when the data is eventually evicted from the cache, for example, to make room for new data. This architecture can be relatively inefficient in terms of both performance and power, particularly as the cache line size becomes large, because the entire line must be written back even though only a small number of bytes may have been modified. To overcome this problem, some write-back cache architectures partition the cache line into sub-lines. Each sub-line can be separately written back to system memory without having to write back the entire cache line.
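The sub-line arrangement described above can be illustrated with a minimal sketch. The class name `CacheLine`, its methods, and the 16-byte sub-line size are assumptions chosen for illustration; the text specifies only the 64-byte line as an example and does not prescribe any particular implementation.

```python
LINE_SIZE = 64       # bytes per cache line (the example size given in the text)
SUBLINE_SIZE = 16    # assumed sub-line size; the text does not specify one
NUM_SUBLINES = LINE_SIZE // SUBLINE_SIZE


class CacheLine:
    """Hypothetical cache line that keeps one dirty bit per sub-line."""

    def __init__(self, data=None):
        # A zero-filled line is used when no initial data is supplied.
        self.data = bytearray(data if data is not None else LINE_SIZE)
        self.dirty = [False] * NUM_SUBLINES

    def write_byte(self, offset, value):
        # Modify the data and mark only the sub-line that contains the byte.
        self.data[offset] = value
        self.dirty[offset // SUBLINE_SIZE] = True

    def write_back(self):
        # Collect (offset, bytes) for each dirty sub-line, then mark it clean.
        out = []
        for i, is_dirty in enumerate(self.dirty):
            if is_dirty:
                start = i * SUBLINE_SIZE
                out.append((start, bytes(self.data[start:start + SUBLINE_SIZE])))
                self.dirty[i] = False
        return out
```

In this sketch, modifying a single byte causes only 16 of the 64 bytes to be written back on eviction, which is the saving that sub-lines are intended to provide.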
A typical cache design consists of three functional blocks: the cache controller, the tag array, and the data array. To service a read request, the controller issues a read command to the tag array to look up the address and determine whether there is a cache hit (i.e., whether the target data already resides in the cache). If there is a cache hit, the controller then issues a read command to the data array to retrieve the target data. If the request misses in the cache, the controller forwards the request to the next level of the memory hierarchy (e.g., system memory) to read the cache line and load it into the cache. To make room in the cache for the new data, the controller must select a cache line in the cache for eviction. If the cache line being evicted has been modified (as indicated by a dirty bit that is set), the controller must write back the modified cache line (or sub-line) into the main memory. Otherwise, the controller simply overwrites the cache line with the new data.
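The read flow described above can be sketched as follows. This is a simplified model under stated assumptions, not a definitive design: the cache is assumed to be direct-mapped (so the line selected for eviction is simply the one at the indexed location), it has one dirty bit per line, and main memory is modeled as a `bytearray`. The names `Cache`, `read`, `_fill`, and `writebacks` are hypothetical.

```python
LINE_SIZE = 64   # bytes per cache line (example size)
NUM_LINES = 4    # assumed number of lines in a tiny direct-mapped cache


class Cache:
    """Hypothetical controller with a tag array, a data array, and dirty bits."""

    def __init__(self, memory):
        self.memory = memory                    # bytearray standing in for main memory
        self.tags = [None] * NUM_LINES          # tag array (None means invalid)
        self.dirty = [False] * NUM_LINES        # one dirty bit per line
        self.data = [bytearray(LINE_SIZE) for _ in range(NUM_LINES)]
        self.writebacks = 0                     # counts lines written back

    def _split(self, addr):
        # Break an address into (tag, index, byte offset within the line).
        line_addr = addr // LINE_SIZE
        return line_addr // NUM_LINES, line_addr % NUM_LINES, addr % LINE_SIZE

    def read(self, addr):
        tag, index, offset = self._split(addr)
        if self.tags[index] != tag:             # tag lookup: miss
            self._fill(tag, index)
        return self.data[index][offset]         # data array read

    def _fill(self, tag, index):
        old_tag = self.tags[index]
        if old_tag is not None and self.dirty[index]:
            # The victim line was modified, so write it back before replacing it.
            base = (old_tag * NUM_LINES + index) * LINE_SIZE
            self.memory[base:base + LINE_SIZE] = self.data[index]
            self.writebacks += 1
        # Load the requested line from memory, overwriting the victim.
        base = (tag * NUM_LINES + index) * LINE_SIZE
        self.data[index][:] = self.memory[base:base + LINE_SIZE]
        self.tags[index] = tag
        self.dirty[index] = False
```

A set-associative cache would additionally need a replacement policy to choose the victim; that choice is outside the scope of this sketch.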
To service a write request, the controller issues a read command to the tag array to look up the address to determine if there is a cache hit. If there is a cache hit, the controller issues a write command to the data array to update the data array with the new write data. When servicing a write request, the dirty bit is set to indicate that the cache data has been modified. Caches that implement sub-lines complicate the issue of how to implement and control dirty bits to track the state (clean or dirty) for each sub-line.
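The write-hit path described above can be sketched as a single function. The function name `service_write`, its parameters, and the direct-mapped indexing are assumptions for illustration, and the sketch keeps one dirty bit per line. Because the text does not describe how a write miss is handled, the function simply reports a miss and leaves the cache unchanged.

```python
LINE_SIZE = 64  # bytes per cache line (example size)


def service_write(tag_array, data_array, dirty_bits, addr, value):
    """Return True on a hit (data updated, dirty bit set), False on a miss."""
    num_lines = len(tag_array)
    line_addr = addr // LINE_SIZE
    index = line_addr % num_lines       # which line the address maps to
    tag = line_addr // num_lines        # tag expected in the tag array
    if tag_array[index] != tag:         # tag array lookup: miss
        return False
    data_array[index][addr % LINE_SIZE] = value   # data array write
    dirty_bits[index] = True                      # mark the line as modified
    return True
```

With sub-lines, `dirty_bits[index]` would have to hold one bit per sub-line rather than a single flag, which is the added bookkeeping the paragraph above refers to.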