In computing, a cache is a hardware or software component that stores data so that future requests for that data can be served faster; the data stored in a cache might be the result of an earlier computation or a copy of data stored elsewhere. A cache hit occurs when the requested data is found in the cache, while a cache miss occurs when it is not. Cache hits are served by reading data from the cache, which is faster than recomputing a result or reading from a slower data store such as main memory; thus, the more requests that can be served from the cache, the faster the system performs.
Computing hardware typically implements a cache as a block of memory for temporary storage of data likely to be used again. A cache is often part of a processor die or included in a data storage device to enable fast access to the data in the cache.
A cache is made up of a pool of entries, also called cache lines. Each entry has associated data, which is a copy of the same data in some other storage device. Each entry also has a tag, which specifies the identity of the data in the storage device of which the entry is a copy.
When the cache client (such as a processor) needs to access data presumed to exist in the storage device, the cache client first checks the cache. If a cache entry can be found with a tag matching that of the desired data, the data in the cache entry is used instead. This situation is known as a cache hit. The alternative situation, when the cache is consulted and found not to contain data with the desired tag, is known as a cache miss. The previously un-cached data fetched from the storage device during cache miss handling is usually copied into the cache, to be ready for the next access.
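The lookup sequence described above can be sketched in a few lines of Python. This is only an illustrative model, not how any particular hardware or library implements it: the class name `ReadCache`, the `hits` and `misses` counters, and the use of a plain dictionary as the slower storage device are all assumptions made for the example.

```python
class ReadCache:
    """Toy cache in front of a slower backing store (a plain dict here)."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # stands in for the storage device
        self.entries = {}                   # tag -> cached copy of the data
        self.hits = 0
        self.misses = 0

    def read(self, tag):
        # Cache hit: an entry with a matching tag exists, so use it.
        if tag in self.entries:
            self.hits += 1
            return self.entries[tag]
        # Cache miss: fetch from the backing store and keep a copy
        # so the next access to the same tag is a hit.
        self.misses += 1
        data = self.backing_store[tag]
        self.entries[tag] = data
        return data


store = {"a": 1, "b": 2}
cache = ReadCache(store)
first = cache.read("a")   # miss: fetched from the store and cached
second = cache.read("a")  # hit: served from the cache
```

In this sketch the cache never evicts anything, so it only models the hit and miss paths, not the limited capacity of a real cache.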
During a cache miss, the cache usually evicts some other entry to make room for the newly fetched data. The algorithm used to select the entry to evict is known as the replacement policy. One popular replacement policy, least recently used (LRU), replaces the entry that was accessed least recently with the newly fetched data.
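As a rough illustration of LRU replacement, the following Python sketch keeps entries in a `collections.OrderedDict`, ordered from least to most recently used. The class name `LRUCache`, the `capacity` parameter, and the dictionary used as the storage device are hypothetical choices for this example; real caches, especially hardware ones, often use cheaper approximations of LRU.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, backing_store, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.backing_store = backing_store
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest use first, newest use last

    def read(self, tag):
        if tag in self.entries:
            # Hit: mark this entry as the most recently used.
            self.entries.move_to_end(tag)
            return self.entries[tag]
        # Miss: fetch the data, evicting the least recently used
        # entry first if the cache is already full.
        data = self.backing_store[tag]
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)
        self.entries[tag] = data
        return data


store = {"a": 1, "b": 2, "c": 3}
lru = LRUCache(store, capacity=2)
lru.read("a")
lru.read("b")
lru.read("a")  # "a" is now more recently used than "b"
lru.read("c")  # the cache is full, so "b" is evicted
cached_tags = list(lru.entries)
```

After these four reads, the cache holds "a" and "c": the second read of "a" refreshed it, leaving "b" as the least recently used entry when "c" arrived.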
When a system writes data to cache, it must at some point write that data to the storage device as well. The timing of this write is controlled by what is known as the write policy.
There are two basic write policies: 1) Write-through: the write is performed synchronously both to the cache and to the storage device; and 2) Write-back: initially, the write is performed only to the cache, and the write to the storage device is postponed until the cache block containing the modified data is about to be replaced by new content.
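The write-through approach is the simpler of the two to model. The sketch below is a minimal illustration under assumed names (`WriteThroughCache`, `store_writes`), with a dictionary standing in for the storage device; it shows that every write reaches the storage device immediately, even when the same location is written repeatedly.

```python
class WriteThroughCache:
    """Toy write-through cache: every write also goes to the backing store."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # stands in for the storage device
        self.entries = {}                   # tag -> cached data
        self.store_writes = 0               # writes sent to the backing store

    def write(self, tag, data):
        # Update the cache and the backing store in the same operation,
        # so the two never disagree.
        self.entries[tag] = data
        self.backing_store[tag] = data
        self.store_writes += 1


store = {}
wt = WriteThroughCache(store)
for value in (1, 2, 3):
    wt.write("x", value)  # three writes to the same tag
```

Here three writes to the same tag cause three writes to the storage device. A write-back cache would absorb the repeated writes and send only the final value, at the cost of the extra bookkeeping that a deferred write requires.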
A write-back cache is more complex to implement, since it needs to track which of its locations have been written to and mark them as dirty for later writing to the storage device. The data in these locations is written back to the storage device only when it is evicted from the cache, an effect referred to as a lazy write. For this reason, a read miss in a write-back cache that replaces a dirty block requires two accesses to the storage device: one to write the replaced data from the cache back to the storage device, and one to retrieve the needed data.
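The dirty-tracking behaviour can be illustrated with a small Python model. Everything named here (`WriteBackCache`, `dirty`, `store_accesses`, the `capacity` parameter) is hypothetical, and the sketch makes two simplifying assumptions that the text does not specify: it evicts the oldest inserted entry, and a write to an uncached tag is placed in the cache without first fetching the old value.

```python
class WriteBackCache:
    """Toy write-back cache with dirty tracking and oldest-first eviction."""

    def __init__(self, backing_store, capacity):
        self.backing_store = backing_store
        self.capacity = capacity
        self.entries = {}        # tag -> data; dicts preserve insertion order
        self.dirty = set()       # tags modified since they were cached
        self.store_accesses = 0  # reads and writes sent to the backing store

    def _evict_if_full(self):
        if len(self.entries) < self.capacity:
            return
        victim = next(iter(self.entries))  # oldest inserted entry
        data = self.entries.pop(victim)
        if victim in self.dirty:
            # Lazy write: the modified data reaches the store only now.
            self.backing_store[victim] = data
            self.dirty.discard(victim)
            self.store_accesses += 1

    def write(self, tag, data):
        if tag not in self.entries:
            self._evict_if_full()
        self.entries[tag] = data
        self.dirty.add(tag)  # mark as dirty; no store access yet

    def read(self, tag):
        if tag in self.entries:
            return self.entries[tag]
        self._evict_if_full()  # may cost one store access (write-back)
        data = self.backing_store[tag]
        self.store_accesses += 1  # fetching the missing data
        self.entries[tag] = data
        return data


store = {"x": 0, "y": 5}
wb = WriteBackCache(store, capacity=1)
wb.write("x", 42)
stale = store["x"]           # the store still holds the old value
before = wb.store_accesses   # no store access has happened yet
value = wb.read("y")         # read miss that evicts the dirty "x"
```

With a capacity of one entry, the read of "y" must first write the dirty "x" back and then fetch "y", so a single read miss costs two accesses to the storage device.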