In computing, a cache is a hardware or software component that stores data so that future requests for that data can be served faster. The data stored in a cache might be the result of an earlier computation or a copy of data stored elsewhere. A cache hit occurs when the requested data is found in the cache, and a cache miss occurs when it is not. A hit is served by reading the data from the cache, which is faster than recomputing the result or reading it from a slower data store or memory. The more requests that can be served from the cache, the faster the system generally performs.
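The hit and miss behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the class name `ReadThroughCache`, the `load_from_store` callback, and the squaring function standing in for a slow data store are all hypothetical, and the sketch never evicts anything.

```python
class ReadThroughCache:
    """Toy cache placed in front of a slower backing store."""

    def __init__(self, load_from_store):
        self._load = load_from_store  # hypothetical slow lookup function
        self._data = {}               # cached key -> value pairs
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._data:         # cache hit: serve from the cache
            self.hits += 1
            return self._data[key]
        self.misses += 1              # cache miss: go to the slower store
        value = self._load(key)
        self._data[key] = value       # keep a copy for future requests
        return value

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


# Stand-in for an expensive computation or a slow data store.
cache = ReadThroughCache(lambda k: k * k)
results = [cache.get(k) for k in (3, 3, 4)]
```

In this run the first request for 3 and the request for 4 miss, and the second request for 3 hits, so one of the three requests is served from the cache.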
To be cost-effective and to enable efficient use of data, caches are relatively small. Nevertheless, caches have proven themselves in many areas of computing because typical computer applications tend to access data in recognizable patterns. These patterns typically exhibit locality of reference, i.e., data requested in the future tends to be similar in some way to previously requested data. Some access patterns exhibit temporal locality: data that was requested recently is likely to be requested again. Other patterns exhibit spatial locality: requests tend to target data that is physically stored close to data that has already been requested. Other forms of locality exist.
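As a rough illustration of spatial locality, the following Python sketch counts misses for a cache that fetches a whole block of neighboring addresses on each miss. The function name `count_misses`, the unbounded capacity, and the block size of 8 are assumptions made for the example only.

```python
def count_misses(addresses, block_size):
    """Count misses for an unbounded cache that fetches whole blocks."""
    cached_blocks = set()
    misses = 0
    for addr in addresses:
        block = addr // block_size    # block that contains this address
        if block not in cached_blocks:
            misses += 1               # first touch of the block is a miss
            cached_blocks.add(block)  # neighbors are now cached as well
    return misses


sequential = list(range(64))          # strong spatial locality
strided = list(range(0, 512, 8))      # one address per block, no reuse

sequential_misses = count_misses(sequential, block_size=8)
strided_misses = count_misses(strided, block_size=8)
```

Both traces contain 64 requests, but the sequential trace misses only 8 times because each fetched block serves the next seven requests, while the strided trace misses on every request.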
Generally, a cache line or block is the basic unit of cache storage and may include multiple bytes and/or words of data. A cache set is more akin to a row in the cache and generally includes a number of lines (ways) determined by the design of the cache: one line per set in a direct-mapped cache, several lines per set in a set-associative cache, and all lines in a single set in a fully associative cache.
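One common way to relate addresses to lines and sets is to split an address into a tag, a set index, and a byte offset within the line. The Python sketch below assumes this simple modulo-based mapping; the function name `split_address` and the geometry (64-byte lines, 128 sets) are illustrative choices, not taken from any particular cache.

```python
def split_address(addr, line_size, num_sets):
    """Split an address into (tag, set_index, offset) for a simple cache."""
    offset = addr % line_size         # byte position inside the line
    line_number = addr // line_size   # which line-sized chunk of memory
    set_index = line_number % num_sets  # set (row) the line maps to
    tag = line_number // num_sets     # identifies the line within its set
    return tag, set_index, offset


# Assumed geometry: 64-byte lines, 128 sets.
parts = split_address(0x12345, line_size=64, num_sets=128)

# A fully associative cache has a single set, so every line maps to set 0
# and the tag is simply the line number.
fully_assoc = split_address(0x12345, line_size=64, num_sets=1)
```

Under these assumptions, address 0x12345 yields tag 9, set index 13, and offset 5. The number of ways does not appear in the mapping; it only determines how many lines with different tags can share a set at the same time.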
Typically, due to the small size of the cache, an existing line must be evicted to make room for a new one. Often, the cache replaces lines based on age, ordering them from most recently used (MRU) to least recently used (LRU) and evicting the LRU line. A number of other cache replacement policies may be employed. Static cache replacement policies include LRU, which predicts temporal locality and is not resistant to thrashing; the LRU insertion policy (LIP), which assumes no temporal locality and does not adapt to changes in the working set; bimodal insertion, which varies the insertion position using static probabilities; and re-reference interval prediction (RRIP), which filters temporal data from non-temporal (or dead) lines and is not resistant to thrashing. Each policy has advantages and disadvantages, and no single policy is optimal for every situation.
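The difference between LRU and LIP can be sketched with a small recency-ordered structure in Python. This is a simplified model of a single fully associative set under stated assumptions: the class name `RecencyStack`, the helper `count_hits`, and the cyclic trace are all hypothetical, and both variants evict from the LRU end and promote a line to MRU on a hit. They differ only in where a newly filled line is inserted.

```python
from collections import OrderedDict


class RecencyStack:
    """One cache set ordered from LRU (first) to MRU (last)."""

    def __init__(self, capacity, insert_at_mru=True):
        self.capacity = capacity
        self.insert_at_mru = insert_at_mru  # True: LRU policy, False: LIP
        self._lines = OrderedDict()

    def access(self, tag):
        """Return True on a hit, False on a miss."""
        if tag in self._lines:
            self._lines.move_to_end(tag)     # promote to MRU on a hit
            return True
        if len(self._lines) >= self.capacity:
            self._lines.popitem(last=False)  # evict the LRU line
        self._lines[tag] = None              # new line lands at the MRU end
        if not self.insert_at_mru:
            self._lines.move_to_end(tag, last=False)  # LIP: move to LRU end
        return False


def count_hits(trace, capacity, insert_at_mru):
    stack = RecencyStack(capacity, insert_at_mru)
    return sum(stack.access(tag) for tag in trace)


# Cyclic working set of 5 lines in a 4-line set: a thrashing pattern.
trace = [0, 1, 2, 3, 4] * 4
lru_hits = count_hits(trace, capacity=4, insert_at_mru=True)
lip_hits = count_hits(trace, capacity=4, insert_at_mru=False)
```

On this trace the LRU variant never hits, because each line is evicted just before it is reused, while the LIP variant retains part of the working set and hits 9 times out of 20. The same retention is why LIP responds poorly when the working set changes: lines that have been promoted stay resident while new lines are inserted at the LRU end and evicted first.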