In computer engineering, a cache is a block of memory used for temporary storage of frequently accessed data so that future requests for that data can be serviced more quickly. As opposed to a buffer, which is managed explicitly by a client, a cache stores data transparently; thus, a client requesting data from a system is generally not aware that the cache exists. The data stored within a cache might consist of results of earlier computations or duplicates of original values that are stored elsewhere. A data cache is used to manage core accesses to data.
If requested data is contained in the cache, a condition often referred to as a cache hit, the request can be served by simply reading the cache, which is comparatively faster than accessing the data from main memory. Conversely, if the requested data is not contained in the cache, a condition often referred to as a cache miss, the data is recomputed or fetched from its original storage location, which is comparatively slower. Hence, the more requests that can be serviced from the cache, the better the overall system performance. In this manner, caches are generally used to improve processor core (core) performance in systems where the data accessed by the core is located in comparatively slow and/or distant memory (e.g., double data rate (DDR) memory).
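The hit and miss paths described above can be illustrated with a minimal sketch. The class name `ReadThroughCache`, the dictionary standing in for main memory, and the hit and miss counters are all hypothetical and chosen only for illustration; a real hardware cache is organized quite differently.

```python
class ReadThroughCache:
    """Toy cache placed in front of a slower backing store (illustrative only)."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # stands in for main memory
        self.lines = {}                     # cached address -> value
        self.hits = 0
        self.misses = 0

    def read(self, address):
        # Cache hit: the value is served directly from the cache.
        if address in self.lines:
            self.hits += 1
            return self.lines[address]
        # Cache miss: fetch from the original storage location, then keep a copy.
        self.misses += 1
        value = self.backing_store[address]
        self.lines[address] = value
        return value


# Hypothetical memory contents: address -> value.
memory = {addr: addr * 10 for addr in range(8)}
cache = ReadThroughCache(memory)

# Address 3 is read three times; only the first read goes to the backing store.
values = [cache.read(a) for a in (3, 3, 5, 3)]
```

The caller only ever invokes `read`, which reflects the transparency noted earlier: the client does not need to know whether a given value came from the cache or from the backing store.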
Since a cache is typically much smaller than main memory (for a number of reasons including, but not limited to, cost, system complexity, size, and power consumption), data stored in the cache may need to be replaced by data needed more recently. There are various known cache algorithms, also referred to as cache replacement algorithms or cache replacement policies, designed to manage the information stored in the cache, for example, least recently used (LRU), most recently used (MRU), and random replacement. A cache algorithm is essentially a set of optimizing instructions that a computer program or a hardware-maintained structure implements for managing a cache of information stored on the computer. When the cache is full, the cache algorithm selects which information in the cache to discard in order to make room for the newly requested information.
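As one illustration of a replacement policy, the following is a minimal software sketch of least recently used (LRU) eviction. It assumes a fixed capacity counted in entries and uses Python's `collections.OrderedDict` to track recency; the class name `LRUCache` and its `get` and `put` methods are hypothetical, and hardware caches typically approximate LRU with far cheaper bookkeeping.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that discards the least recently used entry."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest use first, newest use last

    def get(self, key):
        if key not in self.entries:
            return None  # cache miss
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        elif len(self.entries) >= self.capacity:
            # Cache is full: discard the least recently used entry.
            self.entries.popitem(last=False)
        self.entries[key] = value


lru = LRUCache(capacity=2)
lru.put("a", 1)
lru.put("b", 2)
lru.get("a")     # "a" becomes the most recently used entry
lru.put("c", 3)  # cache is full, so "b" (least recently used) is discarded
```

A most recently used (MRU) policy would differ only in the eviction step, discarding from the other end of the ordering.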
The hit rate of a given cache describes how often a requested data item is actually found in the cache. The latency of a cache describes how long after a request the cache returns the desired item (when there is a cache hit). Generally, it is desirable to keep the hit rate of the cache high while maintaining a low latency. Each cache replacement strategy represents a compromise between hit rate and latency, and a ratio of hit rate to latency is often used as a cache performance indicator.
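The two quantities and their ratio can be computed as in the short sketch below. The function names, the request counts, and the latencies in cycles are made-up values for illustration only and do not describe any particular cache.

```python
def hit_rate(hits, misses):
    """Fraction of requests that were found in the cache."""
    total = hits + misses
    if total == 0:
        raise ValueError("no requests recorded")
    return hits / total


def performance_indicator(rate, hit_latency):
    """Ratio of hit rate to hit latency; higher is better under this measure."""
    if hit_latency <= 0:
        raise ValueError("latency must be positive")
    return rate / hit_latency


# Hypothetical design A: higher hit rate, but slower to answer on a hit.
rate_a = hit_rate(hits=95, misses=5)
score_a = performance_indicator(rate_a, hit_latency=4.0)  # latency in cycles

# Hypothetical design B: slightly lower hit rate, but faster on a hit.
rate_b = hit_rate(hits=90, misses=10)
score_b = performance_indicator(rate_b, hit_latency=2.0)

better = "B" if score_b > score_a else "A"
```

With these assumed numbers, design B scores higher despite its lower hit rate, which is the kind of trade-off between hit rate and latency that the paragraph above describes.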