Caches are commonly used to temporarily store values that might be repeatedly accessed by a processor, in order to speed up processing by avoiding the longer step of loading the values from main memory such as random access memory (RAM).
An exemplary cache line (block) includes an address-tag field, a state-bit field, an inclusivity-bit field, and a value field for storing the actual instruction or data. The state-bit field and inclusivity-bit field are used to maintain cache coherency in a multiprocessor computer system. The address tag is a subset of the full address of the corresponding memory block. A compare match of an incoming effective address with one of the tags within the address-tag field indicates a cache “hit.” The collection of all of the address tags in a cache (and sometimes the state-bit and inclusivity-bit fields) is referred to as a directory, and the collection of all of the value fields is the cache entry array.
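The fields described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field widths (64-byte blocks, 128 sets), the MESI-style state letters, and the names `CacheLine`, `split_address`, and `lookup` are hypothetical and do not come from the description itself.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed geometry (not specified in the text): 64-byte blocks, 128 sets.
BLOCK_OFFSET_BITS = 6
INDEX_BITS = 7


@dataclass
class CacheLine:
    tag: int = 0             # address-tag field (upper bits of the address)
    state: str = "I"         # state-bit field; MESI letters assumed, "I" = invalid
    inclusive: bool = False  # inclusivity-bit field
    value: bytes = b""       # value field holding the instruction or data


def split_address(addr: int) -> Tuple[int, int, int]:
    """Split an address into (tag, set index, block offset)."""
    offset = addr & ((1 << BLOCK_OFFSET_BITS) - 1)
    index = (addr >> BLOCK_OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (BLOCK_OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


def lookup(cache_set: List[CacheLine], tag: int) -> Optional[CacheLine]:
    """Compare the incoming tag against each valid line; a match is a hit."""
    for line in cache_set:
        if line.state != "I" and line.tag == tag:
            return line
    return None  # miss
```

In this sketch the list of `tag`, `state`, and `inclusive` fields across all lines plays the role of the directory, and the `value` fields together play the role of the cache entry array.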
When all of the blocks in a set for a given cache are full and that cache receives a request, whether a “read” or a “write,” with a different tag address for a memory location that maps into the full set, the cache must “evict” one of the blocks currently in the set. The cache chooses the block to be evicted by one of a number of policies known to those skilled in the art, such as least recently used (LRU), random, or pseudo-LRU.
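The eviction step can be illustrated with a minimal sketch of a single set that uses the LRU policy. The class name `LRUSet`, its `ways` parameter, and the `access` method are hypothetical names chosen for this example; a hardware cache would track recency with dedicated bits rather than an ordered dictionary.

```python
from collections import OrderedDict


class LRUSet:
    """One set of a set-associative cache with LRU replacement (illustrative)."""

    def __init__(self, ways: int):
        self.ways = ways            # number of blocks the set can hold
        self.lines = OrderedDict()  # tag -> value, least recently used first

    def access(self, tag, value):
        """Access `tag`; return the evicted tag, or None if nothing was evicted."""
        if tag in self.lines:
            # Hit: mark the block as most recently used.
            self.lines.move_to_end(tag)
            return None
        victim = None
        if len(self.lines) >= self.ways:
            # Set is full and the tag differs: evict the least recently used block.
            victim, _ = self.lines.popitem(last=False)
        self.lines[tag] = value
        return victim
```

For example, in a two-way set that has seen tags 1, 2, and then 1 again, a request for tag 3 evicts tag 2, because tag 1 was touched more recently.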
A general-purpose cache receives memory requests from various entities, including input/output (I/O) devices, a central processing unit (CPU), graphics processors, and similar devices. The CPU initiates the heaviest access to and from system memory (and inherently the cache). Thus, the CPU's requests have relatively high bandwidth requirements and are often latency sensitive. Graphics processor requests may occur infrequently compared to CPU requests; however, graphics requests may be equally sensitive to latency. General-purpose caches fail to account for these inherent differences in requests.