The use of a cache memory with a processor reduces memory access time. A cache is implemented in hardware as a block of memory for temporary storage of data that is likely to be used again. Central processing units (CPUs) and hard disk drives (HDDs) frequently use a cache, as do web browsers and web servers. A cache is made up of a pool of entries. Each entry has associated data, which is a copy of the same data in a backing store. Each entry also has a tag, which specifies the identity of the data in the backing store of which the entry is a copy. When the cache client (for example, a CPU, web browser, or operating system) needs to access data presumed to exist in the backing store, the cache client first checks the cache.
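By way of illustration only, the entry, tag, and backing store relationship described above can be sketched as follows. The class name `Cache`, the method `read`, and the use of a dictionary for both the entries and the backing store are assumptions made for this sketch and are not drawn from any particular hardware design.

```python
class Cache:
    """A pool of entries; each entry pairs a tag with a copy of backing-store data."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # slower store holding the original data
        self.entries = {}                   # tag -> cached copy of the data

    def read(self, tag):
        """Return (data, hit), checking the cache before the backing store."""
        if tag in self.entries:
            # The tag matches an entry, so the cached copy is returned.
            return self.entries[tag], True
        # No matching entry: read the backing store and keep a copy.
        data = self.backing_store[tag]
        self.entries[tag] = data
        return data, False
```

In this sketch the first read of a given tag goes to the backing store and the second is served from the cached copy.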
The fundamental idea of cache organization is that by keeping the most frequently accessed instructions and data in the fast cache memory, the average memory access time will approach the access time of the cache. To achieve the maximum possible speed of operation, typical processors implement a cache hierarchy, that is, different levels of cache memory. The different levels of cache correspond to different distances from the processor core. The closer the cache is to the processor, the faster the data access. However, the faster the data access, the more costly it is to store data. As a result, the closer the cache level, the faster and smaller the cache.
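The relationship between hit frequency and average access time can be made concrete with a simple two-level model. The sketch below assumes a single cache in front of main memory and assumes that a miss costs only the main memory access time; the function name and the example latencies are hypothetical.

```python
def average_access_time(hit_ratio, cache_time, memory_time):
    """Average access time for one cache level in front of main memory.

    Simplified model (an assumption of this sketch): a hit costs cache_time,
    a miss costs memory_time, and no other overheads are counted.
    """
    return hit_ratio * cache_time + (1.0 - hit_ratio) * memory_time


# Hypothetical latencies: 1 time unit for the cache, 100 for main memory.
for ratio in (0.0, 0.5, 0.9, 1.0):
    print(ratio, average_access_time(ratio, 1.0, 100.0))
```

As the hit ratio approaches 1, the computed average approaches the access time of the cache, which is the behavior the paragraph above describes.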
The performance of cache memory is frequently measured in terms of its hit ratio, that is, the fraction of memory accesses that are satisfied by the cache. When the processor accesses memory and finds the requested data in cache, a cache hit is said to have occurred. If the requested data is not found in cache, a cache miss has occurred and the data must be obtained from main memory. If a miss occurs, then an allocation is made at the entry indexed by the access. The access can be for loading data to the processor or for storing data from the processor to memory. The cached information is retained by the cache memory until it is no longer needed, is invalidated, or is replaced by other data, in which case the cache entry is de-allocated.
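The hit, miss, allocation, and de-allocation behavior described above can be sketched with a small direct-mapped model. The direct-mapped organization, the way an address is split into index and tag, and all names below are assumptions chosen for brevity; actual caches may be set-associative and operate on multi-byte lines.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that tracks tags only and counts hits and misses."""

    def __init__(self, num_entries):
        self.num_entries = num_entries
        self.tags = [None] * num_entries  # None marks a de-allocated entry
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Return True on a hit; on a miss, allocate the indexed entry."""
        index = address % self.num_entries   # entry indexed by the access
        tag = address // self.num_entries    # identifies which address is cached
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.misses += 1
        self.tags[index] = tag               # allocation replaces any prior occupant
        return False

    def invalidate(self, address):
        """De-allocate the entry holding this address, if it is present."""
        index = address % self.num_entries
        if self.tags[index] == address // self.num_entries:
            self.tags[index] = None

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

With four entries, addresses 0 and 4 map to the same entry, so alternating between them causes each access to replace the other. This is one way in which an entry comes to be de-allocated by replacement.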
The data typically stored in cache includes the active portions of programs and their data. Certain instructions, however, are used infrequently. Locality of reference is a term for the situation in which the same values, or related storage locations, are frequently accessed, depending on the memory access pattern. Temporal locality refers to the reuse of specific data and/or resources within a relatively small time duration. Systems that exhibit strong temporal locality are candidates for performance optimization through the use of techniques such as caching. On the other hand, instructions that do not exhibit temporal locality gain no benefit from being written to cache. Since non-temporal instructions are used infrequently, optimal performance dictates that the cached application code and data not be overwritten by this infrequently used data.
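One way to picture the effect described above is a cache in which accesses marked as non-temporal are served without allocating an entry. The following is a minimal sketch under stated assumptions: the least-recently-used replacement policy, the `non_temporal` flag, and the class name are all hypothetical and are used only to show why bypassing allocation preserves frequently reused entries.

```python
from collections import OrderedDict


class BypassingCache:
    """Toy fully associative cache with LRU replacement and a non-temporal bypass."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # address -> True, ordered oldest to newest

    def access(self, address, non_temporal=False):
        """Return True on a hit. Non-temporal misses do not allocate an entry."""
        if address in self.entries:
            self.entries.move_to_end(address)  # mark as most recently used
            return True
        if non_temporal:
            # Served from memory; the cached working set is left untouched.
            return False
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # replace the least recently used entry
        self.entries[address] = True
        return False
```

If a two-entry instance holds addresses 1 and 2 and a long run of other addresses is then accessed with `non_temporal=True`, addresses 1 and 2 still hit afterwards. The same run without the flag would replace both entries.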