Cache memories are used to accelerate access to data on slow storage by managing a subset of the data in smaller, faster, and, typically, more expensive storage. Caches come in many shapes and forms, and can be embodied in hardware, such as central processing unit (CPU) caches, or in software, such as Memcached. They can also be stacked across several storage layers.
Caches can improve performance when data accesses are not uniformly random but instead exhibit locality properties, for example, spatial locality and temporal locality. With data having spatial locality, if a user accesses one datum, it is likely that the same user or another user will access other data that are close or similar to it. With data having temporal locality, if a user accesses one datum, it is likely that the same user or another user will access the same datum again soon. Locality makes data access patterns partly predictable, so it is advantageous to store in the smaller cache only those items that are predicted to be requested again soon. The fraction of all data accesses that are served by the cache is called the hit ratio. The hit ratio is one of the main metrics of a cache's effectiveness.
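As a minimal sketch of the hit-ratio definition above, the following Python snippet counts how many accesses in a trace are served by a cache with fixed contents. The function name hit_ratio, the sample trace, and the choice of cached keys are hypothetical and chosen only for illustration; a real cache would also change its contents over time.

```python
def hit_ratio(trace, cached_keys):
    """Return the fraction of accesses in `trace` served by `cached_keys`."""
    if not trace:
        return 0.0  # assumption: an empty trace is reported as a ratio of 0
    hits = sum(1 for key in trace if key in cached_keys)
    return hits / len(trace)


# Hypothetical access trace with temporal locality: "a" and "b" recur often.
trace = ["a", "b", "a", "c", "a", "b", "d", "a"]

# Caching only the two frequently requested keys serves 6 of the 8 accesses.
print(hit_ratio(trace, {"a", "b"}))  # 0.75
```

In this trace, a cache holding only two of the four distinct keys still serves three quarters of the accesses, which is the benefit that locality provides.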
Improving a cache's hit ratio can have a large economic impact. There have been many efforts to design better algorithms to predict which items to store in the cache and which to evict to make room for items more likely to be requested. One of the most commonly used and successful cache-replacement policies is the Least-Recently-Used (LRU) algorithm.
With the LRU algorithm, all items in the cache are logically arranged in a sorted queue, with the most recently accessed datum at the head of the queue, and the least recently accessed datum at the tail. Whenever an item is accessed, it is promoted to the head of the queue, pushing the items that were ahead of it one position down. Since the cache size, and correspondingly, the queue length, is finite, every time a new item is inserted into a full cache, one item must be evicted from the cache to make room for the new item. With the LRU algorithm, the evicted item is the one at the tail of the queue, that is, the least recently accessed datum.
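The queue behavior described above can be sketched in Python using collections.OrderedDict from the standard library. This is one possible illustration, not a definitive implementation: the class name LRUCache and its get and put methods are hypothetical, and the end of the ordered dictionary is assumed to play the role of the head of the queue.

```python
from collections import OrderedDict


class LRUCache:
    """Illustrative LRU cache: the last entry is the head, the first is the tail."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        """Return the cached value and promote it to the head, or None on a miss."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # promote the accessed item to the head
        return self._items[key]

    def put(self, key, value):
        """Insert or update an item, evicting the tail if the cache is full."""
        if key in self._items:
            self._items.move_to_end(key)  # an update also counts as an access
        elif len(self._items) >= self.capacity:
            self._items.popitem(last=False)  # evict the least recently used item
        self._items[key] = value


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" becomes the most recently used item
cache.put("c", 3)  # the cache is full, so "b" (the tail) is evicted
print(cache.get("b"))  # None
print(cache.get("a"))  # 1
```

An ordered dictionary is used here because it gives constant-time promotion and eviction, so no items are physically shifted when one is promoted; the "one position down" movement in the description is purely logical.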