Items found in cache memory can be retrieved faster than items in main memory. Caches are fast memories that store items temporarily in the expectation that they will be requested again soon, thus saving the time and effort of fetching them from main memory. However, caches are limited in capacity and must therefore be managed efficiently, so that useful information is retained and stale or useless information is quickly discarded.
Many systems still use older “static” policies such as LRU or LFU. More modern adaptive approaches (e.g., ARC and CAR) consistently outperform their non-adaptive competitors but were conceived prior to the current revolution in machine learning (ML). ML approaches have been attempted, but the performance of the resulting methods has not been competitive with the best of the current technology. Furthermore, these ML-based attempts at cache replacement are considerably less efficient, since they must simultaneously simulate multiple expensive cache replacement algorithms and keep track of the best expert among them at any given time.
Caches are limited memory storage devices and need a management algorithm that decides which items to store and which to discard. No single fixed policy handles this replacement decision well; the scheme needs to adapt to the input. ML can learn and anticipate changes in the input distribution and is therefore well suited to making cache replacement decisions. Caches are found in virtually every device that has a computing unit, so any winning cache replacement algorithm would have far-reaching impact and applications.
Higher hit rates in caches translate to faster memory accesses and faster computations. Cache replacement has not seen a major improvement in over a decade. Small caches are particularly relevant in small devices (mobile, IoT), where a better replacement policy could have an impact on the field.
The best-known strategies for cache replacement are LRU and CLOCK, both of which tend to retain pages with high recency, and LFU, which retains pages based on how frequently they have been referenced. These static strategies cannot adapt to changes in workloads and fail to have good all-round performance, especially when recent pages are not frequently accessed or when pages are accessed a number of times and then lapse into long periods of infrequent access.
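The two failure modes described above can be illustrated with a minimal sketch. The code below is only an illustration under stated assumptions: the function names `lru_hits` and `lfu_hits`, the cache size of 3, and both synthetic traces are hypothetical and not taken from any real workload or from a specific system. The LFU variant shown here keeps counts only for resident pages and breaks ties by least recent use, which is one of several possible conventions. CLOCK is omitted because it approximates LRU.

```python
from collections import OrderedDict


def lru_hits(trace, capacity):
    """Count hits for an LRU cache of the given capacity."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for page in trace:
        if page in cache:
            hits += 1
            cache.move_to_end(page)  # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used page
            cache[page] = True
    return hits


def lfu_hits(trace, capacity):
    """Count hits for an LFU cache (assumption: counts are kept only while
    a page is resident, and ties are broken by least recent use)."""
    freq = {}  # page -> reference count while resident
    last = {}  # page -> time of most recent reference
    hits = 0
    for t, page in enumerate(trace):
        if page in freq:
            hits += 1
            freq[page] += 1
        else:
            if len(freq) >= capacity:
                victim = min(freq, key=lambda p: (freq[p], last[p]))
                del freq[victim]
                del last[victim]
            freq[page] = 1
        last[page] = t
    return hits


# Hypothetical trace 1: two hot pages (0, 1) interleaved with pages that are
# referenced once and never again. Recent pages are not frequent pages.
scan_trace = [0, 1] * 5 + [p for i in range(10) for p in (0, 1, 100 + i, 200 + i)]

# Hypothetical trace 2: pages 0-2 are referenced many times, then go cold
# while a new working set (3-5) takes over.
shift_trace = [0, 1, 2] * 10 + [3, 4, 5] * 10

CAPACITY = 3  # illustrative cache size
print("scan :", lru_hits(scan_trace, CAPACITY), lfu_hits(scan_trace, CAPACITY))    # 10 28
print("shift:", lru_hits(shift_trace, CAPACITY), lfu_hits(shift_trace, CAPACITY))  # 54 27
```

On the first trace the one-time pages keep pushing the hot pages out of the LRU cache, so LFU collects far more hits; on the second trace the stale high counts of the old working set prevent LFU from ever retaining the new pages, so LRU wins. Neither static policy does well on both traces.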