Typically, a cache is memory that a processor may access more quickly than random access memory (RAM) on a main memory chip. Caches may be identified based on how close and accessible they are to the processor. For example, a first-level unified (L1) cache may reside on the same chip as the processor. When the processor executes an instruction, the processor first looks in its on-chip cache for the data associated with that instruction, to avoid performing a more time-consuming search for the data elsewhere (e.g., off-chip or in RAM on a main memory chip).
Caches implemented in current processor systems are typically unaware of how cache lines are allocated to multiple incoming application streams. When a processor issues a load/store request for a data block, for example, the processor checks only whether the data block is in the cache. If the data block is not in the cache, the cache controller issues a request to the main memory. Upon receiving a response from the main memory, the cache controller allocates the data block into the cache, without regard to which application stream requested it.
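The lookup, miss, and allocate sequence described above may be illustrated with a minimal sketch. The class name `SimpleCache`, the dictionary standing in for main memory, and the least-recently-used (LRU) replacement policy are assumptions made for illustration only; the text above does not specify a replacement policy.

```python
from collections import OrderedDict


class SimpleCache:
    """Hypothetical stream-unaware cache; LRU replacement is an assumption."""

    def __init__(self, capacity, main_memory):
        self.capacity = capacity          # number of data blocks the cache holds
        self.main_memory = main_memory    # dict standing in for main memory
        self.lines = OrderedDict()        # address -> data, oldest entry first
        self.hits = 0
        self.misses = 0

    def load(self, address):
        # The processor only checks whether the block is in the cache.
        if address in self.lines:
            self.hits += 1
            self.lines.move_to_end(address)   # mark as most recently used
            return self.lines[address]

        # Miss: the cache controller requests the block from main memory.
        self.misses += 1
        data = self.main_memory[address]

        # Allocate the block, evicting the least recently used block if full.
        # Nothing here records which application stream made the request.
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)
        self.lines[address] = data
        return data
```

In this sketch the allocation step has no notion of which stream issued the request, so every stream competes for the same cache lines on equal terms.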
In processor systems employing multi-threaded cores, multi-core processors, multi-tasked cores, and/or virtualized cores, multiple incoming application streams may interfere with each other and, as a result, may cause a shared cache to operate inefficiently. Sharing cache space among multiple incoming application streams with equal priority often results in sub-optimal allocation of cache resources to the more important memory-intensive application(s).
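The interference described above may be illustrated with a toy model. The two streams, the cache capacity of four blocks, and the LRU replacement policy are all hypothetical choices made for this sketch, not details taken from the text: stream "A" repeatedly reuses a four-block working set, while stream "B" touches each block only once.

```python
from collections import OrderedDict


def run(accesses, capacity):
    """Replay (stream, address) accesses through one LRU cache.

    Returns a dict mapping each stream name to its number of hits.
    """
    cache = OrderedDict()   # (stream, address) keys, oldest entry first
    hits = {}
    for stream, address in accesses:
        hits.setdefault(stream, 0)
        key = (stream, address)
        if key in cache:
            hits[stream] += 1
            cache.move_to_end(key)          # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)   # evict least recently used block
            cache[key] = True
    return hits


# Stream "A" cycles over 4 blocks; stream "B" never reuses a block.
a_stream = [("A", i % 4) for i in range(40)]
b_stream = [("B", i) for i in range(40)]

# Stream A running alone in a 4-block cache.
alone = run(a_stream, capacity=4)

# Streams A and B interleaved access by access in the same 4-block cache.
interleaved = [access for pair in zip(a_stream, b_stream) for access in pair]
shared = run(interleaved, capacity=4)
```

Under these assumptions, stream A hits on 36 of its 40 accesses when running alone, but on none of them when interleaved with stream B, because the blocks that B allocates (and never reuses) push A's working set out of the cache before A returns to it.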