Typically, a cache includes memory placed between a shared system memory and the execution units of a processor, to hold information in closer proximity to those execution units. In addition, a cache is typically smaller than a shared main system memory, which allows it to be built from more expensive, faster memory, such as Static Random Access Memory (SRAM). Both the proximity to the execution units and the speed of the memory allow caches to provide faster access to data and instructions. Caches are often identified by their proximity to the execution units of a processor. For example, a first-level (L1) cache may be close to execution units residing on the same physical processor. Due to its proximity and placement, the first-level cache is often the smallest and quickest cache. A computer system may also hold higher-level, or further out, caches, such as a second-level cache, which may also reside on the processor but be placed between the first-level cache and main memory, and a third-level cache, which may be placed on the processor or elsewhere in the computer system, such as at a controller hub, between the second-level cache and main memory.
When a processor requests an element, such as a data operand or instruction, from memory, the cache is checked first to see whether the element resides in the cache, in which case it may be provided quickly to the execution units without waiting to fetch the element from main memory. Currently, caches are typically unaware of how cache lines are allocated to multiple incoming application streams. When a processor issues a load/store request for a data block, for example, the processor only checks whether the data block is in the cache. If the data block is not in the cache, the cache controller issues a request to main memory. Upon receiving a response from main memory, the cache controller allocates the data block into the cache. Often, the cache line to be replaced by the newly retrieved block of data is selected by a time-based or use-based algorithm, such as a Least Recently Used (LRU) cache replacement algorithm.
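By way of illustration only, the lookup, miss, and allocation sequence described above may be sketched in Python as follows. The sketch assumes a small fully associative cache and models main memory as a dictionary; the class name `LRUCache`, the method `access`, and the parameter `num_lines` are hypothetical names introduced for this example and are not taken from any particular cache controller.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully associative cache with LRU replacement (illustrative only)."""

    def __init__(self, num_lines, main_memory):
        self.num_lines = num_lines        # assumed capacity, in cache lines
        self.main_memory = main_memory    # assumed model: dict of address -> data block
        self.lines = OrderedDict()        # address -> data, least recently used first
        self.hits = 0
        self.misses = 0

    def access(self, address):
        # Hit: the block is already cached, so mark it most recently used.
        if address in self.lines:
            self.hits += 1
            self.lines.move_to_end(address)
            return self.lines[address]

        # Miss: request the block from main memory.
        self.misses += 1
        data = self.main_memory[address]

        # Allocate the block, evicting the least recently used line if full.
        if len(self.lines) >= self.num_lines:
            self.lines.popitem(last=False)
        self.lines[address] = data
        return data


memory = {addr: f"block-{addr}" for addr in range(8)}
cache = LRUCache(num_lines=2, main_memory=memory)
for addr in (0, 1, 0, 2, 1):
    cache.access(addr)
```

Note that `access` takes no account of which application stream issued the request; the replacement decision depends only on recency of use.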
In processor systems employing multi-threaded cores, multi-core processors, multi-tasked cores, and/or virtualized cores, multiple incoming application streams may interfere with each other and, as a result, may cause a shared cache to operate inefficiently. For example, a low priority incoming application stream may be associated with a lower priority level than that of a high priority application stream. However, the low priority stream may issue more allocation requests, which potentially allows it to monopolize the cache, i.e., to evict lines associated with the high priority application stream, which may degrade the performance of the high priority application stream.
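The interference described above may be illustrated with a small simulation, offered as a sketch under stated assumptions rather than as a model of any real processor. It assumes a shared, fully associative LRU cache, a high priority stream that repeatedly touches a four-block working set, and a low priority stream that touches a new block on every request. The function name `high_priority_hit_rate` and its parameters are hypothetical.

```python
from collections import OrderedDict


def high_priority_hit_rate(cache_lines, low_requests_per_high, rounds=100):
    """Return the hit rate seen by the high priority stream (toy model)."""
    cache = OrderedDict()  # (stream, address) -> True, least recently used first

    def access(stream, address):
        key = (stream, address)
        if key in cache:
            cache.move_to_end(key)       # hit: mark most recently used
            return True
        if len(cache) >= cache_lines:
            cache.popitem(last=False)    # miss: evict the LRU line, whoever owns it
        cache[key] = True
        return False

    high_working_set = 4                 # assumed size of the high priority working set
    high_hits = 0
    next_low_address = 0
    for i in range(rounds):
        # One high priority request per round, cycling over its working set.
        if access("high", i % high_working_set):
            high_hits += 1
        # The low priority stream issues several requests to fresh blocks.
        for _ in range(low_requests_per_high):
            access("low", next_low_address)
            next_low_address += 1
    return high_hits / rounds


alone = high_priority_hit_rate(cache_lines=8, low_requests_per_high=0)
shared = high_priority_hit_rate(cache_lines=8, low_requests_per_high=4)
```

Under these assumptions, the high priority stream running alone misses only on its first four accesses, whereas with four low priority requests interleaved per high priority request, every high priority line is evicted before it is reused. The priority level of a stream plays no part in the eviction decision in this model.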