Processing systems may comprise one or more levels of caches configured, for example, between a processor and a main memory. The processor may first access a level one cache (or “L1 cache”), and if there is a miss in the L1 cache for a cache line, a level two cache (or “L2 cache”), if available, may be consulted. If there is also a miss in the L2 cache for the cache line, a level three cache (or “L3 cache”), if available, may be consulted, and so on, until the cache line is found in a backing storage location such as a higher-level cache or the main memory.
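By way of illustration only, the level-by-level lookup described above may be sketched as follows. The function name `lookup`, the use of dictionaries to stand in for caches, and the example addresses are assumptions made for this sketch and do not correspond to any particular hardware implementation.

```python
from typing import Dict, List, Tuple


def lookup(address: int,
           caches: List[Dict[int, str]],
           main_memory: Dict[int, str]) -> Tuple[str, str]:
    """Walk the cache levels in order and return (data, where_found)."""
    for level, cache in enumerate(caches, start=1):
        if address in cache:
            # Hit at this level: stop consulting further levels.
            return cache[address], f"L{level}"
    # Miss in every cache level: fall back to main memory.
    return main_memory[address], "main memory"


# Hypothetical contents: each cache line resides in exactly one cache level.
l1 = {0x100: "A"}
l2 = {0x200: "B"}
l3 = {0x300: "C"}
memory = {0x100: "A", 0x200: "B", 0x300: "C", 0x400: "D"}

hits = [lookup(a, [l1, l2, l3], memory)[1]
        for a in (0x100, 0x200, 0x300, 0x400)]
print(hits)
```

In this sketch, each successive address misses in one more cache level than the previous one, and the last address is found only in the main memory.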
A processing system may implement several resources for servicing the various cache misses which may occur. To take advantage of instruction-level and memory-level parallelism, a plurality of resources may be provided to support servicing multiple cache misses at the same time. For example, buffers known as fill buffers may be provided for servicing cache misses in an L1 cache. The fill buffers may receive cache lines (e.g., missing cache lines from one or more backing storage locations), and the cache lines may be installed in the L1 cache from the fill buffers. There may be a limited number of ports through which the L1 cache may be accessed, and so arbitration may be performed between cache lines held in multiple fill buffers before the cache lines are installed into the L1 cache. As can be appreciated, the resources provided (e.g., the number of ports, the number of fill buffers, etc.) for servicing the L1 cache misses may be bounded by area and power considerations, as well as timing considerations (e.g., the latency incurred by possible arbitration processes which may be involved in servicing multiple L1 cache misses).
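As a simplified illustration of the arbitration described above, the following sketch counts how many cycles may be needed to install cache lines from fill buffers into an L1 cache through a limited number of ports. The oldest-first arbitration policy, the one-install-per-port-per-cycle model, and the name `install_cycles` are assumptions of this sketch; an actual design may arbitrate differently.

```python
from collections import deque
from typing import List, Sequence, Tuple


def install_cycles(ready_lines: Sequence[str],
                   num_ports: int) -> Tuple[int, List[str]]:
    """Install lines held in fill buffers into the L1 cache.

    Assumes oldest-first arbitration and that each port can install
    one cache line per cycle. Returns (cycles_used, install_order).
    """
    pending = deque(ready_lines)      # lines waiting in fill buffers
    installed: List[str] = []
    cycles = 0
    while pending:
        # Each cycle, the arbiter grants at most num_ports fill buffers.
        for _ in range(min(num_ports, len(pending))):
            installed.append(pending.popleft())
        cycles += 1
    return cycles, installed


# Four fill buffers holding ready lines (hypothetical line names).
lines = ["line0", "line1", "line2", "line3"]
one_port = install_cycles(lines, num_ports=1)
two_ports = install_cycles(lines, num_ports=2)
print(one_port[0], two_ports[0])
```

Under these assumptions, halving the number of ports doubles the cycles spent installing the same set of lines, which reflects the timing cost of arbitration noted above.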
In situations such as a context switch, the L1 cache, for example, may experience a large burst in the number of cache requests, and correspondingly, the number of cache misses. While it is desirable to service the burst of cache misses quickly and efficiently (e.g., taking advantage of memory-level parallelism), conventional processing systems may only be able to service a limited number of cache misses at any given time due to the limited number of fill buffers and related resources available for servicing the cache misses. Any additional cache misses may be stalled until the fill buffers and other resources for servicing the additional cache misses become available.
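The stalling behavior described above may be illustrated with a minimal sketch, given here under simplifying assumptions: all cache misses in the burst arrive at cycle 0, each miss occupies one fill buffer for a fixed latency, and a miss that finds no free fill buffer waits for the earliest one to be released. The name `simulate_burst` and the numeric values are hypothetical.

```python
import heapq
from typing import Tuple


def simulate_burst(num_misses: int,
                   num_fill_buffers: int,
                   fill_latency: int) -> Tuple[int, int]:
    """Return (cycle when the last miss completes, number of stalled misses)."""
    # Each entry is the cycle at which one fill buffer becomes free.
    free_at = [0] * num_fill_buffers
    heapq.heapify(free_at)
    stalled = 0
    last_done = 0
    for _ in range(num_misses):          # all misses arrive at cycle 0
        start = heapq.heappop(free_at)   # earliest available fill buffer
        if start > 0:
            stalled += 1                 # this miss had to wait for a buffer
        done = start + fill_latency
        heapq.heappush(free_at, done)
        last_done = max(last_done, done)
    return last_done, stalled


small = simulate_burst(num_misses=4, num_fill_buffers=4, fill_latency=20)
burst = simulate_burst(num_misses=10, num_fill_buffers=4, fill_latency=20)
print(small, burst)
```

With four fill buffers, a burst of four misses completes without stalling, whereas in a burst of ten misses the additional six must wait for fill buffers to become available.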
Accordingly, conventional processing systems are deficient in handling situations (e.g., context switches) in which the fill buffers and related resources are unavailable or busy while additional cache misses are waiting to be serviced. There is a corresponding need in the art to overcome these deficiencies.