A congruence class of an N-way associative cache can be over allocated due to multiple misses from a multiple issue processor to the same congruence class. The over allocation occurs when all of the “ways” of the congruence class have pending line fills and another miss occurs, triggering another cache line allocation. For example, a 4-way associative cache may have 4 line fills pending (i.e., a pending line fill for each way) when another access arrives at the cache that would cause another allocation to occur.
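The 4-way example can be illustrated with a minimal sketch. The class name `CongruenceClass`, the method `miss`, and the tag values are hypothetical and chosen only for illustration; the model tracks nothing except which ways have a line fill pending.

```python
from dataclasses import dataclass, field


@dataclass
class CongruenceClass:
    """Hypothetical model of one congruence class of an N-way cache."""
    num_ways: int = 4
    # Tags with a line fill pending; each pending fill occupies one way.
    pending: list = field(default_factory=list)

    def miss(self, tag):
        """Try to allocate a way for a missed tag.

        Returns False when every way already has a pending line fill,
        which is the over allocation condition.
        """
        if len(self.pending) >= self.num_ways:
            return False
        self.pending.append(tag)
        return True


cc = CongruenceClass()
# Five misses to the same congruence class before any line fill completes.
results = [cc.miss(tag) for tag in (0xA, 0xB, 0xC, 0xD, 0xE)]
print(results)  # [True, True, True, True, False]
```

The fifth miss finds no way available, which is the situation the approaches discussed here must handle.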
Several solutions exist to deal with the over allocation. A first approach compares the addresses of the incoming requests to the addresses of current requests being processed. The incoming requests are then held until the results of the current requests are known. Each of the incoming requests is released in turn when safe to do so (i.e., the previous request resulted in a cache hit and did not cause an allocation). Otherwise, the hold is maintained on the incoming requests until a safe situation exists (i.e., after the line fill completes). The first approach avoids the over allocation by limiting the allocation to one per congruence class. However, performance is lost because some incoming requests that could be processed are held instead. For certain access patterns, the first approach prevents streaming of cache “hit” data while waiting to determine the hit or miss status of a current request. For example, a user doing a series of loads to the same cache line would expect the data to be returned at the same rate at which the loads were received (back-to-back), without the additional gaps caused by the holds.
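The cost of the holds can be sketched with a simple timing model. Everything in it is an assumption made for illustration: the constant `LOOKUP_LATENCY` (cycles until the hit or miss result of a request is known), the function `issue_cycles`, and the restriction to the case where every request hits, so no line fill delay is modeled.

```python
LOOKUP_LATENCY = 3  # assumed cycles until a request's hit/miss result is known


def issue_cycles(arrivals, hold_on_match):
    """Return the cycle at which each request enters the cache.

    arrivals: list of (arrival_cycle, congruence_class), in arrival order.
    hold_on_match: if True, a request to a congruence class that has an
    unresolved earlier request is held until that result is known.
    """
    result_known = {}  # congruence class -> cycle its latest request resolves
    issued = []
    for cycle, cls in arrivals:
        start = cycle
        if hold_on_match and cls in result_known:
            start = max(start, result_known[cls])
        result_known[cls] = start + LOOKUP_LATENCY
        issued.append(start)
    return issued


# Four back-to-back loads to the same cache line (congruence class 5).
loads = [(0, 5), (1, 5), (2, 5), (3, 5)]
held = issue_cycles(loads, hold_on_match=True)
streamed = issue_cycles(loads, hold_on_match=False)
print(held)      # [0, 3, 6, 9]
print(streamed)  # [0, 1, 2, 3]
```

Under these assumptions the held loads enter the cache every three cycles instead of every cycle, which corresponds to the gaps in returned hit data described above.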
A second approach sets a cache state to pending, instead of invalid, when an allocation takes place. The pending state would then remove the just-allocated cache line from a replacement policy calculation. Removal from the replacement policy calculation allows multiple requests for the same congruence class to be sent to the cache, avoiding the performance issues with the first approach. However, the second approach does not solve the issue of what to do when an allocation is triggered and all of the ways are in the pending state.
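A minimal sketch of the second approach follows, assuming a least recently used replacement policy, which the text does not specify. The `State` enumeration and the function `choose_victim` are hypothetical names. The sketch shows both the benefit (pending ways are skipped) and the unresolved case (every way pending).

```python
from enum import Enum


class State(Enum):
    INVALID = 0
    VALID = 1
    PENDING = 2  # set at allocation time, instead of INVALID


def choose_victim(states, lru_order):
    """Pick the least recently used way that is not PENDING.

    states: per-way cache state.
    lru_order: way indices, least recently used first (assumed policy).
    Returns None when every way is PENDING; the second approach does not
    define what should happen in that case.
    """
    for way in lru_order:
        if states[way] is not State.PENDING:
            return way
    return None


mixed = [State.PENDING, State.VALID, State.PENDING, State.VALID]
all_pending = [State.PENDING] * 4
victim = choose_victim(mixed, [0, 1, 2, 3])
no_victim = choose_victim(all_pending, [0, 1, 2, 3])
print(victim)     # 1
print(no_victim)  # None
```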
A third approach uses a separate status bit to indicate that a cache entry is in a pending state waiting for a line fill to complete, rather than encoding the pending state in the cache state bits. The second and third approaches solve the performance issue, but neither prevents the over allocation, because neither can determine what to do when all of the ways of a congruence class are marked pending.
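The separate status bit can be sketched as a flag kept alongside the ordinary cache state. The names `Way`, `fill_pending`, `allocate`, and `fill_complete` are hypothetical, and the first-available way selection is a simplification standing in for whatever replacement policy a real design would use.

```python
from dataclasses import dataclass


@dataclass
class Way:
    state: str = "invalid"      # ordinary cache state bits (illustrative names)
    fill_pending: bool = False  # separate status bit, set while a fill is outstanding


def allocate(ways):
    """Mark the first way without a pending fill; return its index or None."""
    for index, way in enumerate(ways):
        if not way.fill_pending:
            way.fill_pending = True
            return index
    return None  # every way is marked pending: no defined behavior


def fill_complete(ways, index):
    """Line fill data has arrived: clear the status bit and validate the line."""
    ways[index].fill_pending = False
    ways[index].state = "valid"


ways = [Way() for _ in range(4)]
allocated = [allocate(ways) for _ in range(5)]  # fifth allocation finds no way
fill_complete(ways, 2)
reallocated = allocate(ways)  # way 2 is eligible again after its fill completes
print(allocated)    # [0, 1, 2, 3, None]
print(reallocated)  # 2
```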