Multi-processor systems that comprise hierarchical store-through cache structures have an increasing number of private store-through caches vying for access to shared embedded dynamic random access memory (EDRAM) caches. This generally results in a large amount of store traffic to the shared EDRAM cache, which must be processed quickly to prevent store queues from backing up and holding up exclusive invalidates sent by other processors. Complicating this requirement is the use of EDRAM to implement a large cache, which has a longer cache busy time. The longer busy time translates to a longer interleave wait time and a higher potential for livelocks when competing with other requestors targeting the same interleaves.
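
The relationship between cache busy time and interleave wait time can be illustrated with a minimal sketch. The model is a simplifying assumption made here for illustration and is not taken from the system described: each interleave accepts one access at a time and stays busy for a fixed number of cycles, and the function name `total_wait_cycles` and the request format are hypothetical. The sketch only shows how waiting grows with busy time; it does not model arbitration or livelock.

```python
def total_wait_cycles(requests, busy_time):
    """Sum the cycles that requests spend waiting for a busy interleave.

    requests: list of (arrival_cycle, interleave_id), sorted by arrival.
    busy_time: cycles an interleave stays busy after accepting an access
               (hypothetical fixed value, for illustration only).
    """
    free_at = {}  # interleave_id -> first cycle at which it can accept an access
    total_wait = 0
    for arrival, interleave in requests:
        # An access starts when it arrives or when the interleave frees up,
        # whichever is later.
        start = max(arrival, free_at.get(interleave, 0))
        total_wait += start - arrival
        free_at[interleave] = start + busy_time
    return total_wait


# Four stores arriving on consecutive cycles, alternating between two interleaves.
stores = [(0, 0), (1, 1), (2, 0), (3, 1)]
short_busy = total_wait_cycles(stores, busy_time=2)
long_busy = total_wait_cycles(stores, busy_time=8)
print(short_busy, long_busy)  # prints "0 12"
```

With a busy time of 2 cycles, each interleave is free again by the time the next store targeting it arrives, so no store waits. With a busy time of 8 cycles, the same traffic pattern makes the third and fourth stores wait 6 cycles each.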