Processing systems typically implement one or more compute complexes, each compute complex having multiple processor cores and a cache hierarchy which has two or more levels of caches. A latency period is associated with the time between when a processor core requests data and the time the requested data is received. To minimize the time that processor cores spend idling and waiting for data, many processing systems use cache memory in the cache hierarchy to store temporary copies of program instructions and data. In the cache hierarchy, each processor core is associated with one or more levels of caches that are private to a corresponding core (hereinafter, the “private caches”). The processing system further implements a shared cache at another level of the cache hierarchy, wherein the shared cache is shared among the processor cores of the compute complex (hereinafter, the “shared cache”).