Processing systems often utilize a cache hierarchy for each processing node in the system. Each cache hierarchy includes multiple levels of caches available for access by one or more processor cores of the corresponding node. To maintain intra-node and inter-node coherency, such systems often employ probes that communicate requests to access blocks of data, or status updates, between the various caches within the system. The volume of such probes can impact the performance of a processing system. As such, the cache hierarchy often employs a cache coherence directory (also commonly referred to as a “probe filter”) that tracks the coherency status of the cachelines held in the cache hierarchy and filters out unnecessary probes, thereby reducing system traffic and access latency.
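By way of illustration only, the filtering role of such a directory can be sketched as follows. This is a simplified model under stated assumptions rather than a description of any particular implementation: the class name `ProbeFilter`, its methods, and the cache identifiers are hypothetical, and the model tracks only which caches hold a copy of a cacheline, not the full coherency state.

```python
class ProbeFilter:
    """Toy directory: maps a cacheline address to the caches holding a copy.

    All names here are hypothetical and chosen for illustration.
    """

    def __init__(self, all_caches):
        self.all_caches = set(all_caches)
        self.sharers = {}  # cacheline address -> set of cache ids with a copy

    def record_fill(self, line_addr, cache_id):
        """Note that cache_id now holds the cacheline at line_addr."""
        self.sharers.setdefault(line_addr, set()).add(cache_id)

    def probe_targets(self, line_addr, requester):
        """Caches that must be probed; every other cache is filtered out."""
        return self.sharers.get(line_addr, set()) - {requester}

    def broadcast_targets(self, requester):
        """Caches probed when no directory is available (for comparison)."""
        return self.all_caches - {requester}


pf = ProbeFilter(["c0", "c1", "c2", "c3"])
pf.record_fill(0x1000, "c2")
filtered = pf.probe_targets(0x1000, "c0")   # only the cache holding the line
untracked = pf.probe_targets(0x2000, "c0")  # no cache holds it: no probes
unfiltered = pf.broadcast_targets("c0")     # every other cache is probed
```

In this sketch a request for a tracked cacheline probes only the cache recorded as holding it, whereas a system without the directory would probe every other cache.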
One common implementation of a probe filter is a page-based cache coherence directory that tracks groups of contiguous cachelines, with these groups frequently referred to as “cache pages.” Thus, the cache coherence directory has a set of entries, with each entry available to store status information for a corresponding cache page for a given cache. However, due to cost and die size restrictions, the size of the cache coherence directory is limited, and thus the number of cache page entries of the cache coherence directory is limited. As such, the cache coherence directory may not have a cache page entry available for every cache page that may have a cacheline cached in the cache hierarchy, particularly in systems utilizing large level 2 (L2) or level 3 (L3) caches. When the cache coherence directory becomes oversubscribed, the cache coherence directory must selectively evict cache pages to make room for incoming cache pages by deallocating the corresponding cache page entry of each evicted cache page. The deallocation of a cache page entry in the cache coherence directory triggers a recall of the cachelines of the associated evicted cache page, which results in all of the data associated with the evicted cache page being made unavailable from the cache hierarchy. A subsequent request for data in the evicted cache page therefore will necessarily require a memory access to obtain the requested data, which incurs a considerable access latency. Conventional approaches to reducing such recalls include either increasing the size of the cache coherence directory or increasing the size of the cache pages. However, increasing the size of the cache coherence directory increases die size and power consumption, which may be impracticable. Likewise, increasing the cache page size increases the likelihood of cache coherence directory oversubscription when executed workloads use relatively few cachelines from each cache page.
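The oversubscription and recall behavior described above can be sketched, for illustration only, as follows. The sketch rests on several assumptions that do not come from the foregoing description: a 64-byte cacheline, 32 cachelines per cache page, a least-recently-used victim selection policy, and the hypothetical names `PageDirectory`, `track`, `LINE_BYTES`, and `LINES_PER_PAGE`.

```python
from collections import OrderedDict

LINE_BYTES = 64       # assumed cacheline size, for illustration
LINES_PER_PAGE = 32   # assumed number of cachelines per cache page


class PageDirectory:
    """Toy page-based directory with a fixed number of cache page entries."""

    def __init__(self, num_entries):
        self.num_entries = num_entries
        self.entries = OrderedDict()  # page number -> set of cached line indices
        self.recalled_lines = 0

    @staticmethod
    def page_of(addr):
        return addr // (LINE_BYTES * LINES_PER_PAGE)

    @staticmethod
    def line_of(addr):
        return (addr // LINE_BYTES) % LINES_PER_PAGE

    def track(self, addr):
        """Record that the cacheline at addr is cached.

        Returns the (page, line index) pairs recalled to make room, if any.
        """
        page = self.page_of(addr)
        recalled = []
        if page in self.entries:
            self.entries.move_to_end(page)  # mark entry as most recently used
        else:
            if len(self.entries) >= self.num_entries:
                # Oversubscribed: deallocate the least recently used entry
                # (assumed policy) and recall every cacheline it tracked.
                victim, lines = self.entries.popitem(last=False)
                recalled = [(victim, idx) for idx in sorted(lines)]
                self.recalled_lines += len(recalled)
            self.entries[page] = set()
        self.entries[page].add(self.line_of(addr))
        return recalled


directory = PageDirectory(num_entries=2)
directory.track(0x0000)  # cache page 0, cacheline 0
directory.track(0x0040)  # cache page 0, cacheline 1
directory.track(0x0800)  # cache page 1, cacheline 0
recalled = directory.track(0x1000)  # cache page 2 forces eviction of page 0
```

In this usage, only four cachelines are cached, yet they span three cache pages and the two-entry directory is oversubscribed; allocating the entry for the third cache page recalls both cachelines of the first, so a later request for either would have to go to memory.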