The use of multiple processors, or of processors with multiple cores, has become an increasingly common way to increase the computing power of new computer systems. Multiprocessor and multicore systems share system resources such as system memory and storage devices. Multiple processors or cores often access the same data in memory or storage devices and attempt to use this data at the same time. To make this sharing safe, multiprocessor and multicore systems track the use of data to maintain data coherency. One facet of maintaining data coherency in multiprocessor systems is ensuring that data cached in each processor is coherent. For example, each processor may alter data in its cache before writing it back to system memory. If another processor requests this data from system memory before the altered data is written back to memory, data coherency is lost.
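The loss of coherency described above can be illustrated with a minimal sketch. The `WriteBackCache` class, the address `0x40`, and the stored values are hypothetical and chosen only for illustration; the model assumes two processors with private write-back caches and no coherency mechanism at all.

```python
# Toy model (illustrative assumption): private write-back caches, no coherency protocol.
memory = {0x40: 1}  # system memory: address -> value


class WriteBackCache:
    """Hypothetical per-processor cache that writes to memory only on write_back()."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}     # address -> cached value
        self.dirty = set()  # addresses modified but not yet written back

    def read(self, addr):
        # On a miss, fill the line from system memory.
        if addr not in self.lines:
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        # The write stays in the cache; system memory is not updated yet.
        self.lines[addr] = value
        self.dirty.add(addr)

    def write_back(self, addr):
        if addr in self.dirty:
            self.memory[addr] = self.lines[addr]
            self.dirty.discard(addr)


cpu0 = WriteBackCache(memory)
cpu1 = WriteBackCache(memory)

cpu0.read(0x40)
cpu0.write(0x40, 2)        # altered only in cpu0's cache
stale = cpu1.read(0x40)    # cpu1 fetches the old value from system memory
cpu0.write_back(0x40)      # memory is updated too late for cpu1
```

In this sketch `cpu1` observes the old value because nothing tells it that `cpu0` holds a newer copy, which is the situation a coherency scheme is meant to prevent.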
A common scheme for maintaining data coherency in these systems is to use a snoop filter. To ensure data coherency, a processor or core may send coherency requests, often referred to as snoops, to other processors before accessing or modifying data. The conventional snoop filter maintains a cache of data requests from each processor or core to track the contents of the cache of each processor or core. Each time a processor retrieves data from memory, a coherency record that includes a tag address for that data is stored in the snoop filter. However, the snoop filter is not aware of cache entries that have been evicted by a processor or core, since it is impractical for a processor to send all cache-hit memory references to the snoop filter to maintain a perfect match between the processor's cache entries and the snoop filter entries. For example, a line that a processor references frequently may appear to the snoop filter to be aged, since the line's activity is not exposed outside the inner cache hierarchy. In another scenario, a clean (unmodified) line in the processor's cache may be replaced by a line fetched for another cache miss without the snoop filter being notified. As a result, the snoop filter is likely to hold many stale entries for data that is no longer in use by the processor. Furthermore, to make room for new entries when a new request is received from a processor or core, the snoop filter may have to evict entries that may still be in use.
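How stale entries accumulate can be sketched as follows. The `SnoopFilter` and `ProcessorCache` classes, the tags, and the two-line cache capacity are hypothetical, and the processor cache is assumed to replace its oldest line first; none of this is taken from a particular hub controller design.

```python
class SnoopFilter:
    """Hypothetical filter that records one coherency entry per fill from memory."""

    def __init__(self):
        self.entries = {}  # tag -> set of processor ids believed to hold the line

    def record_fill(self, cpu_id, tag):
        self.entries.setdefault(tag, set()).add(cpu_id)

    def sharers(self, tag):
        return self.entries.get(tag, set())


class ProcessorCache:
    """Hypothetical processor cache; assumed to evict its oldest line first."""

    def __init__(self, cpu_id, capacity, snoop_filter):
        self.cpu_id = cpu_id
        self.capacity = capacity
        self.snoop_filter = snoop_filter
        self.lines = []  # tags, oldest first

    def access(self, tag):
        if tag in self.lines:
            return "hit"  # hits are never reported to the snoop filter
        if len(self.lines) >= self.capacity:
            self.lines.pop(0)  # clean line dropped silently, filter not notified
        self.lines.append(tag)
        self.snoop_filter.record_fill(self.cpu_id, tag)  # only fills are visible
        return "miss"


snoop_filter = SnoopFilter()
cache = ProcessorCache(cpu_id=0, capacity=2, snoop_filter=snoop_filter)

results = [cache.access(tag) for tag in (0xA, 0xA, 0xB, 0xC)]

# Entries the filter still tracks although the processor no longer holds the line.
stale_tags = {tag for tag in snoop_filter.entries if tag not in cache.lines}
```

After the four accesses the processor holds only `0xB` and `0xC`, while the filter still lists `0xA` for processor 0. That entry is stale, and the filter has no way to detect it.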
The entries to be evicted may be selected using a replacement algorithm. One replacement algorithm of the snoop filter randomly chooses an entry in the snoop filter cache to be evicted to make room for the new entry. The eviction causes a back invalidation message to be sent to the processor or core for the evicted entry. However, if the evicted entry is still being used by the processor or core, the processor or core will need to request the corresponding data from system memory again. This generates additional traffic on the bus between the processor or core and the hub controller, thereby reducing the available bandwidth for other data transfers.
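Random replacement with back invalidation can be sketched as below. The class name, the two-entry capacity, the tags, and the single-owner bookkeeping are illustrative assumptions; a real snoop filter would typically be set-associative and track several sharers per entry.

```python
import random


class RandomReplacementSnoopFilter:
    """Hypothetical snoop filter that evicts a randomly chosen entry when full."""

    def __init__(self, capacity, rng):
        self.capacity = capacity
        self.rng = rng
        self.entries = {}             # tag -> id of the processor holding the line
        self.back_invalidations = []  # (cpu_id, tag) messages that were issued

    def record_fill(self, cpu_id, tag):
        """Track a new fill; return the evicted tag, or None if nothing was evicted."""
        if tag in self.entries:
            self.entries[tag] = cpu_id
            return None
        victim = None
        if len(self.entries) >= self.capacity:
            # Random choice: the filter cannot tell which entries are still in use.
            victim = self.rng.choice(list(self.entries))
            owner = self.entries.pop(victim)
            self.back_invalidations.append((owner, victim))
        self.entries[tag] = cpu_id
        return victim


rng = random.Random(0)
snoop = RandomReplacementSnoopFilter(capacity=2, rng=rng)
cpu_lines = set()  # lines currently held by processor 0

for tag in (0x100, 0x200, 0x300):
    victim = snoop.record_fill(0, tag)
    cpu_lines.add(tag)
    if victim is not None:
        # The back invalidation forces the processor to drop the line, even if
        # it is still in use; a later access must go back to system memory.
        cpu_lines.discard(victim)
```

The third fill evicts one of the two earlier entries regardless of whether the processor still needs it. If it does, the resulting refetch is the additional bus traffic described above.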
To minimize the effect of this process on the bandwidth of the bus and the utilization of the processor, the snoop filter caches are typically large enough to track several times the combined sizes of all the caches in the processors covered by the snoop filter. In practice, the snoop filter may be four to eight times larger than the total size of the caches of the processors or cores in the system. These large snoop filters occupy a large amount of space and increase the complexity, and therefore the cost, of hub controllers. Consequently, selecting a good replacement policy is preferable to increasing the snoop filter size. Improvements to the issuance of back invalidations are also desired.
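The sizing cost can be made concrete with a short calculation. The system below (four processor caches of 2 MiB each with 64-byte lines) is a hypothetical example, and the calculation assumes that "four to eight times larger" refers to the number of cache lines the snoop filter can track.

```python
def snoop_filter_entries(num_caches, cache_bytes, line_bytes, coverage):
    """Entries needed to track `coverage` times the combined cache lines.

    All parameters are illustrative assumptions, not values from a real design.
    """
    total_lines = num_caches * cache_bytes // line_bytes
    return total_lines * coverage


CACHE_BYTES = 2 * 1024 * 1024  # assumed 2 MiB per processor cache
LINE_BYTES = 64                # assumed 64-byte cache lines

entries_4x = snoop_filter_entries(4, CACHE_BYTES, LINE_BYTES, coverage=4)
entries_8x = snoop_filter_entries(4, CACHE_BYTES, LINE_BYTES, coverage=8)
```

Under these assumptions the processors hold 131,072 lines in total, so the filter would need 524,288 entries at four times coverage and 1,048,576 entries at eight times coverage, each with its own tag and state bits.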