Source-based snooping protocols have recently emerged as a useful technique for reducing latency in small-scale, link-based multiprocessor systems. These protocols are effective because, on a last-level cache miss to a block of data held in a memory within the system, such as block B, the missing processor, processor S, typically sends a snoop (also known as a probe) to each of the remaining processors in the system. A snoop is a query sent from a first processor to a second processor to check whether the cache of the second processor holds a particular piece of data.
In response to the snoop, each of the remaining processors checks its cache to determine whether it holds the requested data, block B. If the cache of a processor N has block B, processor N may send a copy of block B to processor S, where processor N and processor S are distinct processors. This is a relatively fast cache-to-cache transfer, and the latency experienced by processor S is generally lower than if processor S were to retrieve block B from the memory. Typically, various policies ensure that if more than one processor has a copy of block B, only one processor delivers a copy of block B to processor S.
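The miss-handling flow described above can be illustrated with a minimal sketch. It is not taken from any particular protocol: the names `Processor`, `snoop`, and `handle_miss` are hypothetical, caches and memory are modeled as plain dictionaries, and the single-responder policy is assumed to be "first holder in iteration order", which is only one of many possible policies.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    """Hypothetical model of a processor with a private last-level cache."""
    pid: int
    cache: dict = field(default_factory=dict)  # block address -> data

    def snoop(self, block):
        """Answer a snoop: return the cached data, or None if the block is absent."""
        return self.cache.get(block)


def handle_miss(requester, processors, memory, block):
    """Resolve a last-level cache miss at `requester` for `block`.

    Returns (data, source, snoops_sent). Assumed policy: the first
    processor found holding the block is the only one that delivers it.
    """
    snoops_sent = 0
    data, source = None, "memory"
    for p in processors:
        if p is requester:
            continue  # the missing processor does not snoop itself
        snoops_sent += 1  # one separate snoop per remaining processor
        hit = p.snoop(block)
        if hit is not None and data is None:
            data, source = hit, f"processor {p.pid}"  # cache-to-cache transfer
    if data is None:
        data = memory[block]  # no cache holds the block: fall back to memory
    requester.cache[block] = data
    return data, source, snoops_sent
```

For example, with four processors of which two hold block B, a miss at the first processor sends three snoops and receives the block from exactly one of the holders; a miss on a block that no cache holds also sends three snoops and is then served from memory.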
However, these source-based snooping protocols use a large amount of network bandwidth within the multiprocessor system. One reason is that each miss typically generates a separate snoop for each processor in the system (except the missing processor). This additional network traffic increases link utilization, which in turn increases latencies. Moreover, because each miss generates one snoop per remaining processor, the amount of network bandwidth used grows further as the number of processors in the system increases.
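The bandwidth growth can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that every processor incurs the same number of misses and that each miss produces exactly one snoop per remaining processor; the function names and the figure of 1000 misses are hypothetical.

```python
def snoops_per_miss(num_processors):
    """Each miss snoops every processor except the missing one."""
    return num_processors - 1


def total_snoops(num_processors, misses_per_processor):
    """Total snoops in the system, assuming a uniform miss count per processor."""
    total_misses = num_processors * misses_per_processor
    return total_misses * snoops_per_miss(num_processors)


# Snoop traffic for a few system sizes at an assumed 1000 misses per processor.
for n in (2, 4, 8, 16):
    print(n, snoops_per_miss(n), total_snoops(n, 1000))
```

Under these assumptions, the number of snoops per miss grows linearly with the processor count, while the total snoop traffic grows roughly with its square: going from 4 to 8 processors raises the total from 12000 to 56000 snoops.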