While processor and memory configurations vary from one computer system to another, one processor-memory interconnect topology that is relatively common today is the distributed shared memory topology. In distributed shared memory systems, each processor usually contains a local memory controller. The local memory of each processor may be mapped into a global memory space so that a single memory space is available to all processors. To maintain coherency, one or more system components must generally snoop all caches in the system. Rather than comprehensively snooping all caches, some systems filter some of the snooping activity. Depending on the interconnect topology and the snoop latency of the caches in the system, snooping activity can limit system and/or memory performance.
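As a rough illustration of how per-processor local memory might be mapped into one global space, the following minimal sketch assumes every processor contributes an equally sized, contiguous region. The region size and the names `LOCAL_MEMORY_BYTES`, `to_global`, and `home_node` are hypothetical and chosen only for this example; real systems may interleave or size regions differently.

```python
# Assumption for illustration only: each processor owns 1 GiB of local memory,
# and regions are laid out back to back in the global address space.
LOCAL_MEMORY_BYTES = 1 << 30


def to_global(node: int, local_offset: int) -> int:
    """Translate (processor index, offset in its local memory) to a global address."""
    if node < 0:
        raise ValueError("node index must be non-negative")
    if not 0 <= local_offset < LOCAL_MEMORY_BYTES:
        raise ValueError("offset lies outside the processor's local memory")
    return node * LOCAL_MEMORY_BYTES + local_offset


def home_node(global_addr: int) -> tuple[int, int]:
    """Return (processor index, local offset) for the memory that backs a global address."""
    if global_addr < 0:
        raise ValueError("global address must be non-negative")
    return divmod(global_addr, LOCAL_MEMORY_BYTES)
```

Under these assumptions, any processor can resolve a global address to the processor whose local memory controller serves it, which is also the point at which a coherency snoop would be initiated.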
To facilitate information transfer to and from a processor, many systems use an Input-Output (I/O) agent. While I/O agents typically contain small caches, referred to as write caches, the I/O agents nonetheless run at much lower frequencies than the processors, respond to snoops relatively slowly, and frequently tend to limit the speed of memory accesses in computer systems. Since I/O agents usually hold a relatively small number of memory lines, such as 128 or 256 lines, there is usually a high probability that an I/O agent does not contain a memory line requested by another system agent. Consequently, most I/O snoops will be clean, and the snooping operations tend to unnecessarily increase memory latency in the system.
Several techniques have been proposed to filter snoops to system agents, such as I/O agents. The I/O Agent Directory is a scheme in which each line in memory is tagged with bits that indicate whether an I/O agent contains that memory line. Since this scheme requires bits for every line in memory, it may waste memory space, which is often undesirable in small memory systems. In another scheme, each processor tracks each memory line held in an I/O agent using a one-to-one relationship. Unfortunately, this technique does not scale well as the number of agents in the system, such as I/O agents, grows. Additionally, as the sizes of I/O agent write caches increase, the size of the filters in the processors must grow.
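The directory scheme described above can be sketched as follows. This is a minimal illustration, not the actual design of any particular system: it assumes one presence bit per memory line per I/O agent, and the class and method names (`IOAgentDirectory`, `record_fill`, `record_evict`, `agents_to_snoop`, `overhead_bits`) are hypothetical.

```python
class IOAgentDirectory:
    """Per-line presence bits indicating which I/O agents may hold a memory line."""

    def __init__(self, num_lines: int, num_io_agents: int = 1) -> None:
        self.num_lines = num_lines
        self.num_io_agents = num_io_agents
        # One bitmask per memory line; bit k set means I/O agent k may hold the line.
        self.bits = [0] * num_lines

    def record_fill(self, line: int, agent: int) -> None:
        """Mark that the given I/O agent has taken a copy of the line."""
        self.bits[line] |= 1 << agent

    def record_evict(self, line: int, agent: int) -> None:
        """Clear the presence bit once the I/O agent no longer holds the line."""
        self.bits[line] &= ~(1 << agent)

    def agents_to_snoop(self, line: int) -> list[int]:
        """Return only the I/O agents whose presence bit is set for this line."""
        mask = self.bits[line]
        return [a for a in range(self.num_io_agents) if (mask >> a) & 1]

    def overhead_bits(self) -> int:
        """Total tag storage: grows with memory size, not with write-cache size."""
        return self.num_lines * self.num_io_agents
```

In this sketch a request snoops an I/O agent only when `agents_to_snoop` returns it, so clean snoops are avoided. The cost is visible in `overhead_bits`: the tag storage is paid for every line in memory, even though an I/O agent may hold only 128 or 256 lines at a time.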