Cache coherency is used to maintain the consistency of data in a distributed shared memory system. A number of agents, each usually comprising one or more caches, are connected together through a fabric or a central cache coherency controller. This allows the agents to take advantage of the performance benefit of caches while still providing a consistent view of data across agents.
Cache coherency protocols are usually based on acquiring and relinquishing permissions on sets of data, typically called cache lines containing a fixed amount of data (e.g. 32 or 64 bytes). Typical permissions are:                None: the cache line is not in the agent and the agent has no permission to read or write the data.        Readable: the cache line is in the agent and the agent has permission to read the cache line content stored locally. Multiple agents can simultaneously have read permission on a cache line (i.e. multiple readers).        Readable and writable: the cache line is in the agent and the agent has permission to write (and typically read) the cache line content. Only one agent can have write permission on a cache line, and no agent can have read permission at the same time.        
There is usually a backing store for all cache lines (e.g. a DRAM). The backing store is the location where the data is stored when it is not in any of the caches. At any point in time, the data in the backing store may not be up to date with respect of the latest copy of a cache line which may be in an agent. Because of this, cache lines inside agents often includes an indication of whether the cache line is clean (i.e. it has the same value as in the backing store) or dirty (i.e. it needs to be written back to the backing store at some point as it is the most up-to-date version).
The permission and “dirtiness” of a cache line in an agent is referred to as the “state” of the cache line. The most common set of coherency states is called MESI (Modified-Exclusive-Shared-Invalid), where Shared corresponds to the read permission (and the cache line being clean) and both Modified and Exclusive give read/write permissions, but in the Exclusive state, the line is clean, while in the Modified state, the line is dirty and must be eventually written back. In that state set, shared cache lines are always clean.
There are more complex versions like MOESI (Modified-Owned-Exclusive-Shared-Invalid) where cache lines with read permission are allowed to be dirty.
Other protocols may have separate read and write permissions. Many cache coherency state sets and protocols exist.
In the general case, when an agent needs a permission on a cache line that it does not have, it must interact with other agents directly or through a cache coherency controller to acquire the permission. In the simplest “snoop-based” protocols, the other agents must be “snooped” to make sure that the permission requested by the agent is consistent with the permissions already owned by the other agents. For instance, if an agent requests read permission and no other agent has write permission, the read permission can be granted. However, if an agent already has write permission, that permission must be removed from that agent before it is granted to the original requester.
In some systems, the agent directly places snoops on a bus and all agents (or at least all other agents) respond to the snoops. In other systems, the agent places a permission request to a coherency controller, which in turn will snoop the other agents (and possibly the agent itself).
In directory-based protocols, directories of permissions acquired by agents are maintained and snoops are sent only when permissions need to change in an agent.
Snoop filters may also be used to reduce the number of snoops sent to agents. Snoop filters keep a coarse view of the content of the agents and don't send a snoop to an agent if it knows that agent does not need to change its permissions.
Data and permissions interact in cache coherency protocols, but the way they interact varies. Agents usually place requests for both permission and data simultaneously, but not always. For instance, an agent that wants to place data in its cache for reading purposes and has neither the data nor the permission can place a read request including both the request for permission and for the data itself. However, an agent that already has the data and read permission but needs write permission may place an “upgrade” request to write permission, but does not need data.
Likewise, responses to snoops can include an acknowledgement that the permission change has happen, but can also optionally contain data. The snooped agent may be sending the data as a courtesy. Alternatively, the snooped agent may be sending dirty data that has to be kept to be eventually written back to the backing store.
Agents can hold permission without data. For instance, an agent that wants to write a full cache line may not request data with the write permission, as it knows it will not use it (it will override it completely). In some systems, holding partial data is permitted (in sectors, per byte . . . ). This is useful to limit data transfers but it makes the cache coherency protocol more complex.
Many cache coherency protocols provide two related way for data to leave an agent. One is through the snoop response path, providing data as a response to a snoop. The other is a spontaneous write path (often called write back or evict path) where the agent can send the data out when it does not want to keep it anymore. In some protocols, the snoop response and write back paths are shared.
Fully coherent agents are capable of both owning permissions for cache lines and receiving snoops to check and possibly change their permissions, triggered by a request from another agent. The most common type of fully coherent agent is a microprocessor with a coherent cache. As the microprocessor needs to do reads and writes, it acquires the appropriate permissions and potentially data and puts them in its cache. Many modern microprocessors have multiple levels of caches inside. Many modern microprocessors contain multiple microprocessor cores, each with its own cache and often a shared second-level cache. Many other types of agents may be fully coherent such as DSPs, GPUs and various types of multimedia agents comprising a cache.
In contrast, I/O coherent (also called one-way coherent) agents do not use a coherent cache, but they need to operate on a consistent copy of the data with respect to the fully coherent agents. As a consequence, their read and write request may trigger coherency actions (snoops) to fully coherent agents. In most cases, this is done by having either a special bridge or the central coherency controller issue the appropriate coherency action and sequence the actual reads or writes to the backing store if necessary. In the case of a small bridge, that bridge may act as a fully coherent agent holding permissions for a small amount of time. In the case of the central coherency controller, it tracks the reads and writes, and prevents other agent from accessing cache lines that are being processed on behalf of the I/O coherent agent.
In a coherency controller, most of the silicon area is used for a data store. The data store is typically used to reorder data coming from snoop responses with data coming from targets. The conventional implementation is to use a data store with enough read data buffer capacity to store almost all data that can be returned by the maximum number of pending read requests in the system. This requires a very large coherency controller since the number of pending reads can be large.
Therefore, what is needed is a coherency controller where the read data buffer is sized more modestly, using the characteristics of the traffic to dynamically allocate buffers from a pool with less capacity than the amount of data than can be returned by the total number of pending read transaction requests.