1. Field of the Invention
This invention generally relates to data processing systems, and more specifically, to maintaining data coherence in multi-node data processing systems.
2. Background Art
Large-scale shared memory multi-processor computer systems typically have a large number of processing nodes (e.g., with one or more microprocessors and local memory) that cooperate to perform a common task. For example, selected nodes on a multi-processor computer system may cooperate to multiply a complex matrix. To do this in a rapid and efficient manner, such computer systems typically divide the task into discrete parts that each are executed by one or more of the nodes.
When dividing a task, the nodes often share data. To that end, each microprocessor within a node may access memory associated with many other microprocessors. Those other microprocessors could be in the same node, or in different nodes. For example, a microprocessor may retrieve data from the memory of another node. Also, rather than retrieving the data from another node each time the data is needed, a microprocessor may store and access its locally held copies (cached copies) of data to perform local functions.
Problems may arise, however, when the data held by one microprocessor changes, and another microprocessor that uses the data has not been notified of the change. When that happens, the locally held data may no longer be accurate, potentially corrupting operations that rely upon the retrieved data. To mitigate these problems, computer systems that share data in this manner typically execute cache coherency protocols to ensure that all copies of the data are consistent.
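By way of illustration only, the stale-copy problem described above, and one common way of addressing it, may be sketched as follows. The sketch assumes a simple write-through, write-invalidate scheme. The names Memory, Cache, write and demo are hypothetical and are not taken from any particular system.

```python
class Memory:
    """Shared backing store, modeled as an address -> value mapping."""
    def __init__(self):
        self.data = {}


class Cache:
    """A microprocessor's locally held (cached) copies of shared data."""
    def __init__(self, memory):
        self.memory = memory
        self.lines = {}

    def read(self, addr):
        # On a miss, fetch from shared memory; on a hit, use the local copy.
        if addr not in self.lines:
            self.lines[addr] = self.memory.data[addr]
        return self.lines[addr]

    def invalidate(self, addr):
        # Drop the local copy so that the next read refetches it.
        self.lines.pop(addr, None)


def write(writer, addr, value, all_caches, coherent):
    """Write-through store. If coherent, first invalidate every other copy."""
    if coherent:
        for cache in all_caches:
            if cache is not writer:
                cache.invalidate(addr)
    writer.lines[addr] = value
    writer.memory.data[addr] = value


def demo(coherent):
    memory = Memory()
    memory.data[0x40] = 1
    a, b = Cache(memory), Cache(memory)
    b.read(0x40)                         # b now holds a cached copy (value 1)
    write(a, 0x40, 2, [a, b], coherent)  # a changes the data
    return b.read(0x40)                  # what does b observe afterwards?
```

Under these assumptions, demo(False) returns the stale value 1 because b was never notified of the change, whereas demo(True) returns 2 because the invalidation forces b to refetch the data.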
As data processing systems grow larger, operations for maintaining system-wide coherence incur larger latencies because these operations involve round-trips to all units participating in the coherence mechanism. Modern systems employ optimization schemes to limit the propagation of coherence traffic to a subset of units, wherever it is possible to detect that it is sufficient to limit the coherence check to that subset. A simple example is node pumping in a large system comprised of several nodes, where if a node can detect that a certain line is currently confined to caches within that node, then it is sufficient to send an invalidation message only to caches within that node, in order to gain exclusive access to that line. Often such tracking is complex and tends to be speculative.
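By way of illustration only, the node-scoped invalidation described above may be sketched as follows. The sketch assumes, for simplicity, that an exact record of which caches hold each line is available. As noted above, real tracking is often complex and speculative, so this is an idealization. The names System, load and acquire_exclusive are hypothetical.

```python
from collections import defaultdict


class System:
    """Toy multi-node system that counts invalidation messages."""
    def __init__(self, num_nodes, caches_per_node):
        self.num_nodes = num_nodes
        self.caches_per_node = caches_per_node
        # sharers[line] = set of (node, cache) pairs holding a copy.
        # Assumption: this record is exact, which real hardware may not have.
        self.sharers = defaultdict(set)

    def load(self, node, cache, line):
        # Record that the given cache now holds a copy of the line.
        self.sharers[line].add((node, cache))

    def acquire_exclusive(self, node, cache, line):
        """Gain exclusive access; return the number of invalidations sent."""
        requester = (node, cache)
        holders = self.sharers[line]
        confined = all(n == node for n, _ in holders)
        if confined:
            # Line is confined to this node: invalidate only local caches.
            targets = [(node, c) for c in range(self.caches_per_node)]
        else:
            # Otherwise fall back to a system-wide invalidation.
            targets = [(n, c)
                       for n in range(self.num_nodes)
                       for c in range(self.caches_per_node)]
        targets = [t for t in targets if t != requester]
        self.sharers[line] = {requester}
        return len(targets)


system = System(num_nodes=4, caches_per_node=4)

# Line "A" is held only by caches in node 0.
system.load(0, 0, "A")
system.load(0, 1, "A")
local_messages = system.acquire_exclusive(0, 0, "A")

# Line "B" is also held by a cache in node 2.
system.load(0, 0, "B")
system.load(2, 3, "B")
global_messages = system.acquire_exclusive(0, 0, "B")
```

In this example, local_messages is 3 (the other caches in node 0), while global_messages is 15 (every other cache in the system), which shows how confining the coherence check to one node reduces traffic.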