1. Field of the Invention
The present invention relates generally to an improved data processing system, and in particular to a computer implemented method, data processing system, and computer program product for maintaining cache coherency for a multi-node system using a specialized bridge which allows for fewer forward progress dependencies.
2. Description of the Related Art
A multi-processing system comprises a plurality of central processing units within a single computer system. A multi-processing system often contains 8 to 256 processing elements, wherein the processing elements are organized into groups called “nodes”. FIG. 1 provides an illustration of an example node 102. Node 102 is shown to comprise processing elements 104 and 106. A “processing element” is one logical attachment point to the bus, often having 200 to 500 wires, and typically consists of one shared Level2 cache 108 to which one or more processors are attached. Each processor within a processing element typically has its own non-shared Level1 cache, as illustrated by processor 110 and its Level1 cache 112. Each Level1 cache is logically placed between its processor and Level2 cache 108.
A node typically comprises 4 to 16 processing elements, and a system typically comprises 2 to 16 nodes. The system is organized into two levels of hierarchy. At the bottom level of the hierarchy is a node, which typically has 5 to 20 devices attached to it: 4 to 16 of these devices are processing elements, and the rest are memory controllers and I/O bridges. A node has one bus controller 114, which contains the bus arbiter(s). The top level of the hierarchy is an interconnection of multiple nodes. Note that both levels of hierarchy may be implemented with other hierarchies of buses.
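The two-level organization described above can be illustrated with a minimal sketch. The class and method names below (ProcessingElement, Node, System, device_count, processing_element_count) are hypothetical and chosen only for illustration; they do not appear in the figures. The sketch merely shows that the stated ranges are mutually consistent: 2 nodes of 4 processing elements give 8 processing elements, and 16 nodes of 16 give 256.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProcessingElement:
    # Number of processors attached to the shared Level2 cache (assumed value).
    processors: int = 1


@dataclass
class Node:
    # Bottom level of the hierarchy: processing elements plus other devices.
    processing_elements: List[ProcessingElement]
    other_devices: int = 1  # memory controllers and I/O bridges (assumed count)

    def device_count(self) -> int:
        return len(self.processing_elements) + self.other_devices


@dataclass
class System:
    # Top level of the hierarchy: an interconnection of nodes.
    nodes: List[Node]

    def processing_element_count(self) -> int:
        return sum(len(n.processing_elements) for n in self.nodes)


def build_system(node_count: int, elements_per_node: int, other_devices: int) -> System:
    # Build a uniform system in which every node has the same shape.
    return System(
        nodes=[
            Node(
                processing_elements=[ProcessingElement() for _ in range(elements_per_node)],
                other_devices=other_devices,
            )
            for _ in range(node_count)
        ]
    )


smallest = build_system(node_count=2, elements_per_node=4, other_devices=1)
largest = build_system(node_count=16, elements_per_node=16, other_devices=4)
```

Under these assumed device counts, the smallest configuration has 8 processing elements with 5 devices per node, and the largest has 256 processing elements with 20 devices per node.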
Since a cache is a local copy of a portion of memory, a processor can access a cache more quickly than memory to enhance performance. However, because processors in a multi-processing system can share data, the processors can access the same data fields and the same portions of memory. This access includes writing to the data, which can change the content of the memory. Consequently, if a processor has a local copy of data and another processor writes to the data in memory, there must be some mechanism to ensure that the local copy of data that is cached is updated to reflect the write.
Cache coherency is a process for ensuring that locally cached copies of data are kept current while other processors may be writing to the same data in memory. Snooping is a common mechanism for maintaining cache coherency. A snoop is initiated any time there is a read or write to memory. Read and write transactions have two phases on the bus: a command phase and a data phase. The command phase includes information such as the address, the length of the transaction, and the command type (e.g., read or write), while the data phase includes the contents of the associated address. When snooping, the address carried in the command portion of the read or write, rather than the data portion, is looked up in the cache. If that address corresponds to a location held in the cache, a resolution process is initiated to ensure all devices have a consistent view of the data.
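The snooping behavior described above may be sketched as follows. This is a simplified illustration and not a definitive implementation: the names Command, Cache, Bus, snoop, and issue are hypothetical, and the resolution step shown (invalidating the snooper's copy when another device writes) is only one assumed possibility, since the description does not specify how resolution is performed.

```python
from dataclasses import dataclass


@dataclass
class Command:
    # Command phase contents: command type, address, and transaction length.
    kind: str  # "read" or "write"
    address: int
    length: int


class Cache:
    def __init__(self, name):
        self.name = name
        self.lines = {}  # address -> locally cached data

    def snoop(self, cmd):
        # Only the address from the command phase is looked up, never the data.
        if cmd.address not in self.lines:
            return False
        if cmd.kind == "write":
            # Assumed resolution: drop the local copy, which is about to be stale.
            del self.lines[cmd.address]
        return True


class Bus:
    def __init__(self, caches):
        self.caches = caches
        self.memory = {}  # address -> data held in memory

    def issue(self, source, cmd, data=None):
        # Command phase: every other cache snoops the command.
        hits = [c.name for c in self.caches if c is not source and c.snoop(cmd)]
        # Data phase: transfer the contents of the associated address.
        if cmd.kind == "write":
            self.memory[cmd.address] = data
        else:
            data = self.memory.get(cmd.address)
        source.lines[cmd.address] = data
        return hits, data


a, b = Cache("A"), Cache("B")
bus = Bus([a, b])
bus.issue(a, Command("write", 0x100, 4), data=1)   # A writes; no other cache holds the line
read_hits, read_data = bus.issue(b, Command("read", 0x100, 4))  # B reads; A's snoop hits
write_hits, _ = bus.issue(a, Command("write", 0x100, 4), data=2)  # A writes; B's copy is dropped
```

In this sketch, the second write causes the snoop in cache B to hit on address 0x100, so B discards its copy and cannot later return the outdated value.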