1. Field of the Invention
This invention relates to computer system design, and in particular to multi-node coherency protocols and methods for expediting the establishment of system coherency.
2. Description of Background
In a typical strong store ordered, symmetric multiprocessing computer system consisting of a plurality of nodes interconnected through a given bus topology, with a bus protocol that exchanges address, data, and coherency information, full exclusive ownership of a line is normally not returned to a processor that issued an exclusive fetch request until a protracted series of communications has completed. These communications ensure that all other processors in the system have given up ownership of the requested line before the requesting core can modify the line.
This communication typically begins with an initial address request launched from the requesting node to all target/remote nodes, primarily in order to determine/snoop the remote cache directory states. At the same time as the remote directory lookups, the remote processors are typically notified through a cross-invalidate request that they must give up ownership of the line, and an intermediate/partial response is typically sent back to the requesting node from each remote node, such that the requesting node is able to determine the overall system coherency state for the given line. This information is then sent to each of the remote nodes as a combined response, in order to ensure that the line is handled coherently in each remote cache. Finally, after this response is received on each remote node and coherent handling of the line is completed, the remote nodes send a completion/final response back to the requesting node indicating that they have completed processing and reset. Upon receiving the completion response from all of the remote nodes, the requesting node returns exclusivity of the line to the requesting processor.
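By way of illustration only, the message sequence described above may be sketched as a small Python model. The names `LineState`, `RemoteNode`, and `exclusive_fetch` are hypothetical and do not appear in the description; the rule that the combined response is the strongest state reported by any remote node is likewise an assumption made for this sketch, not a statement of how any particular system computes it.

```python
from dataclasses import dataclass
from enum import IntEnum


class LineState(IntEnum):
    """Hypothetical directory states, ordered from weakest to strongest ownership."""
    INVALID = 0
    SHARED = 1
    EXCLUSIVE = 2


@dataclass
class RemoteNode:
    name: str
    state: LineState  # directory state of the requested line on this node

    def handle_address_request(self) -> LineState:
        # Snoop the local directory; in a real system the cross-invalidate to
        # the local processors is issued at the same time. The state found is
        # returned as this node's partial response.
        return self.state

    def handle_combined_response(self, combined: LineState) -> str:
        # Complete coherent handling of the line (here: give up ownership),
        # then send the completion/final response.
        self.state = LineState.INVALID
        return "final"


def exclusive_fetch(remotes: list[RemoteNode]) -> tuple[LineState, list[str]]:
    """Model the four-phase exchange; return the combined state and a trace."""
    trace = ["address_request"]  # phase 1: launch to all remote nodes

    # Phase 2: collect one partial response per remote node.
    partials = [node.handle_address_request() for node in remotes]
    trace.append("partial_responses")

    # Phase 3: merge the partial responses (assumed rule: strongest state wins)
    # and send the result to every remote node as the combined response.
    combined = max(partials, default=LineState.INVALID)
    finals = [node.handle_combined_response(combined) for node in remotes]
    trace.append("combined_response")

    # Phase 4: exclusivity is returned only after every final response arrives.
    if all(f == "final" for f in finals):
        trace.append("final_responses")
        trace.append("exclusivity_granted")
    return combined, trace


if __name__ == "__main__":
    nodes = [RemoteNode("B", LineState.SHARED), RemoteNode("C", LineState.INVALID)]
    print(exclusive_fetch(nodes))
```

In this model the requesting processor cannot be granted exclusivity until two full round trips to every remote node have completed, which is the serialization the description refers to.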
As a result of waiting for this coherency communication among a plurality of nodes, the requesting node can incur an indeterminate latency penalty. The response time within a typical system varies depending on the type of activity within each remote node and within a given processor, as a processor may reject the request to give up exclusive ownership of a line any number of times if it is actively modifying the line.
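The dependence of this latency on the slowest remote node can be illustrated with a simplified model, offered only as a sketch. The function name `exclusive_fetch_latency`, the per-node base latency, the fixed cost per rejected request, and all numeric values below are assumptions chosen for illustration; none are taken from a measured system.

```python
def exclusive_fetch_latency(nodes, retry_cost):
    """Latency seen by the requesting node, in arbitrary time units.

    nodes      -- list of (base_latency, reject_count) pairs, one per remote node
    retry_cost -- assumed extra delay each time a processor rejects the request
    """
    # The requesting node must wait for every remote node, so the overall
    # latency is set by the slowest one, including its rejected attempts.
    return max(base + rejects * retry_cost for base, rejects in nodes)


# Three remote nodes; the second rejects the cross-invalidate three times
# because its processor is actively modifying the line.
quiet = exclusive_fetch_latency([(10, 0), (12, 0), (11, 0)], retry_cost=8)
busy = exclusive_fetch_latency([(10, 0), (12, 3), (11, 0)], retry_cost=8)
print(quiet, busy)
```

Because the reject count depends on what each remote processor happens to be doing, the value returned for the busy case cannot be bounded in advance, which is why the penalty is described as indeterminate.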