1. Technical Field
The present invention relates to a method and system for data processing in general, and in particular to a method and system for avoiding data loss within a computer system. Still more particularly, the present invention relates to a method and system for avoiding data loss due to cancelled transactions within a non-uniform memory access (NUMA) computer system.
2. Description of the Related Art
It is well-known in the computer arts that greater computer system performance can be achieved by harnessing the processing power of multiple individual processors in tandem. Multi-processor (MP) computer systems can be designed with a number of different topologies, of which various ones may be better suited for particular applications depending upon the performance requirements and software environment of each application. One of the most common MP computer topologies is a symmetric multi-processor (SMP) configuration in which multiple processors share common resources, such as a system memory and input/output (I/O) subsystem, which are typically coupled to a shared system interconnect. Such computer systems are said to be symmetric because all processors in an SMP computer system ideally have the same access latency with respect to data stored in the shared system memory.
Although SMP computer systems permit the use of relatively simple inter-processor communication and data sharing methodologies, SMP computer systems have limited scalability. In other words, while performance of a typical SMP computer system can generally be expected to improve with scale (i.e., with the addition of more processors), inherent bus, memory, and input/output (I/O) bandwidth limitations prevent significant advantage from being obtained by scaling an SMP beyond an implementation-dependent size at which the utilization of these shared resources is optimized. Thus, the SMP topology itself suffers to a certain extent from bandwidth limitations, especially at the system memory, as the system scale increases. SMP computer systems also do not scale well from the standpoint of manufacturing efficiency. For example, although some components can be optimized for use in both uniprocessor and small-scale SMP computer systems, such components are often inefficient for use in large-scale SMPs. Conversely, components designed for use in large-scale SMPs are impractical for use in smaller systems from a cost standpoint.
As a result, an MP computer system topology known as non-uniform memory access (NUMA) has emerged as an alternative design that addresses many of the limitations of SMP computer systems at the expense of some additional complexity. A typical NUMA computer system includes a number of interconnected nodes that each include one or more processors and a local "system" memory. Such computer systems are said to have non-uniform memory access because each processor has lower access latency with respect to data stored in the system memory at its local node than with respect to data stored in the system memory at a remote node. NUMA systems can be further classified as either non-coherent or cache coherent, depending upon whether or not data coherency is maintained between caches in different nodes. The complexity of cache coherent NUMA (CC-NUMA) systems is attributable in large measure to the additional communication required for hardware to maintain data coherency not only between the various levels of cache memory and system memory within each node but also between cache and system memories in different nodes. NUMA computer systems do, however, address the scalability limitations of conventional SMP computer systems since each node within a NUMA computer system can be implemented as a smaller SMP system. Thus, the shared components within each node can be optimized for use by only a few processors, while the overall system benefits from the availability of larger scale parallelism and maintains relatively low latency.
In designing a scalable cache coherent NUMA system, data coherency issues that do not exist in simpler SMP designs must be addressed. For example, in a single bus MP computer system, data loss will not occur when a transaction is cancelled on the system bus. Data loss can be thought of as a set of circumstances in which the only valid copy of a data element (such as a cache line) is lost from any or all caches or memories in the system. The cache coherency protocol of an SMP system is designed to prevent such a loss from occurring. If, for example, a read transaction is "retried" by a processor in an SMP system, the "retry" is visible to all devices on the bus (the requester of the data, the provider of the data, and all snoopers) before the data is actually sourced to the bus. This ensures that the data will not be discarded, and hence "lost," by a device which may have the only valid copy. It also ensures that none of the caches in the system will change their state as they would have done if the data had been provided. A single bus MP could also maintain data coherency in the presence of a protocol mechanism for "cancelling" a transaction. A transaction is "cancelled" when a device requests data but, before the data can be provided, the requester indicates that the data is no longer wanted. Transactions can also be cancelled by devices other than the device that originated the transaction, for example a memory controller whose buffers are full. When a third party cancels the transaction in this way, the requester will re-issue the transaction only if the data is still required. The valid copy of data then is neither provided nor removed from the memory where it is resident. Although transaction cancellation is not a typical feature of an SMP system, one could include a cancellation mechanism without sacrificing coherency because all snoopers have simultaneous visibility of the transaction on the system bus.
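The single-bus behavior described above can be illustrated with a minimal sketch. It is not taken from any particular bus protocol: the `Cache` and `SingleBus` classes, the two-state "M"/"S" subset, and the write-back on sharing are all assumptions made for illustration. The point it shows is that a retry or cancel seen in the response phase leaves every cache state untouched.

```python
from dataclasses import dataclass, field


@dataclass
class Cache:
    name: str
    # address -> (state, data); states are a simplified "M"/"S" subset (assumption)
    lines: dict = field(default_factory=dict)


class SingleBus:
    """Hypothetical single-bus MP: all devices see the response before data moves."""

    def __init__(self, caches, memory):
        self.caches = caches
        self.memory = memory  # address -> data

    def read(self, requester, addr, cancelled=False):
        # Response phase: a retry or cancel is seen by the requester, the
        # provider, and all snoopers before any data is sourced to the bus.
        if cancelled:
            return None  # no cache changes state; the valid copy stays put
        # Data phase: find a cache holding a modified copy, if any.
        owner = next(
            (c for c in self.caches
             if c is not requester and c.lines.get(addr, ("I", None))[0] == "M"),
            None,
        )
        if owner is None:
            data = self.memory[addr]
        else:
            data = owner.lines[addr][1]
            self.memory[addr] = data         # assumed write-back on sharing
            owner.lines[addr] = ("S", data)  # owner downgrades to shared
        requester.lines[addr] = ("S", data)
        return data


# Cache A holds the only valid (modified) copy; memory is stale.
a = Cache("A", {0x40: ("M", "new")})
b = Cache("B")
bus = SingleBus([a, b], {0x40: "old"})

cancelled_result = bus.read(b, 0x40, cancelled=True)
state_after_cancel = a.lines[0x40]   # still the modified copy
completed_result = bus.read(b, 0x40)
```

Because the cancel is resolved before the data phase in this model, cache A still owns the modified line after the cancelled read, and a later read completes normally.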
Due to the potentially long latency of some transactions, a high performance NUMA system may find greater utility for a protocol mechanism to cancel a previously issued transaction. In a specific circumstance, a NUMA system may use a cancellation method to nullify a speculative fetch that is no longer needed. Cancellation is appropriate in this case because the processor should not have to waste resources keeping such a transaction pending, and transferring the data would waste valuable bus bandwidth. However, in a NUMA system, situations can occur in which data may be lost during transaction cancellation unless measures are taken to detect and remedy such situations. Consider the case of a READ transaction issued to a remote processing node which is successful at the node which provides the data, but which is cancelled at the receiving node while the data from the remote node is still in transit. This may result in a loss of the only valid copy of the data, and hence the loss of data coherency. In the case described above, data loss results when the caches at the node providing the data change state before the transaction cancellation can be transmitted to the remote processing node. The cancellation cannot prevent the change of cache state as would happen in an SMP system because the cancellation originates on a physically different bus than that to which the read data is provided. The read transaction can complete successfully on one bus, triggering the state change of caches at that bus, before the transaction is cancelled at the node receiving the data, or before the cancellation can be communicated between the physically separate busses. Under these circumstances, the controller interfacing between these busses can be left with the only valid copy of data, in particular when the data is a modified copy of a cache line which has not yet been written to memory.
Once the transaction is cancelled, a read request may never be issued for the data being held by the node controller, and as a result, the data will be lost, and memory will be inconsistent. This problem can occur in the course of any data transaction that causes modified data to be written to memory through the node controller. Consequently, it is necessary to provide a method and system for detecting and correcting these situations, avoiding loss of data and coherency.
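The data-loss scenario described above can be traced with a small sketch. Everything in it is hypothetical and chosen for illustration: the `Node` class, the `remote_read` and `holders` helpers, the modified-to-invalid state change at the providing node, and the placement of the in-transit buffer at the requesting node's controller. It simply shows that, after a cancellation in transit, no cache or memory holds the latest value.

```python
class Node:
    """Hypothetical NUMA node: one cache, local memory, a node controller buffer."""

    def __init__(self, name):
        self.name = name
        self.cache = {}              # address -> (state, data)
        self.memory = {}             # address -> data
        self.controller_buffer = {}  # data held in transit by the node controller


def remote_read(requesting, providing, addr, cancel_in_transit):
    # 1. The read completes on the providing node's bus, so the cache there
    #    changes state (assumed here: modified -> invalid).
    _, data = providing.cache[addr]
    providing.cache[addr] = ("I", None)
    # 2. The data is in transit, held only by the node controller.
    requesting.controller_buffer[addr] = data
    # 3. The cancellation occurs on a different bus, too late to undo step 1.
    if cancel_in_transit:
        return None
    requesting.cache[addr] = ("M", data)
    del requesting.controller_buffer[addr]
    return data


def holders(nodes, addr, value):
    """List every cache or memory that still holds `value` for `addr`."""
    found = []
    for n in nodes:
        state, data = n.cache.get(addr, ("I", None))
        if state != "I" and data == value:
            found.append(f"{n.name}.cache")
        if n.memory.get(addr) == value:
            found.append(f"{n.name}.memory")
    return found


local, remote = Node("local"), Node("remote")
remote.memory[0x80] = "stale"             # memory not yet updated
remote.cache[0x80] = ("M", "latest")      # only valid copy is modified

result = remote_read(local, remote, 0x80, cancel_in_transit=True)
survivors = holders([local, remote], 0x80, "latest")
stranded = local.controller_buffer.get(0x80)
```

In this model `survivors` is empty while `stranded` still contains the latest value, which is the condition that a detection and correction mechanism would need to recognize before the controller discards its buffer.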