1. Field of the Invention
This invention is related to the field of computer systems and, more particularly, to computer systems wherein certain write operations may be considered completed by a source upon transmission (i.e., posted write operations).
2. Description of the Related Art
Generally, personal computers (PCs) and other types of computer systems have been designed around a shared bus system for accessing memory. One or more processors and one or more input/output (I/O) devices are coupled to memory through the shared bus. The I/O devices may be coupled to the shared bus through an I/O bridge which manages the transfer of information between the shared bus and the I/O devices, while processors are typically coupled directly to the shared bus or are coupled through a cache hierarchy to the shared bus.
Unfortunately, shared bus systems suffer from several drawbacks. For example, the multiple devices attached to the shared bus present a relatively large electrical capacitance to devices driving signals on the bus. In addition, the multiple attach points on the shared bus produce signal reflections at high signal frequencies which reduce signal integrity. As a result, signal frequencies on the bus are generally kept relatively low in order to maintain signal integrity at an acceptable level. The low signal frequencies reduce signal bandwidth, limiting the performance of devices attached to the bus.
Lack of scalability to larger numbers of devices is another disadvantage of shared bus systems. As mentioned above, the available bus bandwidth is substantially fixed (and may decrease if adding additional devices causes a reduction in signal frequencies upon the bus). Once the bandwidth requirements of the devices attached to the bus (either directly or indirectly) exceed the available bandwidth of the bus, devices will frequently be stalled when attempting access to the bus. Overall performance of the computer system including the shared bus will most likely be reduced.
On the other hand, distributed memory systems lack many of the above disadvantages. A computer system with a distributed memory system includes multiple nodes, two or more of which are coupled to different memories. The nodes are coupled to one another using any suitable interconnect. For example, each node may be coupled to each other node using dedicated lines. Alternatively, each node may connect to a fixed number of other nodes, and transactions may be routed from a first node to a second node to which the first node is not directly connected via one or more intermediate nodes. A memory address space of the computer system is assigned across the memories in each node.
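The assignment of a memory address space across the memories of several nodes can be illustrated with a minimal sketch. It assumes each node owns one contiguous, non-overlapping address range; the class name AddressMap, the method home_node, and the node identifiers are hypothetical and are used only for illustration.

```python
from bisect import bisect_right


class AddressMap:
    """Hypothetical map from physical addresses to the node owning them."""

    def __init__(self, ranges):
        # ranges: iterable of (base, limit_exclusive, node_id) tuples,
        # assumed not to overlap.
        self.ranges = sorted(ranges)
        self.bases = [entry[0] for entry in self.ranges]

    def home_node(self, address):
        # Find the last range whose base is <= address.
        index = bisect_right(self.bases, address) - 1
        if index < 0:
            raise ValueError("address below every mapped range")
        base, limit, node_id = self.ranges[index]
        if not (base <= address < limit):
            raise ValueError("address falls in an unmapped hole")
        return node_id


address_map = AddressMap([(0x0000, 0x4000, "node0"), (0x4000, 0x8000, "node1")])
```

A transaction addressed to 0x4000 would in this sketch be directed to "node1", the node whose memory holds that address.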
In general, a “node” is a device which is capable of participating in transactions upon the interconnect. For example, the interconnect may be packet based, and the node may be configured to receive and transmit packets. Generally speaking, a “packet” is a communication between two nodes: an initiating or “source” node which transmits the packet and a destination or “target” node which receives the packet. When a packet reaches the target node, the target node accepts the information conveyed by the packet and processes the information internally. Alternatively, a node located on a communication path between the source and target nodes may relay the packet from the source node to the target node.
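The relaying of a packet through intermediate nodes can be sketched as a path search over the interconnect. The sketch below assumes the interconnect is described by a hypothetical adjacency table (links) and uses a breadth-first search to pick a path with the fewest hops; an actual system might instead use fixed routing tables.

```python
from collections import deque


def route(links, source, target):
    """Return the list of nodes a packet visits from source to target.

    links is a hypothetical adjacency table: node -> directly connected nodes.
    Returns None when the target cannot be reached.
    """
    previous = {source: None}
    frontier = deque([source])
    while frontier:
        node = frontier.popleft()
        if node == target:
            # Walk the predecessor chain back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in links[node]:
            if neighbor not in previous:
                previous[neighbor] = node
                frontier.append(neighbor)
    return None


# Three nodes in a chain: A and C are not directly connected.
links = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
path = route(links, "A", "C")
```

Here node B is neither source nor target; it only relays the packet from A to C.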
Distributed memory systems present design challenges which differ from the challenges in shared bus systems. For example, shared bus systems regulate the initiation of transactions through bus arbitration. Accordingly, a fair arbitration algorithm allows each bus participant the opportunity to initiate transactions. The order of transactions on the bus may represent the order that transactions are performed (e.g. for coherency purposes). On the other hand, in distributed systems, nodes may initiate transactions concurrently and use the interconnect to transmit the transactions to other nodes. These transactions may have logical conflicts between them (e.g. coherency conflicts for transactions involving the same address) and may experience resource conflicts (e.g. buffer space may not be available in various nodes) since no central mechanism for regulating the initiation of transactions is provided. Accordingly, it is more difficult to ensure that information continues to propagate among the nodes smoothly and that deadlock situations (in which no transactions are completed due to conflicts between the transactions) are avoided.
For example, certain deadlock conditions may occur in Peripheral Component Interconnect (PCI) I/O systems if “posted” write operations are not allowed to become unordered with respect to other operations. Generally speaking, a posted write operation is considered complete by the source when the write command and corresponding data are transmitted by the source (e.g., by a source interface). A posted write operation is thus in effect completed at the source. As a result, the source may continue with other operations while the packet or packets of the posted write operation travel to the target and the target completes the posted write operation. The source is not directly aware of when the posted write operation is actually completed by the target.
In contrast, a “non-posted” write operation is not considered complete by the source until the target (e.g., a target interface) has completed the non-posted write operation. The target generally transmits an acknowledgement to the source when the non-posted write operation is completed. Such acknowledgements consume interconnect bandwidth and must be received and accounted for by the source. Non-posted write operations may be required when the write operations must be performed in a particular order (i.e., serialized).
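The difference between posted and non-posted write operations, as seen by the source, can be sketched as follows. The class Source, its methods, and the tag scheme are hypothetical; the sketch only models when the source regards a write as complete, not the transfer itself.

```python
class Source:
    """Hypothetical source interface that tracks write completion."""

    def __init__(self):
        self.next_tag = 0
        self.sent = []             # packets handed to the interconnect
        self.outstanding = set()   # non-posted writes awaiting acknowledgement

    def write(self, address, data, posted):
        tag = self.next_tag
        self.next_tag += 1
        self.sent.append((tag, address, data, posted))
        if not posted:
            # A non-posted write stays outstanding until the target answers.
            self.outstanding.add(tag)
        return tag

    def acknowledge(self, tag):
        # Called when the target's acknowledgement arrives.
        self.outstanding.discard(tag)

    def is_complete(self, tag):
        # Posted writes are never added to outstanding, so they are
        # complete (from the source's viewpoint) as soon as they are sent.
        return tag not in self.outstanding


source = Source()
posted_tag = source.write(0x10, b"a", posted=True)
non_posted_tag = source.write(0x20, b"b", posted=False)
```

Only the non-posted write requires the source to receive and account for an acknowledgement, which is the bookkeeping and bandwidth cost noted above.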
When a source must accomplish multiple write operations, and the write operations need not be completed in any particular order, it is generally preferable from a system performance standpoint to accomplish the write operations as posted write operations. Situations may arise, however, where the posted write operations need to be properly ordered within their targets with respect to other pending operations such that memory coherency is preserved within the computer system before processing operations within the source may be continued.
It would thus be desirable to have a computer system implementing a special operation which provides assurance to the source that all posted write operations previously issued by the source have been properly ordered within their targets with respect to other pending operations. The computer system may have, for example, a distributed memory system, and the special operation may help preserve memory coherency within the computer system.
A computer system is presented which implements a “flush” operation providing a response to a source which signifies that all posted write operations previously issued by the source have been properly ordered within their targets with respect to other pending operations. The flush operation helps to preserve memory coherency within the computer system.
In one embodiment, the computer system includes a processing subsystem and an input/output (I/O) node. The processing subsystem includes multiple processing nodes interconnected via coherent communication links. Each processing node includes a processor preferably executing software instructions. Each processing node may include, for example, a processor core configured to execute instructions of a predefined instruction set. One of the processing nodes includes a host bridge. The I/O node is coupled to the processing node including the host bridge via a non-coherent communication link. The I/O node may be part of an I/O subsystem including multiple I/O nodes serially interconnected via non-coherent communication links.
The processing subsystem may include, for example, a first processing node, a second processing node, and a memory coupled to the first processing node. Either the first or second processing node may include the host bridge, and the I/O node may thus be coupled to the first or second processing node. The I/O node may generate a “non-coherent” posted write command in order to store data within the memory. As defined herein, a non-coherent command is a command issued via a non-coherent communication link. The second processing node may include a cache, and the processing subsystem may be operated such that memory coherency is maintained within the memory and the cache.
When write operations need not be completed in any particular order, the I/O node may generate non-coherent posted write commands due to the performance advantage of posted write operations over non-posted write operations. The non-coherent posted write operation has a target within the processing subsystem. The target may be, for example, a processing node coupled to a memory including an address or range of addresses of the non-coherent posted write operation. In response to a non-coherent posted write command received from the I/O node, the host bridge is configured to generate a corresponding “coherent” posted write command within the processing subsystem. As defined herein, a coherent command is a command issued via a coherent communication link. The host bridge may include translation logic for translating the non-coherent posted write command to the coherent posted write command.
The host bridge includes a data buffer for storing data used to track the status of non-coherent posted write commands received from the I/O node. The data buffer may be used to store coherent transaction data associated with the coherent posted write command and non-coherent transaction data associated with the non-coherent posted write command. The coherent transaction data may include a source tag assigned to the coherent posted write command by the host bridge, and the non-coherent transaction data may identify the transaction as a posted write command and the source of the non-coherent posted write command.
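One possible layout for the data buffer is sketched below. The field names, the fixed capacity, and the monotonically increasing source tag are assumptions made for illustration; the description above specifies only that the buffer holds a source tag (coherent transaction data) together with the command type and source (non-coherent transaction data).

```python
from dataclasses import dataclass


@dataclass
class BufferEntry:
    # Coherent transaction data.
    source_tag: int         # tag assigned by the host bridge
    # Non-coherent transaction data.
    is_posted_write: bool   # identifies the transaction as a posted write
    io_source: str          # hypothetical identifier of the issuing I/O node


class DataBuffer:
    """Hypothetical host bridge data buffer, indexed by source tag."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}
        self.next_tag = 0   # assumption: tags are never reused in this sketch

    def allocate(self, io_source):
        if len(self.entries) >= self.capacity:
            return None     # buffer full; the incoming command must wait
        tag = self.next_tag
        self.next_tag += 1
        self.entries[tag] = BufferEntry(tag, True, io_source)
        return tag

    def release(self, source_tag):
        # Called when the entry is no longer needed, e.g. once the
        # coherent target done response for this tag has been received.
        return self.entries.pop(source_tag)


data_buffer = DataBuffer(capacity=2)
first_tag = data_buffer.allocate("io0")
second_tag = data_buffer.allocate("io1")
```

Keying the entries by source tag lets a later response carrying that tag be matched back to the I/O node that issued the original command.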
The I/O node issues a flush command to ensure that all previously issued non-coherent posted write commands have at least reached points of coherency within the processing subsystem. A point of coherency is reached with regard to a specific non-coherent posted write command when: (i) the corresponding coherent posted write command is properly ordered within the target with respect to other commands pending within the target, and (ii) a correct coherency state with respect to the coherent posted write command has been established in the other processing nodes.
The host bridge issues a non-coherent target done response to the I/O node in response to: (i) the flush command, and (ii) coherent target done responses received from all targets of coherent posted write commands resulting from non-coherent posted write commands previously issued by the I/O node. A given target may transmit a coherent target done response when the coherent posted write command has at least reached the point of coherency within the processing subsystem. The non-coherent target done response from the host bridge signals the I/O node that all non-coherent posted write commands previously issued by the I/O node have at least reached points of coherency within the processing subsystem. In response to the flush command and the coherent target done response from the target, the host bridge may use the coherent and non-coherent transaction data stored within the data buffer to issue the non-coherent target done response to the I/O node.
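The two conditions for issuing the non-coherent target done response can be sketched as follows. The class HostBridge and its method names are hypothetical. The sketch assumes that a flush command covers only the posted writes from the same I/O node that were in flight when the flush arrived, so later writes do not delay the response.

```python
class HostBridge:
    """Hypothetical model of flush handling in the host bridge."""

    def __init__(self):
        self.next_source_tag = 0
        self.in_flight = {}        # source tag -> issuing I/O node
        self.pending_flushes = []  # (I/O node, tags still awaited)
        self.responses = []        # non-coherent target done responses sent

    def posted_write(self, io_node):
        # Translate to a coherent posted write and remember its source tag.
        tag = self.next_source_tag
        self.next_source_tag += 1
        self.in_flight[tag] = io_node
        return tag

    def flush(self, io_node):
        # Snapshot the writes this flush must wait for.
        awaited = {t for t, node in self.in_flight.items() if node == io_node}
        if awaited:
            self.pending_flushes.append((io_node, awaited))
        else:
            self.responses.append(io_node)

    def coherent_target_done(self, source_tag):
        self.in_flight.pop(source_tag)
        still_pending = []
        for io_node, awaited in self.pending_flushes:
            awaited.discard(source_tag)
            if awaited:
                still_pending.append((io_node, awaited))
            else:
                # Every earlier write has reached its point of coherency.
                self.responses.append(io_node)
        self.pending_flushes = still_pending


bridge = HostBridge()
tag0 = bridge.posted_write("io0")
tag1 = bridge.posted_write("io0")
bridge.flush("io0")
tag2 = bridge.posted_write("io0")   # issued after the flush
bridge.coherent_target_done(tag0)
responses_after_first = list(bridge.responses)
bridge.coherent_target_done(tag1)
```

In this sequence the response is sent only after both earlier writes have been acknowledged by their targets, while the write issued after the flush remains in flight without holding the response back.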
As described above, the I/O node may be part of an I/O subsystem including multiple I/O nodes serially interconnected via non-coherent communication links. The non-coherent posted write command may have a source within the I/O subsystem, and may be completed by the source in response to transmission of the non-coherent posted write command by the source. Posted and non-posted commands may travel in separate virtual channels in order to prevent deadlock situations within the computer system. As a result, the non-coherent posted write command may be conveyed within a posted command virtual channel of the I/O subsystem, and the posted command virtual channel may be separate from a non-posted command virtual channel of the I/O subsystem.
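The effect of separate virtual channels can be sketched with one queue per channel on a link. The class Link, the channel names, and the policy of checking the posted channel first are assumptions for illustration only. The point shown is that a non-posted command which cannot advance does not block a posted command queued behind it.

```python
from collections import deque


class Link:
    """Hypothetical link with separate posted and non-posted channels."""

    def __init__(self):
        self.channels = {"posted": deque(), "non_posted": deque()}

    def enqueue(self, packet, posted):
        name = "posted" if posted else "non_posted"
        self.channels[name].append(packet)

    def next_packet(self, receiver_has_buffer):
        # receiver_has_buffer: channel name -> True if the receiver can
        # currently accept a packet on that channel.
        for name in ("posted", "non_posted"):
            queue = self.channels[name]
            if queue and receiver_has_buffer[name]:
                return queue.popleft()
        return None


link = Link()
link.enqueue("read A", posted=False)    # arrives first
link.enqueue("write B", posted=True)    # arrives second
# The receiver has no buffer space for non-posted commands right now.
sent = link.next_packet({"posted": True, "non_posted": False})
```

With a single shared queue, "write B" would be stuck behind "read A"; with per-channel queues it proceeds as soon as the posted channel has buffer space.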
In one embodiment of a method for ensuring a posted write command originating within an I/O subsystem of a computer system reaches a point of coherency within a processing subsystem of the computer system, the I/O subsystem provides the posted write command to the host bridge of the processing subsystem. The host bridge translates the posted write command to a coherent posted write command, and transmits the coherent posted write command to a target within the processing subsystem. The I/O subsystem provides a flush command to the host bridge. The host bridge provides a target done response to the I/O subsystem in response to: (i) the flush command, and (ii) a target done response received from the target.
An I/O node within the I/O subsystem may be a source of the posted write command, and a processing node within the processing subsystem may be the target. The posted write command provided by the I/O subsystem to the host bridge may be a non-coherent posted write command. The posted write command provided by the I/O subsystem to the host bridge, the coherent posted write command, the flush command, the target done response received by the host bridge from the target, and the target done response provided by the host bridge to the I/O subsystem may be transmitted as one or more packets. The target done response from the target signifies that the coherent posted write command has at least reached the point of coherency within the processing subsystem. The target done response from the host bridge signals the I/O subsystem that the previous posted write command has at least reached a point of coherency within the processing subsystem.