The present invention relates generally to transactions on a computer interconnect and, more specifically, to the ordering of read and write transactions on a computer bus.
FIG. 1 shows the architecture of a typical computer system 8 in which a high-speed bus, such as the PCI bus 10, interconnects several I/O device adapters 12, 14. Each I/O device adapter 12, 14 is either an initiator or a target, and the PCI bus (or PCI-X bus) serves to carry read and write transactions between the I/O devices to which the adapters are connected. The CPU 16 for the computer system is connected to the bus 10 by means of a bridge device 18, which also provides a path between the CPU 16 and main memory 20. Another bridge device 22 connects a slower bus 24 to which devices, such as a printer adapter 26 and keyboard and mouse interfaces 28, are connected.
In one version of the PCI bus 10, an initiator (master) connects to a target (slave) via the bus to perform a transaction. FIG. 2 shows a typical PCI read transaction 40, a write transaction 42, and a retry request 44. Read transactions include an address phase 46, a command phase 48, one or more data phases 50a-d and attribute phases 52a-d. Each of the data phases 50a-d can be delayed by the target or initiator for a specific number of clocks in order to match the data transfer speed of the target to that of the initiator. Write transactions are similar, having an address phase 54, a command phase 56, one or more data phases 58a-d, and attribute phases 60a-d. The initiator or target can stall a data phase (via wait states) for up to seven clocks. (The target can stall the start of the first data phase for up to 15 clocks.) Before the initiator can connect to a target to perform a data transaction, the initiator must become the owner of the bus, which means that the initiator must win an arbitration process.
In addition to data transactions, the PCI bus supports Delayed Transactions for reads and writes. A Delayed Transaction has two parts, the request part and the completion part. In the first part 44 in FIG. 2, the initiator performs an address phase 62 and command phase 64, and before the first data phase 66, the target responds with a disconnect 68. The initiator interprets the target disconnect to be a retry request 44, which the initiator honors by ending the current transaction (FIG. 2), returning the bus to the idle state, re-arbitrating for ownership and re-initiating the transaction. If the initiator again receives a retry indication from the target, the initiator repeats the above sequence 44. Thus, the address phase 62 may be repeated several times (causing multiple re-arbitrations as well) until the target is ready to transfer data. In the completion part of the Delayed Transaction, the initiator performs an address phase and the target replies with a data transfer rather than a disconnect 68.
It is easily appreciated by one skilled in the art that the above-described operation of the PCI bus is inefficient. Throughput on the PCI bus is lost for two reasons: the insertion of wait states and the use of the retry protocol.
Wait states cause a direct loss in throughput. Just one wait state inserted in each data phase is a 50% loss in throughput during the data burst. This means that for a 32 bit PCI bus clocked at 33 MHz, the throughput during the data phase is reduced to 66 Megabytes per second from 132 Megabytes per second. If the bus is clocked at 66 MHz, the throughput loss is even greater: a full 132 Megabytes per second out of a peak of 264 Megabytes per second. For devices that can sustain transfer rates of about 1 Gigabyte per second, the bus is simply unworkable.
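The arithmetic above can be reproduced with a short calculation. The following Python sketch is illustrative only; the function names are hypothetical, and it assumes that each data phase transfers one bus-width word and occupies one clock plus the inserted wait states.

```python
def peak_throughput_mb_s(bus_width_bits, clock_mhz):
    """Peak burst throughput in megabytes per second with no wait states."""
    return (bus_width_bits / 8) * clock_mhz


def burst_throughput_mb_s(bus_width_bits, clock_mhz, wait_states):
    """Burst throughput when every data phase carries `wait_states` wait states.

    Assumption: a data phase takes 1 + wait_states clocks.
    """
    return peak_throughput_mb_s(bus_width_bits, clock_mhz) / (1 + wait_states)


for clock_mhz in (33, 66):
    peak = peak_throughput_mb_s(32, clock_mhz)
    stalled = burst_throughput_mb_s(32, clock_mhz, wait_states=1)
    print(f"{clock_mhz} MHz: peak {peak:.0f} MB/s, "
          f"one wait state {stalled:.0f} MB/s, loss {peak - stalled:.0f} MB/s")
```

The loss in megabytes per second doubles when the clock doubles, which is why a faster clock alone does not recover the cycles lost to wait states.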
The Delayed Transaction protocol also causes a significant loss in throughput because bus cycles that could be used for data transfers are used to support a high-overhead protocol. Bus cycles are wasted when the target replies with a disconnect, and again when the initiator ends the current transaction, lets the bus go idle, re-arbitrates for the bus, and then re-performs the address phase of the disconnected transaction. Thus, the cost of each retry is at least 6 clocks: 4 clocks to return the bus to the idle state, at least one clock for arbitration, and at least one more clock for an address phase. During these 6 clocks an entire 4 dword burst could have occurred.
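The per-retry cost described above can be tallied as follows. This Python sketch is a rough model under stated assumptions: the constants are the minimum clock counts given in the text, the function names are hypothetical, and the burst is assumed to move one dword per clock with no wait states. The address phase of the final, successful attempt is ignored.

```python
# Minimum clock counts per retry, taken from the description above.
IDLE_RETURN_CLOCKS = 4     # return the bus to the idle state
ARBITRATION_CLOCKS = 1     # at least one clock to re-arbitrate
ADDRESS_PHASE_CLOCKS = 1   # at least one clock to repeat the address phase

CLOCKS_PER_RETRY = IDLE_RETURN_CLOCKS + ARBITRATION_CLOCKS + ADDRESS_PHASE_CLOCKS


def retry_overhead_clocks(retries):
    """Minimum number of clocks consumed by `retries` retry requests."""
    return retries * CLOCKS_PER_RETRY


def data_clock_fraction(dwords, retries):
    """Fraction of clocks that carry data (assumed: one dword per clock)."""
    return dwords / (dwords + retry_overhead_clocks(retries))


print(retry_overhead_clocks(1))    # minimum clocks lost to a single retry
print(data_clock_fraction(4, 1))   # a 4 dword burst preceded by one retry
```

Under these assumptions, a single retry ahead of a 4 dword burst leaves fewer than half of the clocks carrying data, and each further retry lowers the fraction again.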
In both of these cases, throughput is lost because the target is not ready to respond. Clocking the bus faster to improve the throughput only causes more clock cycles to be lost due to wait states and the inefficient Delayed Transaction protocol.
An updated version of the PCI bus, PCI-X, was developed to address these and other deficiencies. In the PCI-X specification, wait states are not permitted once data transfers have begun. A data burst, once started, must proceed at full speed on the bus. The read, write and split request transactions for PCI-X are shown for reference in FIG. 3. Each transaction type has an Address/Cmd phase, followed by an attribute phase and a response phase. After the response phase a data transfer ensues. Only the response and first data phase are extensible by adding a limited number of wait states. After the first data phase, the remaining data phases must proceed at one bus clock per data phase.
Additionally, in the PCI-X specification, the inefficient Delayed Transactions have been replaced by Split Transactions, as shown in FIG. 3.
In a Split Transaction, a Requester initiates a transfer 75 by performing an address/cmd phase 86, an attribute phase 88, a response phase 90, an unused data phase 92 and a surrender phase 94. Upon receiving a Split Response Request 96 from a Completer at the appropriate time in the transaction, the Requester removes itself from the bus, commits resources to the transaction and suspends the transaction until the Completer responds. This makes the bus available for use by other Requesters and Completers in the interim. When the data transfer is ready to occur at the Completer, the Completer acts as an Initiator, obtaining the bus and performing a Split Completion transaction 71, which includes an address phase 70, an attribute phase 72, a response phase 74 and one or more data phases 76a-d during which the requested data is transferred to the Requester.
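The sequence of events in a Split Transaction can be illustrated with a small behavioral model. The Python sketch below is not a model of the PCI-X signaling; the class names, method names and the `log` list are hypothetical and serve only to show that the bus is free for other traffic between the split request and the Split Completion.

```python
from collections import deque


class Bus:
    """Records which agent ran which transaction, in bus order."""
    def __init__(self):
        self.log = []

    def run(self, owner, transaction):
        self.log.append((owner, transaction))


class Completer:
    def __init__(self, name, memory):
        self.name = name
        self.memory = memory
        self.pending = deque()  # split requests accepted but not yet completed

    def accept(self, requester, address):
        """Answer a read request with a Split Response and queue the work."""
        self.pending.append((requester, address))
        return "split_response"

    def complete_next(self, bus):
        """Act as an Initiator and return the data in a Split Completion."""
        requester, address = self.pending.popleft()
        bus.run(self.name, "split_completion")
        requester.receive(address, self.memory[address])


class Requester:
    def __init__(self, name):
        self.name = name
        self.outstanding = set()  # resources committed to suspended reads
        self.data = {}

    def read(self, bus, completer, address):
        bus.run(self.name, "split_request")
        if completer.accept(self, address) == "split_response":
            self.outstanding.add(address)  # suspend until the completion

    def receive(self, address, value):
        self.outstanding.discard(address)
        self.data[address] = value


bus = Bus()
completer = Completer("completer", {0x100: 0xAB})
requester_a = Requester("requester_a")
requester_a.read(bus, completer, 0x100)
bus.run("requester_b", "other_traffic")  # the bus is free while A waits
completer.complete_next(bus)
print(bus.log)
```

In this model the other traffic appears in the log between the request and the completion, whereas under the Delayed Transaction protocol the same interval would be occupied by repeated retries from the original initiator.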
To make it easier to conform to the newer PCI-X specification and to improve the performance of the older PCI protocol, it is best that both the Requester and Completer be implemented with read and write storage buffers so that when a write or read data burst is ready to occur, it can proceed at full bus speed. Additionally, both the Requester and Completer are likely, in most implementations, to have Initiator and Target interfaces to carry out the Split Transaction protocol, and each interface is required to be registered on both inputs and outputs.
However, the use of read and write buffers and Initiator and Target interfaces on the adapter increases the chances that PCI and PCI-X read/write ordering and deadlock avoidance rules may not be met.
PCI bus ordering rules require that if write data is posted to a write buffer (such as a posted-write buffer in a PCI-to-PCI or host/PCI bridge), the data must be flushed to its final destination (memory) before a read of that same data is allowed by the same or a different bus master. Also, a bridge must perform all posted writes in the same order in which they were originally posted and is only permitted to post writes to regular memory targets.
On the PCI-X bus, there are more extensive read-write ordering rules when buffers are involved because of Split Transactions. For example, for bridges between a PCI-X bus and a host bus or between two PCI-X busses, there are three sets of rules, as set forth, in summary, below. The rules are set forth in more detail on pages 573-577 of PCI-X System Architecture, Tom Shanley, ISBN 0-201-72682-3, which is incorporated by reference into the present application.
Case I. A posted memory write transaction (PMW) is received in a bridge.
    (i) a subsequent split read request (SRR) or split write request (SWR) cannot be reordered ahead of the PMW, in order to avoid returning incorrect read data (SRR) and to maintain write ordering (SWR);
    (ii) a subsequent split read completion (SRC) generally cannot be reordered ahead of the PMW, in order to avoid returning incorrect read data;
    (iii) a subsequent split write completion (SWC) may be permitted to pass the PMW because the writes are in different directions; and
    (iv) a subsequent PMW generally cannot be performed until the first PMW is completed, and PMWs must complete in the order received to maintain write ordering.
Case II. A split read request (SRR) or split write request (SWR) occurs at a bridge.
    (i) a subsequent split read request or split write request can be reordered; and
    (ii) a subsequent split read completion, split write completion or posted memory write must be allowed ahead of the SRR or SWR to avoid a deadlock.
Case III. A split read completion (SRC) or a split write completion (SWC) occurs at a bridge.
    (i) a subsequent split read request (SRR), split write request (SWR), split read completion (SRC) or split write completion (SWC) can be reordered; and
    (ii) a posted memory write must be allowed ahead of the SRC or SWC to avoid a deadlock.
Though there are some exceptions to these rules if a relaxed ordering (RO) bit is set in a transaction, the rules set forth the ordering of reads and writes so as to guarantee the proper operation of system software (Case I) and the avoidance of deadlocks (Cases II and III) for a bridge between the host bus and a PCI-X bus or between two PCI-X busses.
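The three cases summarized above can be expressed as a lookup from an earlier transaction and a later transaction to a passing rule. The Python sketch below encodes only the summary given here; the type and function names are hypothetical, the relaxed ordering (RO) exceptions are deliberately ignored, and the cited reference remains the authority on the full rules.

```python
from enum import Enum


class Txn(Enum):
    PMW = "posted memory write"
    SRR = "split read request"
    SWR = "split write request"
    SRC = "split read completion"
    SWC = "split write completion"


class Passing(Enum):
    NO = "later transaction must not pass the earlier one"
    MAY = "later transaction may pass the earlier one"
    MUST_ALLOW = "later transaction must be allowed to pass (deadlock avoidance)"


REQUESTS = frozenset({Txn.SRR, Txn.SWR})


def passing_rule(earlier, later):
    """Return the rule for `later` passing `earlier` (RO bit assumed clear)."""
    if earlier is Txn.PMW:
        # Case I: only a split write completion may pass a posted write.
        return Passing.MAY if later is Txn.SWC else Passing.NO
    if earlier in REQUESTS:
        # Case II: requests may be reordered among themselves; completions
        # and posted writes must be allowed ahead.
        return Passing.MAY if later in REQUESTS else Passing.MUST_ALLOW
    # Case III: the earlier transaction is a split completion.
    return Passing.MUST_ALLOW if later is Txn.PMW else Passing.MAY


print(passing_rule(Txn.PMW, Txn.SRR).name)
print(passing_rule(Txn.SRR, Txn.PMW).name)
```

Keeping the three outcomes distinct separates the rules that protect software correctness (NO) from those that exist to avoid deadlock (MUST_ALLOW).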
The use of read and write buffers and both an Initiator and a Target interface on adapter units connected to the PCI bus, along with the split transaction protocol, creates a need to maintain a certain ordering of read and write transactions on the adapter units.
Currently, one method of dealing with the ordering problem is to control all of the read-write transaction activity from a single thread, thereby serializing all of the transactions from a single point of control. While this may assure that the ordering problem is correctly addressed, the single thread approach is performance limiting both to the adapter and the system.
Therefore, there is a need to address the ordering problem on an adapter that has read and write buffers, Initiator and Target interfaces that connect to a strongly-ordered bus, such as the PCI or PCI-X bus, and a split transaction protocol, without using a single thread to serialize all of the activity of the adapter. The present invention is directed towards such a need.