The invention relates to write posting in multi-processor systems. More particularly, the invention relates to methods and apparatus for write posting with global ordering in a multi-processor system having multiple paths from multiple processors to a single I/O device.
Computer systems include a means for CPUs to transmit data to input-output (I/O) units. The transit latency for such transactions is typically high, on the order of hundreds or thousands of CPU cycles. To maximize the performance of sequences of such transactions, it has become commonplace to employ pipelining, known in this particular context as write-posting. With write-posting, a CPU may emit a successor write before the preceding write has progressed all the way to its destination.
Certain challenges to maintaining system ordering can emerge in the context of write-posting. For example, bus protocols frequently include a “retry” mechanism. If two successive posted writes are sent, and the first is retried, the writes may arrive at their destination in the reverse of the order in which they were transmitted.
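The reordering hazard can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not part of the original text: a single destination, a retried write that is reissued behind any younger writes already in flight, and a reissue that is always accepted. The function name `deliver_with_retry` and the labels `W1` and `W2` are hypothetical.

```python
from collections import deque

def deliver_with_retry(writes, retry_first_attempt):
    """Return the order in which posted writes reach the destination.

    writes: write labels in issue order.
    retry_first_attempt: labels whose first attempt is retried by the bus.
    Assumption: a retried write is reissued behind the writes still pending.
    """
    pending = deque(writes)
    already_retried = set()
    arrived = []
    while pending:
        w = pending.popleft()
        if w in retry_first_attempt and w not in already_retried:
            already_retried.add(w)
            pending.append(w)      # reissued later, behind younger writes
        else:
            arrived.append(w)      # accepted by the destination
    return arrived

print(deliver_with_retry(["W1", "W2"], {"W1"}))   # W2 overtakes the retried W1
print(deliver_with_retry(["W1", "W2"], set()))    # no retry, issue order kept
```

With `W1` retried, the younger write `W2` reaches the destination first, which is the ordering violation described above.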
Furthermore, in multiple-CPU systems, it is generally necessary for writes from different CPUs sent to the same destination to be coordinated. Assuming that some ordering relationship between CPUs has been established by some mechanism, such as a memory semaphore, I/O writes emitted by the CPU ordered first must arrive at the destination before those from the CPU ordered second.
Both of these problems are easily solved in single-bus systems, that is, systems in which there is only one path from any CPU to each I/O device. The “retry” problem can be solved by retrying all writes to the same destination once the first is retried, coupled with a protocol requirement that only the oldest retried write is reissued until it is accepted. The second problem cannot arise by definition; if there is only one path, arrival order must equal issue order.
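The single-bus retry rule just described can be sketched as follows. This is a minimal model, not a bus implementation; it assumes a single destination and that each reissued write is accepted on its second attempt. The function name `deliver_single_bus` is hypothetical.

```python
from collections import deque

def deliver_single_bus(writes, retry_first_attempt):
    """Return arrival order under the single-bus retry rule.

    Rule modeled: once one write to the destination is retried, every
    younger write to that destination is retried as well, and only the
    oldest retried write is reissued until it is accepted.
    Assumption: a reissued write is accepted on its second attempt.
    """
    arrived = []
    retried = deque()              # retried writes, oldest first
    for w in writes:
        if retried or w in retry_first_attempt:
            retried.append(w)      # held back behind the older retried write
        else:
            arrived.append(w)      # accepted on the first attempt
    while retried:
        arrived.append(retried.popleft())   # reissue the oldest only
    return arrived

print(deliver_single_bus(["W1", "W2", "W3"], {"W1"}))
print(deliver_single_bus(["W1", "W2", "W3"], {"W2"}))
```

In this model the arrival order equals the issue order regardless of which write is retried, because no younger write is allowed to complete while an older one is outstanding.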
Various means have been used to solve the above problems in multi-path systems. One is the use of a special “sync” operation. Here, I/O writes are posted as for a single-path system. Before a different CPU is given permission to begin writing to a device, however, the previously writing CPU issues the “sync”. The “sync” acts as a probe to ensure that all paths from the issuing CPU to all possible destinations have been drained. When the probe indicates that it is complete, the first CPU indicates to the second CPU, via a semaphore or interrupt, that it may proceed. An ordering fence, either implied or explicit, is used between the sync and the proceed indication. Performance and complexity are the drawbacks to this approach. Complexity increases with attempts to restrict the scope of the sync to the minimum necessary. Conversely, the simpler and more “heavyweight” the sync operation is, the more performance suffers.
Another approach is to follow the last write from the first CPU with a read from that CPU to the same destination. An implicit or explicit fence can separate the return of that read from the indication for the next CPU to proceed. The disadvantages here are software constraints and complexity. Legacy software written for single-path systems, where this read is not necessary, can be difficult to modify. The transfer of control may be managed by a layer of software that is isolated from the layer doing the writes; thus, whether the read is necessary, and where to direct it, is difficult to determine. An example of this is OS pre-emption and process migration.
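The read-after-write handoff can be illustrated with a toy model of two CPUs that reach one device over separate paths. The sketch assumes, as is common in bus ordering rules but not stated in the original text, that a read cannot pass earlier posted writes on the same path, so the read returns only after those writes have reached the device. The class `Path`, the function `handoff`, and the write labels are hypothetical.

```python
class Path:
    """One path from a CPU to the shared device (toy model)."""

    def __init__(self, device_log):
        self.in_flight = []            # posted writes not yet at the device
        self.device_log = device_log   # arrival order seen by the device

    def post_write(self, value):
        self.in_flight.append(value)   # posted: returns immediately

    def drain(self):
        self.device_log.extend(self.in_flight)
        self.in_flight.clear()

    def read(self):
        # Assumption: a read pushes earlier posted writes ahead of it.
        self.drain()
        return len(self.device_log)


def handoff(use_read_flush):
    """CPU A writes, hands off to CPU B, and B writes over a faster path."""
    log = []
    path_a, path_b = Path(log), Path(log)
    path_a.post_write("A1")
    path_a.post_write("A2")
    if use_read_flush:
        path_a.read()                  # returns only after A1 and A2 arrive
    # CPU A now signals CPU B (e.g. via a semaphore) that it may proceed.
    path_b.post_write("B1")
    path_b.drain()                     # B's path happens to drain first
    path_a.drain()
    return log

print(handoff(use_read_flush=True))
print(handoff(use_read_flush=False))
```

Without the read, the write from the second CPU can reach the device ahead of the first CPU's still-posted writes; with it, the first CPU's writes are guaranteed to have arrived before the handoff.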
Accordingly, a need exists for an apparatus to accommodate write posting with global ordering in multiple path systems.
An apparatus and method consistent with the present invention permit write posting with global ordering in a multipath system. The apparatus includes a bus adapter having an input port to receive one or more operations from a processor; a queue, controlled by the bus adapter, to buffer information from the one or more operations; a control circuit, coupled to the queue, to generate an output signal that relates to the information from the one or more operations, the output signal being transmitted to the processor; an interconnect fabric, coupled to each bus adapter, to transmit the one or more operations; and a device, connected to the interconnect fabric, to receive the transmitted operations. The device sends an acknowledgment signal to the processor upon receiving a transmitted operation.