As bandwidth demand increases for modern computer systems, the traditional shared bus architecture becomes increasingly difficult to scale. High-performance systems are moving toward packet-oriented, point-to-point interconnection.
In one background packet switching system, most coherent transactions may be finished out of order, while strongly ordered writes are issued and retired one at a time. However, such one-at-a-time sequential processing limits the performance of the system. To enhance the streaming performance of strongly ordered writes, another background approach would be to track every strongly ordered write in the system fabric, using multiple messages between each switch element or fork to retire the writes in the proper order. While this approach does allow out-of-order execution of strongly ordered write streams, it adds considerable message overhead and complexity due to potential retries of ordered writes.
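The performance gap described above can be illustrated with a toy timing model. The following is only a sketch under stated assumptions, not a description of any actual fabric: the function names, the per-write latencies (in arbitrary cycles), and the one-cycle issue interval are all hypothetical. It compares issuing each strongly ordered write only after the previous one retires against issuing writes back to back, letting them complete out of order, and retiring them in issue order.

```python
def serialized_cycles(latencies):
    """One at a time: each write is issued only after the previous one retires."""
    return sum(latencies)


def streamed_retire_times(latencies, issue_interval=1):
    """Writes are issued back to back and may complete out of order,
    but each write retires no earlier than the write issued before it."""
    retire_times = []
    previous_retire = 0
    for index, latency in enumerate(latencies):
        completed = index * issue_interval + latency  # when this write finishes
        previous_retire = max(previous_retire, completed)  # enforce in-order retirement
        retire_times.append(previous_retire)
    return retire_times


# Hypothetical per-write fabric latencies, in cycles.
latencies = [40, 25, 60, 30]
print(serialized_cycles(latencies))          # 155
print(streamed_retire_times(latencies))      # [40, 40, 62, 62]
```

In this model the second write completes at cycle 26 but cannot retire until cycle 40, when the first write retires; the whole stream nonetheless finishes in 62 cycles rather than 155. The model ignores the message overhead and retries that the tracking approach would add.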
Coherent nodes with coherent ordered write streams can issue out-of-order read-for-ownership (RFO) transactions and thus achieve high streaming performance, but the streaming depth is limited to the buffer size at the node, and the buffers in the system fabric are not efficiently utilized. Owing to a quirk of the Microsoft OS implementation, even uncacheable (UC) transactions such as UC writes and USWC writes have to be treated as coherent writes because of cache attribute aliasing. Thus, all writes to memory-mapped devices are effectively strongly ordered coherent writes.
A typical PC system cannot afford the cost of a fully coherent IO node (south bridge). The south bridge in a PC is best described as semi-coherent. While a south bridge can issue streams of coherent reads with no coherent buffers or caches, writes from the south bridge originate from PCI bus/bridges and are strongly ordered. If those strongly ordered writes are transferred one at a time, system performance will be extremely poor.
Attention is directed to U.S. Pat. No. 6,356,983 B1 issued to Parks on 12 Mar. 2002, the background section of which provides a good discussion of some background caching and coherency approaches.