1. Field of the Invention
The present invention generally relates to an input/output (I/O) hub. More particularly, the present invention relates to an I/O hub that is adapted to implement Producer-Consumer (P/C) ordering rules across an interface that is inherently unordered in a multi-processor computer system architecture.
2. Discussion of the Related Art
Multi-processor computer systems are designed to accommodate a number of central processing units (CPUs), coupled via a common system bus or switch to a memory and a number of external input/output devices. The purpose of providing multiple central processing units is to increase the performance of operations by sharing tasks between the processors. Such an arrangement allows the computer to simultaneously support a number of different applications while supporting I/O components that are, for example, communicating over a network and displaying images on attached display devices. Multi-processor computer systems are typically utilized for enterprise and network server systems.
An input/output hub may be provided as a connection point between various input/output bridge components, to which input/output components are attached, and ultimately to the central processing units. Many input/output components are Peripheral Component Interconnect (PCI) (xe2x80x9cPCI Local Bus Specification, Revision 2.1, Jun. 1, 1995, from the PCI Special Interest Group (PCI-SIG)) devices and software drivers that adhere to the PCI Producer-Consumer (P/C) model and its ordering rules and requirements. (xe2x80x9cPCI Local Bus Specificationxe2x80x9d, Revision 2.1, Appendix E, xe2x80x9cSystem Transaction Orderingxe2x80x9d.) For example, these ordering rules allow writes to be posted for higher performance while ensuring xe2x80x9ccorrectnessxe2x80x9d. Posting means that the transaction is captured by an intermediate agent, e.g., a bridge from one bus to another, so that the transaction completes at the source before it actually completes at the intended destination. Posting allows the source to proceed with the next operation while the transaction is still making its way through the system to its ultimate destination. In other words, write posting in a PCI device means that the writes that are issued are not expected to return a xe2x80x9ccompletexe2x80x9d response. That is, when posted writes are issued, there is no confirmation returned indicating that the write is completed. The term xe2x80x9ccorrectnessxe2x80x9d implies that a flag or semaphore may be utilized to guard a data buffer between a Producer-Consumer pair.
Coherent interfaces interconnecting the I/O hub and, ultimately, to the processors, are inherently unordered. Therefore, ordering rules under the P/C model are more restrictive than those for a coherent interface, which may have no ordering rules at all. Coherent interfaces, such as a front-side bus or an Intel Scalability Port, are inherently unordered because the processors for which the coherent interface was designed are complex devices. These processors have the intelligence to distinguish when ordering is required and when it is not. Therefore, in general, coherent interfaces can treat completions independently of requests (in either direction). PCI devices, however, are generally not this complex and are more cost-sensitive, and therefore rely on the system ordering rules to avoid deadlocks. PCI ordering rules do allow some flexibility in relaxing the ordering requirements of specific transactions, though.
It is particularly beneficial to retain the use of PCI devices and devices that follow the P/C ordering model, as they are generally designed toward cost-sensitivity. Accordingly, what is needed is a cost-effective optimized chipset implementation that bridges an ordered domain (one which requires PCI ordering and follows the P/C ordering model) and an unordered domain, such as a coherent interface in connection with a plurality of processor units, without any additional software or hardware intervention. Because a PCI device is generally designed towards cost-sensitivity and may not exploit the relaxations in the PCI ordering rules, there is a need for a system that can exploit the performance optimizations allowed with the PCI ordering rules by employing all of the ordering relaxation capabilities on behalf of these devices, while at the same time avoiding any deadlock vulnerabilities and performance penalties.