1. Technical Field
An embodiment of the invention pertains generally to processor systems, and in particular pertains to scalable processor systems.
2. Description of the Related Art
With the rapid evolution of the Internet, requirements for enterprise server systems have become increasingly diverse. Front-end and departmental servers are very cost- and power-sensitive, while back-end servers that traditionally run database-type applications require the highest level of performance along with multi-dimensional scalability and 24×7 availability. This segmentation of the server platforms has led to the development of a multitude of chipsets. A chipset encompasses the major system components that move data between the main memory, the processor(s) and the I/O devices. System vendors have either designed separate chipsets with different system architectures to address the needs of different server segments, or used industry-standard components to address the needs of low-end systems and designed proprietary components for mid-range and high-end systems.
Current systems have memory connected to a processor to store data, such as data that is accessed in streams. A stream is a contiguous sequence of requests from an agent, typically connected to the processor and memory system via a chipset or the like. The memory may include dynamic random access memory, and the requests are processed in the same order as they are received. Processing the requests in the order received reduces memory bandwidth when, for example, a page replace conflict or DIMM turnaround conflict forces a transaction to wait for a prior transaction to finish. Further, current systems provide a single path for cache coherency operations and data transfer, causing cache coherency transactions to wait for data transfers and thereby increasing snoop latency.
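The bandwidth cost of in-order processing described above can be illustrated with a minimal sketch. The cost values, the (bank, row) request model, and the function name in_order_cost are assumptions made for illustration only and do not describe any particular memory controller.

```python
def in_order_cost(requests, hit_cost=1, conflict_cost=3):
    """Total cost of servicing (bank, row) requests strictly in arrival order.

    hit_cost and conflict_cost are hypothetical time units, not real timings.
    """
    open_row = {}  # bank -> row (page) currently open in that bank
    total = 0
    for bank, row in requests:
        if open_row.get(bank) == row:
            total += hit_cost        # requested page is already open
        else:
            total += conflict_cost   # page must be opened or replaced first
            open_row[bank] = row
    return total


# Two interleaved streams that alternate between two pages of the same bank.
alternating = [(0, 1), (0, 2), (0, 1), (0, 2)]
# The same requests, grouped by page for comparison.
grouped = sorted(alternating)
```

Under these assumed costs, the alternating sequence incurs a page replace on every request, whereas the grouped sequence incurs one only when the page changes.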
Coherent transactions limit the bandwidth for transactions from a peripheral input-output (I/O) bus in processor-based systems such as desktop computers, laptop computers and servers. Processor-based systems typically have a host bus that couples a processor and main memory to ports for I/O devices. The I/O devices, such as Ethernet cards, couple to the host bus through an I/O controller or bridge via a bus such as a peripheral component interconnect (PCI) bus. The I/O bus has ordering rules that govern the order in which transactions are handled, so an I/O device may count on that ordering when issuing transactions. Because the I/O devices may count on the ordering of transactions, they may issue transactions that would otherwise cause unpredictable results. For example, after an I/O device issues a read transaction for a memory line and subsequently issues a write transaction for the same memory line, the I/O device expects the read completion to return the data present before the new data is written. However, the host bus may be an unordered domain that does not guarantee that transactions are carried out in the order received from the PCI bus. In these situations, the I/O controller governs the order of transactions.
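The read-then-write example above can be sketched as follows. The transaction tuples, the address 0x40, and the function name run are hypothetical and serve only to show why the order of handling matters.

```python
def run(transactions, memory):
    """Apply transactions in the given order and return the read completions."""
    completions = []
    for kind, line, value in transactions:
        if kind == "read":
            completions.append(memory[line])  # data returned to the device
        else:
            memory[line] = value              # write updates the memory line
    return completions


# The device issues a read and then a write to the same (hypothetical) line.
issued = [("read", 0x40, None), ("write", 0x40, "new")]

# Handled in the order issued, the read completion returns the old data.
ordered = run(issued, {0x40: "old"})

# If an unordered domain carried out the write first, the read would
# return the new data instead, which the device does not expect.
reordered = run(issued[::-1], {0x40: "old"})
```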
To govern the order of inbound transactions (transactions toward main memory and/or processors) from an I/O bus, the I/O controller places the transactions in an ordering queue in the order received, and waits to transmit each inbound transaction across the unordered interface until the ordering rules corresponding to that transaction are satisfied. However, issuing transactions one at a time as each transaction satisfies the ordering rules may limit the latency of a transaction to no less than the nominal snoop latency for the system. In addition, when multiple I/O devices transmit coherent transactions to the I/O controller, transactions unnecessarily wait in the ordering queue for coherent transactions with unrelated ordering requirements. For example, in conventional systems, a read transaction received subsequent to a write transaction for the same address will wait for the write transaction to issue even though the read transaction may have issued from a different I/O device, in which case the read transaction is subject to ordering rules independent of those of the write transaction. As a result, the latency of the snoop request, or ownership request, for the write transaction adds to the latency of the read transaction. Further, when a conflict exists with the issuance of the ownership request for the write transaction, the latency of the write transaction, as well as that of the read transaction, will be longer than the nominal snoop latency for the system.
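The way latencies accumulate in a strictly serialized ordering queue can be sketched as follows. The device labels, the latency values, and the function name serialized_completion_times are assumptions for illustration; actual controllers and snoop latencies differ.

```python
def serialized_completion_times(queue):
    """Completion time of each transaction when issued one at a time, in order.

    Each queue entry is (device, kind, latency); latency is in hypothetical
    time units and stands in for the snoop or ownership request latency.
    """
    clock = 0
    times = []
    for _device, _kind, latency in queue:
        clock += latency      # next transaction cannot issue until this one is done
        times.append(clock)
    return times


NOMINAL = 10  # assumed nominal snoop latency

# A read from device B queued behind a write from device A.
no_conflict = [("A", "write", NOMINAL), ("B", "read", NOMINAL)]

# Same queue, but the write's ownership request hits a conflict and takes longer.
with_conflict = [("A", "write", 25), ("B", "read", NOMINAL)]
```

In both cases the read from device B completes later than its own nominal latency, because the latency of the unrelated write from device A is added in front of it.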
I/O devices continue to demand increasing bandwidth, which increases the amount of time transactions remain in an ordering queue. For example, in conventional products, two kinds of delay can escalate in proportion to bandwidth: the delay of a foreseeable read transaction that waits to access a memory line across the unordered interface, and the delay of a read transaction that waits for a write transaction to satisfy ordering requirements even though the write transaction will write to a different memory line.
Elements shown in the Figures are presented as examples, and do not show all embodiments that are possible.