Serial-link I/O interconnect protocols, such as PCI Express, are used in computing and communication platforms to interconnect the components of the platforms and enable efficient I/O transfers between those components. The PCI Express protocol has a clearly defined traffic flow concept, the core of which is the use of buffers and credits. Buffers are provided to better manage the sending and receiving of data through the PCI Express Link. Credits, on the other hand, are flow control units that define how much data can be received by the opposite side of the PCI Express Link.
FIG. 1 is a schematic diagram of a common PCI Express Link. The PCI Express Link is dual simplex, meaning that each module has a transmit (TX) side and a receive (RX) side running independently. As shown in the figure, the TX side of Module 1 feeds the RX side of Module 2.
Each module contains an RX side buffer that is able to accept the incoming stream of data from the opposite TX side. Each module also contains a TX side buffer that is able to send the appropriate data transfer type (Posted, Non-posted, or Completion) over the PCI Express Link. The RX and TX buffers are needed both to improve the performance of the PCI Express Link and to match speeds between the protocol's core frequency and the chip's system clock frequency.
In operation, each application (module) advertises to the opposite side how much data it can receive. Available credits are advertised per transaction type. PCI Express distinguishes between six credit types: posted data credits, posted header credits, non-posted data credits, non-posted header credits, completion data credits and completion header credits. The PCI Express credit scheme is explained in detail in the PCI Express Base Specification 1.0a, which is incorporated herein by reference. As described in the Specification, separate credit types are required for each of the transaction types to avoid potential data deadlocks in the transmission of the different transaction types across the PCI Express link.
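By way of illustration only, the six credit types listed above can be modeled as a small table of counters, one per type. The following is a minimal sketch; the names CreditType, empty_credit_table and advertise are hypothetical and do not appear in the Specification, and the credit amounts used with them are arbitrary.

```python
from enum import Enum


class CreditType(Enum):
    """The six credit types named in the text (identifiers are illustrative)."""
    POSTED_HEADER = "posted header"
    POSTED_DATA = "posted data"
    NON_POSTED_HEADER = "non-posted header"
    NON_POSTED_DATA = "non-posted data"
    COMPLETION_HEADER = "completion header"
    COMPLETION_DATA = "completion data"


def empty_credit_table():
    """Return a table with zero credits advertised for every credit type."""
    return {credit_type: 0 for credit_type in CreditType}


def advertise(table, credit_type, amount):
    """Record that the receiver has advertised `amount` more credits of one type."""
    table[credit_type] += amount
    return table
```

Keeping one counter per credit type mirrors the requirement that a shortage of one type must not be confused with a shortage of another.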
FIG. 2 is a more detailed schematic diagram of Module 1 of the common PCI Express Link of FIG. 1. An application 102 typically has multiple internal clients (Client 1, Client 2, . . . ) that communicate with other modules (not shown) over the PCI Express link 104 through the PCI Express core 106. Generally, a communication of data is transmitted from a client to the application arbiter 110, which, based on an arbitration protocol, prioritizes the communications and transmits them to the synchronization buffer 112. Each client typically includes a buffer to optimize the data flow in case the application arbiter 110 is busy. The synchronization buffer 112 serves as a clock boundary crossing device, since the clock (not shown) of the PCI Express core 106 typically runs much faster than the application internal clock. Thus, the normal flow of the application data starts with a client sending data from its internal buffer through the application arbiter 110 to the synchronization buffer 112. The data in the synchronization buffer 112 is then transferred to the PCI Express core 106 through the application interface 108. The application interface 108 is the single link between the application 102 and the PCI Express core 106.
The PCI Express core 106 is responsible for propagating the data from the application 102 through the PCI Express link 104 TX interface 134 to its peer (not shown) on the opposite side of the PCI Express link 104. Under the rules of the PCI Express protocol, the PCI Express core 106 is allowed to send the data only if the peer indicates that it has a sufficient number of credits available for that particular transfer. For example, if the PCI Express core 106 wants to send a memory write transaction through the link 104, it has to make sure that there are enough posted header credits and posted data credits available for that particular transfer. A sufficient availability of those credits means that the opposite side of the PCI Express link 104 has provided enough storage for receiving and safely preserving the incoming packet. On the other hand, an insufficient number of credits means that the PCI Express core 106 is not allowed to send the packet, as it might cause a hazard at the opposite side. In that case the receiver is unlikely to accept the packet and will signal an error condition.
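The credit check described above for a memory write can be sketched as follows. This is a simplified illustration, not the logic of any particular core: it assumes that a packet consumes one header credit and one data credit per 16 bytes of payload, and the names DATA_CREDIT_BYTES, credits_needed and can_send are hypothetical.

```python
import math

# Assumption for illustration: one data credit covers 16 bytes of payload.
DATA_CREDIT_BYTES = 16


def credits_needed(payload_bytes):
    """Return (header_credits, data_credits) consumed by one packet."""
    return 1, math.ceil(payload_bytes / DATA_CREDIT_BYTES)


def can_send(available_header, available_data, payload_bytes):
    """True only if both the header and the data credits suffice."""
    need_header, need_data = credits_needed(payload_bytes)
    return available_header >= need_header and available_data >= need_data
```

Under these assumptions, a 128-byte memory write needs one posted header credit and eight posted data credits; if only seven data credits are available, the packet must be held back.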
It is very common for a client to send a transfer for which the PCI Express core 106 does not have enough credits to effectuate the transfer. However, the PCI Express core 106 might have enough credits to enable a data transfer for the next client using a different set (type) of credits. In that case, the PCI Express core 106 would not be able to accept either the first packet from the first client or the second packet from the second client. The first packet would be “stuck” on the application interface 108 until enough credits are available to enable the PCI Express core 106 to accept the data transfer from the first client. The second client's packet would also have to be stalled, as it cannot propagate through the application interface 108 before the first packet is gone. This packet-after-packet lock situation might cause a system deadlock, as the opposite side of the PCI Express link 104 could be expecting the second packet in order to free up credits for the first packet. The first packet, on the other hand, cannot go through as it does not have enough credits. This is a deadlock case.
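The packet-after-packet lock can be illustrated with a toy model of a single in-order interface. The sketch below is an assumption-laden simplification: each packet is a (type, cost) pair, each type has a single credit counter, and the function name drain_single_queue is hypothetical.

```python
from collections import deque


def drain_single_queue(packets, credits):
    """Send packets strictly in order; stop at the first one lacking credits.

    packets: iterable of (kind, cost) pairs in arrival order.
    credits: mapping from kind to available credits.
    Returns (sent, stalled) as two lists.
    """
    queue = deque(packets)
    credits = dict(credits)  # work on a copy
    sent = []
    while queue:
        kind, cost = queue[0]
        if credits.get(kind, 0) < cost:
            break  # head is blocked, so everything behind it stalls too
        credits[kind] -= cost
        sent.append(queue.popleft())
    return sent, list(queue)
```

With packets [("posted", 4), ("completion", 1)] and credits {"posted": 2, "completion": 8}, nothing is sent: the completion packet has ample credits but is stuck behind the posted packet.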
The PCI Express standard addresses the deadlock issue by providing reordering rules. In order to be able to implement the reordering rules, the PCI Express core 106 implements six FIFO buffers, i.e., posted data buffer 114a, posted header buffer 114b, non-posted data buffer 116a, non-posted header buffer 116b, completion data buffer 118a and completion header buffer 118b. This enables the PCI Express core 106 to handle a large number of transactions from the application 102 without having to stall the application interface 108.
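The benefit of separate FIFOs can be sketched with a toy model in which a packet of one type may be sent even though a packet of another type is waiting for credits. The model is deliberately simplified: it uses one FIFO and one credit counter per transaction type rather than separate header and data buffers, it ignores the detailed reordering rules of the standard, and the class name PerTypeFifos is hypothetical.

```python
from collections import deque


class PerTypeFifos:
    """One FIFO per transaction type; order is kept only within a type."""

    def __init__(self, kinds):
        self.fifos = {kind: deque() for kind in kinds}

    def accept(self, kind, cost):
        """Queue a packet of the given type that needs `cost` credits."""
        self.fifos[kind].append(cost)

    def drain(self, credits):
        """Send every packet whose FIFO head has enough credits.

        Returns the list of packet types in the order they were sent.
        """
        credits = dict(credits)  # work on a copy
        sent = []
        progress = True
        while progress:
            progress = False
            for kind, fifo in self.fifos.items():
                if fifo and credits.get(kind, 0) >= fifo[0]:
                    credits[kind] -= fifo.popleft()
                    sent.append(kind)
                    progress = True
        return sent
```

In this model, a posted packet needing four credits when only two are available stays queued, while a completion packet queued after it is still sent.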
While this type of PCI Express architecture is common, it requires very large PCI Express core buffers 114-118. Without very large buffers 114-118, the deadlock case threat still exists. For example, if one of the PCI Express core buffers 114-118 becomes full such that no other transactions of that type can pass, the application interface will become stalled.
In addition to the memory consumption disadvantage, there is a disadvantage relating to expedited packet handling. If, for example, one of the application clients is latency sensitive and its packets need to be handled quickly, it will not receive the required attention through the PCI Express core 106 even if the application 102 could use a synchronization buffer bypass mode in order to quickly transmit its packets to the application interface 108. The packets to be expedited at the application level will be stacked into one of the PCI Express FIFOs, depending on the packet type. Each such packet will be transferred only when all of the previous packets of the same type have been transferred to the opposite side of the link 104. Consequently, if the PCI Express FIFO buffers 114-118 are very large in order to minimize the deadlock case threat described above, the latency for the low latency packets is increased.
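The latency penalty can be illustrated by counting how many packets must leave a FIFO before a newly queued urgent packet reaches the head. This is a minimal sketch; the function name transfers_before and the packet labels are hypothetical, and the urgent packet is assumed not to be in the FIFO already.

```python
from collections import deque


def transfers_before(fifo_contents, urgent):
    """Count the packets sent before `urgent`, which is queued last."""
    queue = deque(fifo_contents)
    queue.append(urgent)  # a FIFO offers no way to jump the queue
    sent_before = 0
    while queue[0] != urgent:
        queue.popleft()
        sent_before += 1
    return sent_before
```

The count equals the number of packets already in the FIFO, so the deeper the FIFO is allowed to fill, the longer an expedited packet can wait.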