A computing system typically includes a computing processor, peripheral devices, and a peripheral interconnect for facilitating communication between the computing processor and the peripheral devices. In many contemporary computing systems, the peripheral interconnect has a packet-based switching architecture. One such type of peripheral interconnect, which conforms to a PCI Express (PCIe) standard, is sometimes referred to as a PCIe interconnect. A PCIe interconnect includes one or more serial communication channels, each of which is capable of transmitting packets in both directions. Moreover, the serial communication channels of the PCIe interconnect may be combined to create a parallel interface of independently controlled serial communication channels, each of which is often referred to as a lane.
In some contemporary computing systems, a PCIe interconnect includes a root complex and a fanout switch connected to the root complex. The root complex is connected to a computing processor, and the fanout switch is connected to endpoints. In operation, an endpoint transmits packets to a downstream ingress queue of the fanout switch, and the downstream ingress queue forwards the packets to an upstream egress queue of the fanout switch. Additionally, the fanout switch transmits packets from the upstream egress queue to the root complex, and the root complex transmits the packets to the computing processor for processing.
In these types of computing systems, a packet may be a posted packet or a non-posted packet. If a packet received by the computing processor from the root complex is a posted packet, the computing processor terminates the packet. Otherwise, if a packet received by the computing processor from the root complex is a non-posted packet, the computing processor terminates the non-posted packet and transmits a corresponding completion packet to an upstream ingress queue of the fanout switch. The upstream ingress queue provides the completion packet to a downstream egress queue of the fanout switch and the downstream egress queue provides the completion packet to the endpoint that initially transmitted the non-posted packet to the fanout switch.
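The packet flow described above can be illustrated with a minimal sketch. It is not taken from any PCIe implementation: the `Packet` class, the `processor_receive`, `hop`, and `round_trip` functions, and the endpoint name `ep0` are all hypothetical, and each queue is modeled as a simple first-in, first-out buffer that holds one packet at a time.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    kind: str      # "posted", "non_posted", or "completion"
    endpoint: str  # hypothetical name of the endpoint that issued the request


def processor_receive(packet):
    """Terminate a request; return a completion only for a non-posted request."""
    if packet.kind == "non_posted":
        return Packet("completion", packet.endpoint)
    return None


def hop(packet, queue, name, path):
    """Pass one packet through a FIFO queue and record the hop."""
    queue.append(packet)
    path.append(name)
    return queue.popleft()


def round_trip(request):
    """Return (completion or None, list of hops) for one request."""
    path = []
    downstream_ingress, upstream_egress = deque(), deque()
    upstream_ingress, downstream_egress = deque(), deque()

    # Upstream direction: endpoint -> fanout switch -> root complex -> processor.
    pkt = hop(request, downstream_ingress, "downstream ingress", path)
    pkt = hop(pkt, upstream_egress, "upstream egress", path)
    path.append("root complex")
    path.append("processor")

    completion = processor_receive(pkt)
    if completion is None:
        return None, path  # posted packet: nothing travels back

    # Downstream direction: the completion returns to the requesting endpoint.
    pkt = hop(completion, upstream_ingress, "upstream ingress", path)
    pkt = hop(pkt, downstream_egress, "downstream egress", path)
    path.append(pkt.endpoint)
    return pkt, path
```

Calling `round_trip(Packet("non_posted", "ep0"))` in this sketch returns a completion addressed to `ep0` together with a path that ends at that endpoint, whereas a posted packet stops at the processor and produces no return traffic.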
In one type of PCIe interconnect, a downstream bandwidth from the root complex to the fanout switch is higher than an upstream bandwidth from an endpoint to the fanout switch. In this type of PCIe interconnect, the root complex may transmit completion packets to the upstream ingress port of the fanout switch at a data rate that is higher than the data rate at which the endpoint transmits corresponding non-posted packets to the downstream ingress port of the fanout switch. Moreover, if the endpoint transmits a stream of non-posted packets upstream to the fanout switch at an upstream data rate, the root complex may transmit a stream of corresponding completion packets downstream to the fanout switch at an initial downstream data rate that is higher than the upstream data rate.
If the stream of non-posted packets is sufficiently large, the downstream egress queue of the fanout switch fills to capacity, causing the upstream ingress queue of the fanout switch to also fill to capacity. In this way, the upstream ingress queue and the downstream egress queue become congested. Because the upstream ingress queue is congested, the downstream data rate at which the root complex transmits the stream of completion packets to the fanout switch decreases from its initial value, and the downstream bandwidth from the root complex to the fanout switch is underutilized. Moreover, the downstream data rate tends to decrease to the upstream data rate at which the endpoint transmits the stream of non-posted packets to the fanout switch.
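The backpressure effect described above can be illustrated with a small discrete-time sketch. It rests on several assumptions that are not stated in the text: the root complex always has completion packets waiting to be sent, the link to the endpoint drains completions at the same rate at which the endpoint transmits (`endpoint_rate`), both queues have the same capacity, and rates are counted in packets per tick. The function name `simulate` and all parameter values are hypothetical.

```python
from collections import deque


def simulate(ticks, downstream_rate=4, endpoint_rate=1, queue_capacity=8):
    """Return how many completions the root complex manages to send in each tick."""
    upstream_ingress = deque()   # receives completions from the root complex
    downstream_egress = deque()  # feeds completions to the endpoint
    sent_per_tick = []
    for _ in range(ticks):
        # 1. The endpoint link drains the downstream egress queue.
        for _ in range(min(endpoint_rate, len(downstream_egress))):
            downstream_egress.popleft()
        # 2. The upstream ingress queue forwards into whatever space is free.
        while upstream_ingress and len(downstream_egress) < queue_capacity:
            downstream_egress.append(upstream_ingress.popleft())
        # 3. The root complex sends only as much as the ingress queue can accept.
        space = queue_capacity - len(upstream_ingress)
        sent = min(downstream_rate, space)
        upstream_ingress.extend(["completion"] * sent)
        sent_per_tick.append(sent)
    return sent_per_tick


rates = simulate(20)
```

Under these assumptions, `rates` starts at the downstream rate of 4 packets per tick and, once both queues are full, settles at the endpoint rate of 1 packet per tick, which mirrors the underutilization of the downstream bandwidth described in the text.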
Because of the congestion at the upstream ingress port of the fanout switch, a completion packet may be delayed in reaching a second endpoint that has transmitted a corresponding non-posted packet to the root complex through the fanout switch. Additionally, the second endpoint competes for access to the upstream egress queue of the fanout switch, which may become congested because of the stream of non-posted packets that the first endpoint (the endpoint transmitting the stream) sends to the root complex through the fanout switch. If the upstream egress port of the fanout switch becomes congested, the non-posted packet transmitted by the second endpoint may be delayed in entering the upstream egress queue, causing the corresponding completion packet to be further delayed in reaching the second endpoint.
In light of the above, a need exists for reducing congestion in a packet switch. A further need exists for reducing congestion in a packet switch when a stream of non-posted packets and a corresponding stream of completion packets are transmitted through the packet switch.