Peripheral Component Interconnect Express (PCI Express or PCIe) is a high-performance, generic, and scalable system interconnect used for a wide variety of applications, such as a motherboard-level interconnect, a passive backplane interconnect, and an expansion card interface for add-in boards. The PCIe bus is a serial, full-duplex, multi-lane, point-to-point, packet-based, and switch-based interconnect. Current versions of PCIe buses allow for a transfer rate of 2.5 gigabits per second (Gbps), 5 Gbps, or 8 Gbps per lane, with up to 32 lanes.
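As a rough illustration of these figures, the raw one-direction rate of a link is the per-lane rate multiplied by the number of lanes. The following is a minimal Python sketch; the function name `raw_link_rate_gbps` is hypothetical, and the result ignores line-encoding and protocol overhead, so it should be read as an upper bound rather than an achievable throughput.

```python
# Per-lane transfer rates named in the text, in gigabits per second.
LANE_RATES_GBPS = (2.5, 5.0, 8.0)
MAX_LANES = 32


def raw_link_rate_gbps(lane_rate_gbps: float, lanes: int) -> float:
    """Raw one-direction rate of a link, ignoring encoding and protocol overhead."""
    if lane_rate_gbps not in LANE_RATES_GBPS:
        raise ValueError("unsupported per-lane rate")
    if not 1 <= lanes <= MAX_LANES:
        raise ValueError("lane count must be between 1 and 32")
    return lane_rate_gbps * lanes


if __name__ == "__main__":
    for rate in LANE_RATES_GBPS:
        print(rate, "Gbps per lane x", MAX_LANES, "lanes =",
              raw_link_rate_gbps(rate, MAX_LANES), "Gbps")
```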
The roundtrip time of a PCIe bus is a major factor in degrading the performance of the bus. As illustrated in FIG. 1A, the roundtrip is the time period elapsed from the transmission of data over a link 130, for example, by a PCIe root 110, to the acknowledgment of the data reception by a PCIe endpoint 120.
The roundtrip time of the PCIe bus 100 depends upon the delay of the link 130 between the PCIe root 110 and the PCIe endpoint 120. Typically, this delay is due to acknowledgement (ACK) and flow control update latencies introduced by the layers of the PCIe bus. Abstractly, PCIe is a layered protocol bus consisting of a transaction layer, a data link layer, and a physical layer.
The data link layer waits to receive an ACK signal for transaction layer packets during a predefined time window. If an ACK signal is not received during this time window, the transmitter (either at the PCIe root 110 or endpoint 120) resends the unacknowledged packets. This results in inefficient bandwidth utilization of the bus as it requires re-transmission of packets that do not have a data integrity problem. That is, high latency on the link 130 causes poor bandwidth utilization.
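The effect of a late ACK can be sketched with a simplified model: if the ACK takes longer to arrive than the time window, the transmitter resends a packet that was in fact delivered intact. The Python sketch below counts such unnecessary resends for a single packet. The function names and the fixed-window timer are assumptions made for illustration; an actual data link layer replays from a retry buffer and limits the number of replays, which is not modeled here.

```python
def spurious_replays(ack_round_trip_ns: int, replay_window_ns: int) -> int:
    """Number of timer expirations that occur strictly before the ACK arrives.

    Each expiration triggers one resend of a packet that had no
    data integrity problem (simplified single-packet model).
    """
    if ack_round_trip_ns < 0 or replay_window_ns <= 0:
        raise ValueError("round trip must be >= 0 and window must be > 0")
    if ack_round_trip_ns == 0:
        return 0
    # Expirations happen at window, 2*window, ...; count those before the ACK.
    return (ack_round_trip_ns - 1) // replay_window_ns


def useful_fraction(ack_round_trip_ns: int, replay_window_ns: int) -> float:
    """Share of transmissions that carried new data rather than a resend."""
    return 1 / (1 + spurious_replays(ack_round_trip_ns, replay_window_ns))
```

In this model, an ACK that arrives within the window costs nothing, while an ACK that takes several windows to arrive causes each packet to be sent several times.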
In addition, a typical PCIe bus includes a credit mechanism utilized to avoid a receiver buffer overflow. As the latency of a PCIe bus is typically low, the PCIe root 110 and endpoint 120 often implement small receiver buffers with a small number of credits. The fast PCIe link enables fast updates of flow controls (credits) and full bus performance. However, when the bus latency increases, the small number of flow control credits becomes a major limitation. Even if the receiver buffer is available, the flow control packet delay causes the transmitter (either at the PCIe root 110 or endpoint 120) to be idle for a long period prior to sending data. The result is an idle PCIe bus with low bandwidth utilization.
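The credit limitation can be approximated with a simple window model: the transmitter may have at most one credit window of data outstanding per credit round trip, so utilization falls once the link could carry more than that window during the round trip. The following Python sketch illustrates this under stated assumptions; the function name, the parameters, and the idea of expressing credits as a byte count are illustrative and do not reflect how any particular device accounts for credits.

```python
def credit_limited_utilization(credit_window_bytes: int,
                               link_rate_bytes_per_us: float,
                               round_trip_us: float) -> float:
    """Fraction of the link rate usable when credits return once per round trip."""
    if credit_window_bytes <= 0 or link_rate_bytes_per_us <= 0 or round_trip_us < 0:
        raise ValueError("window and rate must be > 0, round trip must be >= 0")
    # Bytes the link could carry while waiting for a credit update.
    bytes_needed_in_flight = link_rate_bytes_per_us * round_trip_us
    if bytes_needed_in_flight == 0:
        return 1.0
    # The transmitter idles once the credit window is exhausted.
    return min(1.0, credit_window_bytes / bytes_needed_in_flight)


if __name__ == "__main__":
    for rtt in (1, 10, 100):
        print(rtt, "us ->", credit_limited_utilization(4096, 1000, rtt))
```

With a hypothetical 4096-byte window on a link carrying 1000 bytes per microsecond, the model stays at full utilization for a 1 μs round trip and drops to about 4% at 100 μs. These numbers come from the model's assumed parameters only, not from measurements.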
The PCIe protocol allows read and write operations. A write operation issued between the PCIe root and an endpoint requires no feedback, so the issuer does not wait for completion of the operation. In addition, multiple write operations can be initiated in parallel. A read operation, however, requires feedback indicating completion of the read operation. For example, when data is read from an external disk (connected to the PCIe bus) into the memory of the PCIe root, the PCIe root should wait for a read completion message from the endpoint connected to the external disk prior to completing the read operation. In addition, only a limited number of read operations can be initiated concurrently.
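The cost of waiting for read completions can be illustrated with a small simulation in which each read occupies one of a limited number of slots until its completion returns one round trip later. This is a sketch under simplifying assumptions: the function name and parameters are hypothetical, and serialization time on the link is ignored so that only the effect of the round trip remains visible.

```python
import heapq


def time_to_complete_reads(num_reads: int, max_outstanding: int,
                           round_trip_us: int) -> int:
    """Time at which the last read completion arrives, in microseconds."""
    if num_reads < 0 or max_outstanding < 1 or round_trip_us < 0:
        raise ValueError("invalid parameters")
    in_flight = []      # min-heap of completion times of outstanding reads
    now = 0             # time at which the next read can be issued
    finished_at = 0
    for _ in range(num_reads):
        if len(in_flight) == max_outstanding:
            # All slots are busy: wait for the earliest completion.
            now = max(now, heapq.heappop(in_flight))
        completion = now + round_trip_us
        heapq.heappush(in_flight, completion)
        finished_at = max(finished_at, completion)
    return finished_at
```

In this model, eight reads with four slots and a 100 μs round trip finish after 200 μs, because the second group of four cannot be issued until the first completions return. Writes, which need no feedback, would not incur this wait.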
In a typical architecture of a computing device, illustrated in FIG. 1B, a host central processing unit (CPU) 140 and a host memory 150 are connected to the PCIe root 110. In addition, to allow connectivity to at least one Universal Serial Bus (USB) device 160, an eXtensible Host Controller Interface (xHCI) host controller (referred to hereinafter as a “host controller” 170) is coupled to the PCIe bus 130 and the USB device 160. The eXtensible Host Controller Interface is a computer interface specification that defines a register-level description of a host controller for USB 1.x, 2.0, and 3.0 compatible devices. The communication between the host controller 170 and the PCIe root 110 is through a PCIe bus connection 130, and the connection between the USB device 160 and the PCIe root 110 is through the host controller 170 by means of a USB connection.
A typical host controller 170 supports asynchronous and periodic data transfers between a host memory and the USB device. The periodic data transfers include isochronous and interrupt transfers, while the asynchronous data transfers include “bulk” and control data transfers. The host controller 170 maintains the following operational rings: a) a command ring through which the software application executed by the host computer passes at least host controller related commands; b) an event ring through which command completion and asynchronous events are transferred to a software application; and c) a transfer ring through which the software application schedules the work items for a USB device 160 and transfers data between the host memory 150 and USB device 160.
Multiple command rings, event rings, and transfer rings can be maintained by the host controller 170. A ring is a circular queue of transfer request blocks (TRBs). A TRB is a data structure in the host memory 150 created by the software application. A TRB is used to transfer a single physically contiguous block of data between the host memory 150 and the host controller 170. The TRB includes a single data buffer pointer that points to the data in the host memory, the length of the data pointed to by the TRB, a TRB type, and control information.
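The TRB fields listed above can be sketched as a small Python data structure that packs into a fixed 16-byte record. The class name, the field widths, and the bit positions chosen for the type and the cycle bit are assumptions made for illustration and should be checked against the host controller interface specification before being relied upon.

```python
import struct
from dataclasses import dataclass

TRB_SIZE = 16  # assumed: 64-bit pointer + 32-bit status + 32-bit control


@dataclass(frozen=True)
class TransferRequestBlock:
    data_buffer_pointer: int  # address of a physically contiguous buffer
    length: int               # number of bytes at that address
    trb_type: int             # kind of work item
    cycle_bit: int = 0        # one bit of the control information

    def pack(self) -> bytes:
        """Serialize to little-endian bytes (illustrative layout)."""
        status = self.length & 0x1FFFF                       # assumed 17-bit length
        control = ((self.trb_type & 0x3F) << 10) | (self.cycle_bit & 1)
        return struct.pack("<QII", self.data_buffer_pointer, status, control)

    @classmethod
    def unpack(cls, raw: bytes) -> "TransferRequestBlock":
        """Rebuild a TRB from bytes produced by pack()."""
        pointer, status, control = struct.unpack("<QII", raw)
        return cls(pointer, status & 0x1FFFF, (control >> 10) & 0x3F, control & 1)
```

For example, `TransferRequestBlock(0x1000, 512, 1, 1).pack()` yields a 16-byte record that `unpack` converts back into an equal object.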
The TRBs are managed using Enqueue and Dequeue Pointers, which are initially set to the address of the first TRB location in the ring. The Enqueue Pointer is managed by the software application and the Dequeue Pointer is managed by the host controller 170. The software application places items in a transfer ring at the Enqueue Pointer, and the host controller 170 executes the respective items from the transfer ring at the Dequeue Pointer. A cycle bit field in a TRB identifies the location of the Enqueue Pointer in a respective ring. Upon completion of the transfer of a TRB, the length and status of the transfer may be reported in a transfer event TRB.
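The interplay of the two pointers and the cycle bit can be sketched as a fixed-size ring in Python. This is a simplified illustration, not the controller's actual mechanism: the class and method names are hypothetical, the ring wraps by index rather than through a dedicated link entry, and the producer is allowed to inspect the consumer's position directly in order to detect a full ring.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Slot:
    payload: Optional[str] = None
    cycle: int = 0  # all slots start at 0, so none looks valid to the consumer


class TransferRing:
    def __init__(self, size: int) -> None:
        if size < 2:
            raise ValueError("ring needs at least two slots")
        self.slots: List[Slot] = [Slot() for _ in range(size)]
        self.enqueue = 0         # managed by the software application
        self.dequeue = 0         # managed by the host controller
        self.producer_cycle = 1  # value written into newly queued slots
        self.consumer_cycle = 1  # value the consumer treats as "valid"

    def enqueue_trb(self, payload: str) -> bool:
        """Software side: place an item at the Enqueue Pointer."""
        head = self.slots[self.dequeue]
        if self.enqueue == self.dequeue and head.cycle == self.consumer_cycle:
            return False  # ring is full of unconsumed items
        self.slots[self.enqueue] = Slot(payload, self.producer_cycle)
        self.enqueue = (self.enqueue + 1) % len(self.slots)
        if self.enqueue == 0:
            self.producer_cycle ^= 1  # toggle on wrap-around
        return True

    def dequeue_trb(self) -> Optional[str]:
        """Controller side: execute the item at the Dequeue Pointer, if valid."""
        slot = self.slots[self.dequeue]
        if slot.cycle != self.consumer_cycle:
            return None  # cycle mismatch marks the Enqueue Pointer: ring empty
        self.dequeue = (self.dequeue + 1) % len(self.slots)
        if self.dequeue == 0:
            self.consumer_cycle ^= 1  # toggle on wrap-around
        return slot.payload
```

The consumer never reads the Enqueue Pointer itself. It stops at the first slot whose cycle bit differs from the value it expects, which is how the cycle bit identifies where the queued items end.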
In a typical PCIe bus architecture, the PCIe root 110 is directly coupled to the host controller 170. In fact, the PCIe root 110 and the host controller 170 are typically connected on the same circuit board. Thus, the link 130 is a wired electrical connection. The roundtrip time is usually very short, and therefore PCIe is not designed to operate properly under high latency. In contrast, a distributed peripheral interconnect bus connects a PCIe root and endpoints that are located remotely from each other. For example, such a distributed bus allows connectivity between a PCIe root and endpoints over a wireless medium.
When the link between the components of the PCIe bus is de-coupled, for example, to allow PCIe connectivity over a wireless medium, the latency of the link and the response time of the PCIe bus components are significantly increased. As a result, the performance of the bus, especially when performing read operations, is severely degraded. As an example, the performance of read operations as a function of the latency of the bus is illustrated in FIG. 3, which shows that when the latency of a PCIe bus is 0 microseconds (μs) the utilization of the bus is 100%, and when the latency is increased to 100 μs, the utilization of the PCIe bus is 30%.
Thus, it would be advantageous to provide a high performance interconnect bus that would allow efficient distributed connectivity.