Communication developments in the last decade have demonstrated what seems to be a migration from parallel data input/output (I/O) interface implementations to a preference for serial data I/O interfaces. Some of the motivations for preferring serial I/O over parallel I/O include reduced system costs through reduction in pin count, simplified system designs, and scalability to meet the ever-increasing bandwidth requirements of today's communication needs. Serial I/O solutions will most probably be deployed in nearly every electronic product imaginable, including IC-to-IC interfacing, backplane connectivity, and box-to-box communications.
Although the need for increased communication bandwidth continues to drive future designs, support for other communication attributes, such as reduced lane-lane skew for multiple lane communication systems and modes of operation with low or precisely known latency, remains important as well. As an example, the PCI-Express (PCIe) standard specifies that the lane-lane skew for a multiple lane transmission system is not to exceed approximately 2 unit intervals (UI), i.e., 2 bit periods, across any of the transmission lanes.
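By way of illustration only, the skew limit expressed in unit intervals can be converted to an absolute time once a line rate is fixed. The following minimal sketch assumes a line rate of 2.5 Gb/s, chosen purely for the arithmetic; the function names and the lane arrival times are hypothetical and do not come from any standard.

```python
def unit_interval_ps(line_rate_gbps):
    """Duration of one bit period (1 UI) in picoseconds."""
    return 1000.0 / line_rate_gbps


def skew_budget_ps(line_rate_gbps, max_skew_ui=2.0):
    """Lane-lane skew limit in picoseconds for a limit given in UI."""
    return max_skew_ui * unit_interval_ps(line_rate_gbps)


def lanes_within_budget(lane_arrival_ps, line_rate_gbps, max_skew_ui=2.0):
    """True if the spread between earliest and latest lane is within the limit."""
    skew = max(lane_arrival_ps) - min(lane_arrival_ps)
    return skew <= skew_budget_ps(line_rate_gbps, max_skew_ui)


# Hypothetical arrival times (ps) of the same bit on four lanes.
arrivals = [0.0, 350.0, 790.0, 120.0]
print(skew_budget_ps(2.5), lanes_within_budget(arrivals, 2.5))
```

At the assumed 2.5 Gb/s, one UI is 400 ps, so a limit of 2 UI corresponds to 800 ps of permitted spread between the earliest and latest lanes.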
In programmable logic device (PLD) implementations of multiple lane transmission systems, data is provided to multiple transmission lanes by the core, i.e., fabric, of the PLD. The fabric of the PLD also provides a fabric clock signal that is utilized to propagate the fabric data through the physical coding sublayer (PCS) of each transmission lane. The physical medium attachment (PMA) layer receives the fabric data from the PCS, serializes the fabric data, and then transmits the serialized fabric data to the receiving end of the transmission lane.
Since the clock signal used to clock fabric data into the serializer of the PMA is not necessarily phase aligned with the fabric clock signal that is used to clock fabric data through the PCS, a first-in, first-out (FIFO) is generally utilized to provide “elasticity” within each PCS/PMA pair. In particular, while the PCS clocks fabric data into the FIFO using the fabric clock signal, the fabric data is non-coherently clocked out of the FIFO into the PMA's serializer. As a result, the FIFO facilitates dependable data transfer from the PCS to the PMA despite the lack of timing coherency within each PCS/PMA pair, since the memory depth of the FIFO is generally effective to absorb the timing discrepancies that may exist between the PCS and the PMA.
Latency is added into the transmission lane, however, when “elastic” components such as a FIFO are used to ensure dependable PCS/PMA data transfer. Moreover, the amount of added latency is not precisely known. Elastic components, therefore, cannot be utilized when certain I/O standards, such as the Common Public Radio Interface (CPRI) standard, are implemented within the PLD, since the resulting latency of the transmission data path cannot be maintained or determined within specification.
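The dependence of FIFO latency on the unknown phase relationship between the write clock and the read clock may be illustrated with a toy model. The sketch below assumes that both clocks run at the same frequency, that the read side begins draining once a hypothetical start-up fill level is reached, and that time is counted in integer picoseconds; none of these parameters are taken from an actual PCS/PMA implementation.

```python
from collections import deque


def fifo_latency_ps(period_ps, read_phase_ps, start_fill=2, n_words=8):
    """Per-word latency (ps) through a toy elastic FIFO.

    Write edges occur at k * period_ps; read edges occur at
    read_phase_ps + k * period_ps, with 0 <= read_phase_ps < period_ps.
    """
    events = []
    for k in range(n_words + start_fill + 2):
        # The middle field makes a write sort before a read at the same instant.
        events.append((k * period_ps, 0, "w"))
        events.append((read_phase_ps + k * period_ps, 1, "r"))
    events.sort()

    fifo = deque()          # holds the write timestamp of each stored word
    written = 0
    started = False
    latencies = []
    for t, _, kind in events:
        if kind == "w":
            if written < n_words:
                fifo.append(t)
                written += 1
        else:
            # Reading starts only after the assumed start-up fill is reached.
            if not started and len(fifo) >= start_fill:
                started = True
            if started and fifo:
                latencies.append(t - fifo.popleft())
    return latencies


# Same FIFO, same clock period, three different (unknown in practice) phases.
for phase in (0, 100, 399):
    print(phase, set(fifo_latency_ps(400, phase)))
```

In this model the latency is constant within a run but shifts with the read clock phase, so two lanes, or two power-up cycles of the same lane, may differ by up to nearly one clock period without any indication to the fabric.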
Alternate PCS/PMA phase alignment techniques are, therefore, required, since in the absence of an elastic component, phase coherency between the PCS/PMA pair is not otherwise ensured for dependable data transfer. Conventionally, a phase aligner is utilized, whereby the fabric clock signal that is used to write fabric data into the PCS data register is phase aligned with the PMA clock signal that is utilized to transfer data from the PCS data register to the serializer of the PMA. Such methods to ensure phase coherency between the PCS/PMA pair have, therefore, simply removed the elastic components from each transmission lane.
As previously noted, another requirement of certain I/O standards is that lane-lane skew across multiple transmission lanes be maintained within specification, for example, not exceeding approximately 2 UI for the PCIe standard. In order to minimize lane-lane skew across multiple transmission lanes, a global clock tree is utilized, whereby leaf nodes of the global clock tree provide the fabric clock signal to each PCS. The clock tree is balanced, so that the fabric clock signals provided by each leaf node to each PCS may be substantially aligned to one another. By utilizing a phase aligner in each transmission lane, the respective PCS/PMA pairs may be made to be coherent with one another, while the global clock tree is utilized to minimize lane-lane skew across the multiple transmission lanes.
With conventional techniques for low latency modes of operation without a FIFO, however, failure modes may arise within the serial transmitter(s) when process, voltage, and temperature (PVT) and/or other modes of variation occur over time. In particular, while phase coherency between a PCS/PMA pair may be achieved at one PVT corner with appropriate phase alignment settings, phase coherency may be lost at another PVT corner due to non-uniform, PVT induced variation in clock phases between the PCS/PMA pair.
Stated differently, PVT variation over time may cause non-uniform changes in the substantial global clock tree that provides the fabric clock signal to each PCS, as compared with the smaller, localized clock generation and distribution within the PCS/PMA pair. As a result, previously calibrated phase coherency may fall outside of specified performance limits or reliable operating tolerance due to PVT variation over time.
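The effect described above may be sketched numerically. The example below assumes a simple linear model in which each clock path delay scales with temperature by the same fractional coefficient, so that the longer global clock tree shifts by more picoseconds than the short local path. The delay values, the coefficient, and the function names are all hypothetical and serve only to show why a one-time calibration may not hold.

```python
def delay_ps(nominal_ps, temp_coeff_per_c, temp_c, ref_temp_c=25.0):
    """Path delay under an assumed linear temperature dependence."""
    return nominal_ps * (1.0 + temp_coeff_per_c * (temp_c - ref_temp_c))


def residual_phase_error_ps(temp_c, cal_temp_c, tree_ps, local_ps, temp_coeff_per_c):
    """Phase error left after a one-time calibration performed at cal_temp_c.

    Calibration is assumed to cancel the tree-versus-local delay difference
    at cal_temp_c; any change in that difference afterwards is uncorrected.
    """
    at_cal = (delay_ps(tree_ps, temp_coeff_per_c, cal_temp_c)
              - delay_ps(local_ps, temp_coeff_per_c, cal_temp_c))
    now = (delay_ps(tree_ps, temp_coeff_per_c, temp_c)
           - delay_ps(local_ps, temp_coeff_per_c, temp_c))
    return now - at_cal


# Hypothetical values: 2000 ps global tree, 200 ps local path, 0.1 % per deg C.
error = residual_phase_error_ps(85.0, 25.0, 2000.0, 200.0, 0.001)
print(round(error, 3))
```

With these assumed values, a 60 degree rise leaves roughly 108 ps of uncorrected phase error, which would be a substantial fraction of a 400 ps clock period even though both paths drift by the same percentage.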
While continuous operation of the phase aligner within each transmission lane would achieve PCS/PMA coherency with minimized lane-lane skew across all PVT corners, conventional placement of the phase aligner within each transmission lane contemporaneously increases induced phase jitter beyond acceptable limits. Alternate placement of the phase aligner is effective to reduce phase jitter to within acceptable limits, but does not otherwise provide a viable multiple lane transmission architecture due to the inability to adequately control lane-lane skew.
Through duplication of phase alignment circuitry, therefore, PCS/PMA coherency with minimized lane-lane skew across all PVT corners may be achieved, while contemporaneously maintaining phase jitter to within acceptable limits. Such an implementation, however, disadvantageously increases required circuitry, which results in increased power consumption and increased semiconductor die area usage.
Efforts continue, therefore, to solve the lane-lane synchronization challenge of low latency, multiple lane communication systems without increasing phase jitter, power consumption, or semiconductor die area usage.