Synchronous digital circuits use a single timing reference to drive an entire circuit. However, synchronous clocking of large digital circuits or systems including multiple circuits can be expensive and complex as clock distribution becomes more and more difficult. Instead, large digital circuits are typically partitioned into subsystems. Each subsystem generates its own local clock. Data communication between different subsystems takes place asynchronously. If all local clocks are near a system clock, the circuits are called plesiochronous circuits. In such circuits, local clocks are typically within a few hundred parts per million (ppm) of the system clock.
An example of a plesiochronous circuit/system is a PCI Express transmitter/receiver pair interconnecting two PCI Express compliant devices. PCI Express is described in PCI Express Base Specification, Revision 1.0a, the contents of which are hereby incorporated by reference. As detailed therein, each PCI Express device has its own local clock. Devices exchange data by way of serial lines, referred to as lanes. Each lane carries a serial bit stream using a pair of differential signals. Several lanes are combined to form a PCI Express link.
A clock is embedded within each serial bit stream using the 8B/10B encoding scheme. The use of differential signals is advantageous because it offers better noise immunity from electromagnetic interference (EMI).
Embedding the clock in the data stream using 8B/10B allows a reduction in pin count as a dedicated clock line is avoided. However, as a separate clock is embedded in each lane, the lanes are not synchronized to each other, and the multiple receiver/transmitter pairs operate in a plesiochronous manner.
Advantageously, unlike in a synchronous link, the length of traces used by the lanes need not be matched. However, as with all plesiochronous circuits, the received (recovered) clock is not the same as the local clock at the receiver. The deviation of the recovered clock from the local clock is called clock jitter. The clock jitter must be compensated for in the receiver design.
To this end, recovery of the embedded clock signal from the received serial stream is typically achieved by using a phase-locked-loop (PLL) circuit that extracts the clock using the frequent 1-to-0 and 0-to-1 logic level transitions guaranteed by the 8B/10B coding.
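The transition density on which such clock recovery relies can be illustrated with a minimal sketch. The function names below are hypothetical, and the two bit patterns are the commonly cited 8B/10B encodings of the K28.5 control symbol for negative and positive running disparity; the sketch only measures run lengths and transitions and does not implement an 8B/10B encoder or a PLL.

```python
def max_run_length(bits: str) -> int:
    """Return the length of the longest run of identical bits in a bit string."""
    longest = 0
    current = 0
    previous = None
    for bit in bits:
        # Extend the current run, or start a new run on a level transition.
        current = current + 1 if bit == previous else 1
        longest = max(longest, current)
        previous = bit
    return longest


def count_transitions(bits: str) -> int:
    """Count 0-to-1 and 1-to-0 transitions between adjacent bits."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a != b)


# Commonly cited 10-bit encodings of K28.5 (assumed here for illustration).
K28_5_RD_NEG = "0011111010"
K28_5_RD_POS = "1100000101"

stream = K28_5_RD_NEG + K28_5_RD_POS
longest_run = max_run_length(stream)
transitions = count_transitions(stream)
```

For these two symbols the longest run of identical bits is five, so a clock recovery circuit never waits long for an edge from which to correct its phase.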
Once the embedded clock is recovered from the bit stream, the recovered clock (Rx Clock) may be used to write the transmitted bits into a plesiochronous elastic store (PES), an elastic buffer circuit that operates using separate clocks for writing to and reading from the buffer. Such an elastic buffer circuit allows a transmitter and a receiver using different clocks to exchange data smoothly. The recovered clock is used to write to the elastic buffer, while a local clock is used to clock the bits out of the elastic buffer (PES); from then on, the data is synchronized to the rest of the receiver circuit.
As the frequencies of the recovered clock and the local clock can differ by up to a few hundred ppm, the buffer may overflow if the clock used to write to the buffer (the recovered clock) is faster than the clock used to read from the buffer (the local clock), or underflow if it is slower.
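The rate at which such a mismatch accumulates can be estimated with simple arithmetic. The following is a minimal sketch, assuming a constant frequency offset expressed in ppm; the function names and the example offsets are illustrative and are not taken from the specification.

```python
def drift_in_symbols(symbols_elapsed: int, offset_ppm: float) -> float:
    """Symbols by which the write side leads or lags the read side.

    Assumes a constant frequency offset of offset_ppm parts per million
    between the recovered clock and the local clock.
    """
    return symbols_elapsed * offset_ppm / 1e6


def symbols_per_slip(offset_ppm: float) -> float:
    """Symbols that elapse before the two clocks drift apart by one symbol."""
    if offset_ppm <= 0:
        raise ValueError("offset_ppm must be positive")
    return 1e6 / offset_ppm


# Hypothetical example: a 500 ppm offset slips one symbol every 2000 symbols.
slip_interval = symbols_per_slip(500)
```

Under these assumptions, the buffer depth and the spacing of compensation opportunities in the data stream together determine whether an overflow or underflow can be avoided.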
Overflow and underflow are compensated for by “bit-stuffing”: the transmitter periodically sends clock compensation sequences within the data stream. The elastic buffer deletes these sequences when the buffer is about to become full. Conversely, the elastic buffer inserts clock compensation sequences if the buffer is about to run out of data. Clock compensation sequences are later removed from the data stream.
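The insertion and deletion behavior described above may be sketched in software as follows. This is a simplified, single-threaded model and not a hardware design: the class name, the `SKP` marker, and the `low_mark` and `high_mark` thresholds are hypothetical, each compensation sequence is modeled as a single symbol, and a compensation symbol is inserted whenever the fill level is low rather than only adjacent to an existing sequence.

```python
from collections import deque

SKP = "SKP"  # hypothetical marker standing in for one compensation sequence


class ElasticBuffer:
    """Toy elastic buffer: write() models the recovered-clock side,
    read() models the local-clock side."""

    def __init__(self, depth=8, low_mark=1, high_mark=4):
        self.fifo = deque()
        self.depth = depth
        self.low_mark = low_mark
        self.high_mark = high_mark

    def write(self, symbol):
        # Nearly full: delete an incoming compensation symbol by not storing it.
        if symbol == SKP and len(self.fifo) >= self.high_mark:
            return
        if len(self.fifo) >= self.depth:
            raise OverflowError("elastic buffer overflow")
        self.fifo.append(symbol)

    def read(self):
        # Nearly empty: insert a compensation symbol without consuming data.
        if len(self.fifo) <= self.low_mark:
            return SKP
        return self.fifo.popleft()


def strip_compensation(symbols):
    """Remove compensation symbols downstream of the buffer."""
    return [s for s in symbols if s != SKP]


buf = ElasticBuffer()
for sym in ["A", "B", "C", "D", SKP, "E"]:
    buf.write(sym)          # SKP arrives at the high mark and is dropped
received = [buf.read() for _ in range(6)]
payload = strip_compensation(received)
```

In this model the reader holds back `low_mark` symbols as a reserve and emits compensation symbols until the writer catches up, so the payload order is preserved once the compensation symbols are stripped.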
These requirements make the design of the elastic buffer complex. The elastic buffer needs to detect imminent underflows and overflows; detect clock compensation sequences; insert clock compensation sequences; delete clock compensation sequences; and perform related tasks. Moreover, these functions must be accomplished in two clock domains: one clock domain for the circuitry associated with writing to the buffer and a different one for reading from the buffer. This often leads to a complex state machine for the elastic buffer, which in turn requires many gates to implement and may consume more clock cycles in operation.
A PCI Express receiver must also address the problem of lane-to-lane synchronization, made necessary by what is referred to as lane skew, in addition to clock recovery and synchronization with the local clock. Specifically, when a multi-lane link is used to transmit data, symbols on different lanes arrive at different times, even when they are transmitted simultaneously, due to traces having different lengths, impedance differences, and the like. The receiver thus needs to recognize data that was transmitted simultaneously on different lanes, and align the lanes in order to reconstruct the transmitted packet accurately at the receiver.
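Lane alignment of this kind may be illustrated with a minimal sketch. It assumes that a known marker symbol (here the hypothetical string `"COM"`) is transmitted simultaneously on every lane and that payload symbols are striped round-robin across the lanes; both assumptions, and the function names, are for illustration only.

```python
def deskew(lanes, marker="COM"):
    """Align lanes on the first occurrence of the marker in each lane.

    lanes is a list of per-lane symbol lists. Raises ValueError if a
    lane does not contain the marker.
    """
    offsets = [lane.index(marker) for lane in lanes]
    # Discard the symbols that arrived before the marker on each lane.
    aligned = [lane[off:] for lane, off in zip(lanes, offsets)]
    # Trim to a common length so that every column is complete.
    length = min(len(lane) for lane in aligned)
    return [lane[:length] for lane in aligned]


def interleave(aligned):
    """Rebuild symbol order, assuming round-robin striping across lanes."""
    return [sym for column in zip(*aligned) for sym in column]


# Lane 1 arrives two symbol times later than lane 0.
lane0 = ["x", "COM", "a", "c", "e"]
lane1 = ["x", "x", "x", "COM", "b", "d", "f"]

aligned = deskew([lane0, lane1])
packet = interleave(aligned)
```

After alignment, symbols that were transmitted at the same time occupy the same column, and reading the columns in lane order recovers the original sequence.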
Accordingly, there remains a need for a simplified receiver capable of synchronizing incoming data streams with the local clock and accurately processing the received data.