1. Field of the Invention
The present invention broadly relates to high speed serial link communications systems, and more particularly, to an architecture, system, and method of re-synchronizing multiple serial link channels.
2. Description of the Prior Art
High bit-rate inputs/outputs (I/Os) become increasingly necessary for inter-chip, chip-to-chip, chip-to-system, board-to-board, and chassis-to-chassis signaling interfaces as the demand for off-chip signal bandwidth grows. Packaging technology limitations confine both the width of off-chip parallel busses and the number of external I/O pins. High-speed serial link communication is an alternative signaling approach to wide parallel data busses and increased I/Os. High-speed serial links are used for chip-to-chip, board-to-board and chassis-to-chassis connections. In chip-to-chip connections, both chips reside on the same board, and the distance between the two is usually less than 10 inches. In this case all high-speed input/output (I/O) cells are integrated inside the chip, which puts new requirements, such as low power and small die size, on the I/O cells.
High-speed serial link communication techniques multiplex and de-multiplex data onto and off of high-speed serial communication channels, thus reducing hundreds of parallel connections to a few serial connections. High-speed serial communication schemes have been widely adopted across the industry. For example, on Jul. 23, 2002 the Peripheral Component Interconnect Special Interest Group (PCI-SIG) approved the PCI Express™ Specification for High-Performance Serial I/O. Unlike PCI and PCI-X™, which are based on 32-bit and 64-bit parallel buses, respectively, the PCI Express™ specification uses high-speed serial link technology similar to that found in Gigabit Ethernet, Serial ATA (SATA), and Serial-Attached SCSI (SAS). PCI Express™ reflects an industry trend to replace legacy shared parallel buses with high-speed point-to-point serial links.
Conventional high speed serial link communication systems typically comprise one or more serial link transmitters, one or more serial link receivers, and a communication channel linking each transmitter/receiver pairing. For example, FIG. 1 illustrates a conventional high speed serial link communications system comprising four serial transmitters 112-118, four serial receivers 122-128 and communication links/channels 130 for coupling the transmitters to the receivers.
High-speed serial link transmitters serialize parallel data received from a chip and drive the serialized data onto a serial link. High-speed serial link receivers receive the transmitted signals from the serial link, recover an encoded clock signal and the serialized data from the received signals, and de-serialize the data. As such, a receiver must perform some form of equalization, clock recovery, data recovery, and de-serialization. The communication channels carry the serial data from the transmitters to the receivers.
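The serialize and de-serialize steps described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the 8-bit word width, the least-significant-bit-first ordering, and the function names `serialize` and `deserialize` are hypothetical choices, and equalization, clock recovery, and data recovery are not modeled.

```python
from typing import List


def serialize(words: List[int], width: int = 8) -> List[int]:
    """Flatten parallel words into one bit stream, least-significant bit first.

    The width and bit order are assumptions made for illustration only.
    """
    bits = []
    for word in words:
        for i in range(width):
            bits.append((word >> i) & 1)
    return bits


def deserialize(bits: List[int], width: int = 8) -> List[int]:
    """Regroup a bit stream into parallel words (the inverse of serialize)."""
    words = []
    # Only complete groups of `width` bits are converted back into words.
    for start in range(0, len(bits) - width + 1, width):
        word = 0
        for i in range(width):
            word |= bits[start + i] << i
        words.append(word)
    return words


# Round trip: two parallel bytes become 16 serial bits and back again.
stream = serialize([0xA5, 0x3C])
recovered = deserialize(stream)
```

In this sketch the receiver recovers the original words only if it starts regrouping at the correct bit boundary, which is one reason timing alignment matters on a real link.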
Both high-speed serial link transmitters and receivers include circuitry for controlling the timing of internal operations. Conventionally, such transmitters and receivers incorporate clock domains for distributing timing signals. Clock domains are known in the art to refer to a plurality of circuits, such as latches, flip-flops, and the like, which are controlled by the same or similar clock signals, having identical frequency and enable/disable control, and similar phase alignment (sufficient for setup and/or hold time interaction within the domain). Each clock domain includes one or more clock trees, Phase-Locked-Loop (PLL) circuits, clock repeaters, and the like.
Clock domains are typically extended simply by driving a clock signal from each of the aforementioned links to a physically centralized point, where each clock operates a separate latch that captures the central “sampling latch” output. In effect, this physically extends the internal clock domain of each link to the centralized point.
Timing circuitry contained in high-speed serial link transmitters and receivers also provides global timing synchronization for communications systems incorporating multiple high-speed serial links grouped together to form a parallel communication channel. Global timing control and synchronization is critical for multi-link communication systems in order to maintain data integrity (e.g., all system data is transmitted and received at an expected point in time, not one or more clock cycles early or late). For example, FIG. 1 illustrates a multi-link communications system 100 including four serial links grouped together to form a communications bus 130. Transmitters 112-118 transmit serial data streams onto the communication bus 130 and receivers 122-128 receive the transmitted data as previously described. Each transmitter and receiver can reside on a separate chip or can be grouped together as one or more cores. For example, transmitters 112-118 can each reside on a separate chip or can be grouped together as a core, that core residing in a single chip.
In a multi-link communications system such as system 100, parallel streams of serialized data transmitted over bus 130 have a specific timing relationship that must be maintained during transmission and reception in order to preserve data integrity. Thus, each communication channel must maintain a particular timing relationship to the other channels, otherwise data integrity may be compromised. As such, timing is critical not only for the internal operations of a particular serial link connection (transmitter, channel, and receiver timings), but timing is also critical for global synchronization between all channels in a multi-link communications system. Without global timing synchronization, system data integrity will be compromised.
PLLs are one component commonly used in multi-link communication systems for maintaining timing synchronization. PLLs synchronize the phase and frequency of a Voltage Controlled Oscillator (VCO) to an input reference clock. A PLL comprises a number of components that together achieve this phase alignment. A PLL compares the rising edge of a reference input clock to a feedback clock using a phase-frequency detector (PD). The PD produces an up or down signal that determines whether the VCO needs to operate at a higher or lower frequency. The PD output is applied to a charge pump and loop filter, which produces a control voltage for setting the frequency of the VCO. If the PD transitions to an up signal, then the VCO frequency will increase. If the PD transitions to a down signal, then the VCO frequency will decrease.
The loop filter converts these high and low signals to a voltage that is used to bias the VCO. If the charge pump receives a logic high on the up signal, current is driven into the loop filter. If the charge pump receives a logic high on the down signal, current is drawn from the loop filter. The loop filter filters out glitches from the charge pump and prevents voltage over-shoot, which minimizes VCO jitter. The voltage from the charge pump determines how fast the VCO operates. Divider and/or multiplier circuits can be inserted in the feedback loop to make the VCO frequency some multiple of the input reference frequency, making the VCO frequency output f_VCO = (m × f_REF)/n, where m is the divide ratio, n is the multiply ratio, and f_REF is the input reference frequency. Therefore, the feedback clock, which is applied to one input of the PD, is locked to the input reference clock, which is applied to the other input of the PD.
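As a worked instance of the frequency relation above, the following Python sketch evaluates the locked VCO frequency for given ratios. The function name `vco_frequency_hz` and the numeric values (a 100 MHz reference with m = 25) are assumptions chosen for illustration and do not come from any particular design. Because a divide-by-m circuit in the feedback loop forces the VCO to run m times faster for the feedback clock to match the reference, m appears in the numerator.

```python
from fractions import Fraction


def vco_frequency_hz(f_ref_hz: int, m: int, n: int) -> Fraction:
    """Locked VCO frequency for feedback divide ratio m and multiply ratio n.

    Implements f_VCO = (m * f_REF) / n exactly, using rational arithmetic.
    """
    if m <= 0 or n <= 0:
        raise ValueError("m and n must be positive integers")
    return Fraction(m * f_ref_hz, n)


# Illustrative values only: 100 MHz reference, feedback divide ratio of 25.
f_vco = vco_frequency_hz(100_000_000, m=25, n=1)       # 2.5 GHz
f_vco_half = vco_frequency_hz(100_000_000, m=25, n=2)  # 1.25 GHz
```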
When the phase relationship between clock signals is a factor, PLL resynchronization may be required. For example, PLL resynchronization is routinely required to resynchronize the various timing circuits contained within a multilink communication system in order to maintain data integrity. The timing circuitry contained within a multilink communication system may require resynchronization for a number of reasons, for example, noise, jitter, loss of PLL lock, link-to-link skew, clock skew, PLL phase error, etc.
PLL resynchronization resets a PLL and resynchronizes the PLL with an input reference clock. Typically, PLL resynchronization involves the assertion of an asynchronous resynchronization signal which causes the PLL to reset and resynchronize. For example, when the resynchronization signal is driven high, the PLL will reset its counters, clear its outputs, and lose lock. Once the resynchronization signal is driven low, the PLL lock process begins and the PLL will re-synchronize to the input reference clock. After the PLL re-locks, all output clocks will have the correct phase relationship.
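The reset-and-relock sequence described above can be pictured with a small behavioral model. The sketch below is a simplification under stated assumptions: the class `PllModel`, its `lock_cycles` parameter, and the idea that lock is reached after a fixed number of reference cycles are all hypothetical, and the asynchronous resynchronization signal is sampled once per reference cycle for simplicity.

```python
class PllModel:
    """Toy model of a PLL that re-locks a fixed number of cycles after reset."""

    def __init__(self, lock_cycles: int = 4) -> None:
        self.lock_cycles = lock_cycles  # assumed lock time, in reference cycles
        self.counter = 0
        self.locked = False
        self.output_enabled = False

    def tick(self, resync: bool) -> None:
        """Advance the model by one reference clock cycle."""
        if resync:
            # Resync high: reset counters, clear outputs, lose lock.
            self.counter = 0
            self.locked = False
            self.output_enabled = False
            return
        if not self.locked:
            # Resync low: the lock process runs until lock_cycles have elapsed.
            self.counter += 1
            if self.counter >= self.lock_cycles:
                self.locked = True
                self.output_enabled = True


pll = PllModel(lock_cycles=3)
for _ in range(3):
    pll.tick(resync=False)
locked_before = pll.locked       # lock reached after three cycles
pll.tick(resync=True)
locked_during = pll.locked       # resync high drops lock
```

A real PLL locks through analog loop dynamics rather than a cycle counter; the counter stands in for that settling time only.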
FIG. 2 illustrates a conventional multi-link communications system 200 that includes a global asynchronous resynchronization signal (RESYNC_IN) and centralized resynchronization sampling point for distributing the sampled resynchronization signal to the various transmitter circuits within the communications system. Communications system 200 comprises a plurality of high-speed serial link transmitters such as transmitter 210. More than one transmitter can be grouped together in a core as previously described. For example, cores 202, 204 and 206 can comprise one or more high-speed serial transmitters. Additionally, each core includes timing circuitry adapted to control the internal operations of a particular core (e.g. transmitter timings) and for global synchronization between all channels in a multi-link communications system. For example, cores 202, 204 and 206 each contain a PLL such as PLL 220 for timing purposes. Communications system 200 includes a plurality of communication channels 240 for conveying serial data streams from a transmit side of the communications system to a receive side.
As described previously, the timing circuitry of each core may require resynchronization for a variety of reasons. Some applications require multi-link configurations which group several links and cores into a bus, and impose limitations on the skew between these links. Due to the size and complexity of high-speed serial link designs (e.g., cores), such skew limitations can be very challenging when more than two or three cores are grouped together to form a multi-link communications system. Even with perfectly matched internal core timings, the uncertainty introduced when resynchronizing all core timing circuitry can result in unacceptable skew.
Conventional multi-link communication systems such as system 200 include a global asynchronous resynchronization signal (RESYNC_IN) which is applied to the timing circuitry of each core and serves to resynchronize each core. The RESYNC_IN signal is received, or captured, by each core. Double latches, such as double latch 230, are conventionally used to latch the RESYNC_IN signal because double latches provide metastability hardening. That is, double latch designs prevent unstable states caused by timing violations commonly associated with asynchronous signals (e.g. setup and/or hold time violations). Metastability manifests itself in a number of ways such as causing a latch to switch states, causing a latch not to switch states, causing a runt pulse, or causing oscillations at the output of a latch. The output of each double latch is transmitted to an n-input NAND gate 250 where n equals the number of double latches. The output of NAND gate 250 is the resynchronization signal (RESYNC) received by the timing circuitry in each core. RESYNC is used by each core to perform the timing resynchronization process previously described.
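The double-latch capture and NAND combination described above can be approximated with a cycle-level Python sketch. Several simplifications are assumed here and are not taken from the figures: all cores are stepped on a common cycle count, skew is expressed as the whole cycle on which each core first sees RESYNC_IN high (the hypothetical `arrival_cycles` list), metastability itself is not modeled, and the NAND output is treated as changing state (going low) once every double latch output is high.

```python
from typing import List, Optional


class DoubleLatch:
    """Two back-to-back latches: the first clock edge captures the input,
    the second clock edge moves it to the output."""

    def __init__(self) -> None:
        self.stage1 = 0
        self.stage2 = 0

    def clock(self, d: int) -> int:
        self.stage2 = self.stage1
        self.stage1 = d
        return self.stage2


def nand(inputs: List[int]) -> int:
    """n-input NAND: output is 0 only when every input is 1."""
    return 0 if all(inputs) else 1


def first_resync_cycle(arrival_cycles: List[int], n_cycles: int = 16) -> Optional[int]:
    """Cycle on which the NAND output first goes low, or None if it never does."""
    latches = [DoubleLatch() for _ in arrival_cycles]
    for t in range(n_cycles):
        outs = [latch.clock(1 if t >= a else 0)
                for latch, a in zip(latches, arrival_cycles)]
        if nand(outs) == 0:
            return t
    return None


aligned = first_resync_cycle([0, 0, 0])  # every core sees RESYNC_IN on cycle 0
skewed = first_resync_cycle([0, 2, 1])   # one core sees it two cycles late
```

In this model the combined output is set entirely by the slowest core: the aligned case switches on cycle 1, while the skewed case waits until cycle 3.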
Multi-link communications systems such as the kind illustrated in FIG. 2 impose link-to-link skew limitations requiring precise control of the RESYNC signal assertion timing to each link, such that all links respond on the same reference clock cycle. If all links do not respond to the RESYNC signal on the same reference clock cycle, data integrity will be compromised. RESYNC signal skew, reference clock skew, and PLL static phase error variations introduced by multiple serial link cores may cause one or more of the links not to respond to the RESYNC signal on the same reference clock cycle. PLL static phase error is the time difference between the averaged input reference clock and the averaged feedback input signal when the PLL is in locked mode.
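The same-cycle requirement described above can be illustrated numerically. In the sketch below, each link is assumed to respond on the first edge of its own clock at or after RESYNC arrives, with clock edges at k × period + offset; the offset lumps together reference clock skew and PLL static phase error. The function names, the 1000 ps period, and the per-link numbers are hypothetical values chosen only to show the effect.

```python
from typing import Dict, Tuple


def capture_cycle(arrival_ps: int, clock_offset_ps: int, period_ps: int) -> int:
    """Index k of the first clock edge at or after RESYNC arrives at a link.

    Clock edges are assumed to occur at k * period_ps + clock_offset_ps.
    """
    # Integer ceiling division avoids floating-point rounding.
    return -(-(arrival_ps - clock_offset_ps) // period_ps)


def links_agree(links: Dict[str, Tuple[int, int]], period_ps: int) -> bool:
    """True when every link responds on the same reference clock cycle."""
    cycles = {capture_cycle(a, o, period_ps) for a, o in links.values()}
    return len(cycles) == 1


PERIOD_PS = 1000  # illustrative reference clock period

# (RESYNC arrival time, clock edge offset), both in picoseconds.
matched = {"link0": (900, 0), "link1": (900, 50)}
mismatched = {"link0": (900, 0), "link1": (900, 50), "link2": (1050, 0)}

ok = links_agree(matched, PERIOD_PS)      # both links respond on cycle 1
bad = links_agree(mismatched, PERIOD_PS)  # link2 slips to cycle 2
```

Here 150 ps of additional RESYNC skew on one link is enough to push its response a full reference cycle later than the others.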
Known solutions to RESYNC signal skew, reference clock skew, and PLL static phase error variations are constrained by timing limitations resulting from the physical size and placement of the serial link cores involved, rendering them useful only in very minimal configurations. For example, as illustrated in FIG. 2, the conventional system 200 suffers from three timing-constrained paths. First, NAND gate 250 provides the RESYNC signal to the cores only when the last (i.e. slowest) of the sampled resynchronization signals is received from all the double latches. Thus, the global resynchronization architecture is very sensitive to skew and core placement, which must be precisely addressed during the design process; the solution is both layout- and design-dependent. Second, skew associated with the RESYNC signal as it propagates from NAND gate 250 to the various cores creates further timing constraints and must also be precisely addressed during the design process; this solution is likewise layout- and design-dependent. Third, as described previously, the core-to-core PLL static phase error difference imposes a further timing constraint.
Therefore, there exists a need in the art for a robust global timing resynchronization architecture and multi-link communications systems including the same that minimize the effects of resynchronization signal skew, reference clock skew, and PLL static phase error variations on resynchronization of multi-link communication systems.