Synchronization in data communication is often composed of several levels, with the highest levels involving methods like correlation and the lowest levels involving clock and data recovery (CDR). The lowest levels of synchronization occur first and often dictate the quality of synchronization available at the highest levels; thus, it is desirable to ensure high-quality, efficient data synchronization at the lowest levels.
Clock generation and distribution is a significant challenge in the design of large and complex digital systems. Ordinarily, clock generation and distribution can be well controlled in small systems and subsystems. For instance, synchronous signaling, where the data signal timing is related to a single timing reference (i.e., global reference), can be used for practically all critical high-speed signals. FIG. 1 illustrates a basic conventional topology for a synchronous system.
Synchronous clocking systems, such as the Synchronous Optical Network (SONET), use a single timing model. Originally developed for transmission of telecommunications signals such as voice, SONET is now the prevalent transport infrastructure for the wide area network (WAN) backbone. The primary benefit of the synchronous transmission and multiplexing hierarchy defined by SONET is that multiple data streams at the defined rates can be combined (multiplexed) into a higher rate stream without bit stuffing, and can be extracted without demultiplexing the entire higher rate stream. Although the SONET method works well and has been in use for large backbone telecom networks, it is an expensive and complex system.
FIG. 2 illustrates a conventional topology for a plesiochronous system, where, unlike the synchronous system, each subsystem is designed to have its own local clock generation and distribution. A plesiochronous system is defined as one where the local clocks operate at approximately the same frequency, such that the difference in frequency between any two subsystems is bounded to a small value. For example, Infiniband™ networks are designed such that the local references are within +/−100 ppm of the ideal timing reference. To accommodate the differences in data periods due to the frequency differences, the conventional plesiochronous system employs bit stuffing techniques, where special bits are either added or deleted to adjust the rate of an incoming data stream to the frequency of the system receiving the data stream. With continued reference to FIG. 2, digital subsystem 1 transmits at a frequency of Fc1. Subsystem 2 has a clock rate of Fc2, which is close to, but not exactly equal to, Fc1. In order for subsystem 2 to use the data at its clock rate of Fc2, frequency compensation must be performed on the Fc1 data stream. Thus, the intra-system interfaces require data transmitted synchronously from one subsystem to be retimed or synchronized to the local reference in the receiving subsystem.
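As a rough illustration of why frequency compensation is required, the following Python sketch estimates how quickly two free-running local clocks drift apart by one full bit period. The 2.5 Gb/s bit rate and all function names are assumptions chosen for illustration and do not come from the system described above. The worst case is approximated as twice the per-clock tolerance, i.e., one reference at +100 ppm and the other at −100 ppm.

```python
def worst_case_offset_ppm(tolerance_ppm):
    """Approximate largest frequency difference between two clocks that are
    each within +/- tolerance_ppm of the ideal reference."""
    return 2.0 * tolerance_ppm


def bits_per_slip(offset_ppm):
    """Number of bits after which the accumulated drift equals one bit period."""
    return 1e6 / offset_ppm


def slips_per_second(bit_rate_hz, offset_ppm):
    """Bit periods gained or lost per second if nothing compensates the offset."""
    return bit_rate_hz / bits_per_slip(offset_ppm)


# Illustrative values only: +/-100 ppm references and an assumed 2.5 Gb/s link.
offset = worst_case_offset_ppm(100.0)   # 200.0 ppm
print(bits_per_slip(offset))            # 5000.0 bits between slips
print(slips_per_second(2.5e9, offset))  # 500000.0 slips per second
```

Under these assumptions, an uncompensated receiver would gain or lose a bit roughly every 5000 bits, which is the discrepancy that bit stuffing or retiming has to absorb.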
FIG. 3 illustrates a conventional data synchronization architecture for each subsystem in the plesiochronous system of FIG. 2. As shown, a basic function of the subsystem is to retime the received data and provide a clock (local reference) aligned to that data for further digital processing.
Phase locked loops (PLLs) and delay locked loops (DLLs) are common systems used in the I/O interfaces of data communication systems. In these applications, the PLL and DLL closely track the input clock and help to improve overall system timing. However, the rising demand for high-speed I/O has created an increasingly noisy environment in which the PLL and DLL must function. Noise tends to cause the output clocks of the PLL and DLL to jitter from their ideal timing. With a shrinking tolerance for jitter in the decreasing period of the output clock, the design of low jitter PLLs and DLLs has become challenging. To reduce PLL jitter, the loop bandwidth should be set as high as possible. Unfortunately, design tradeoffs often constrain the PLL bandwidth to be well below the lowest operating frequency for stability reasons. These constraints can cause the PLL to have a narrow operating frequency range and poor jitter performance. Although a typical DLL is based on a delay line and thus is simpler from a control perspective, it can have a limited delay range, which leads to a set of problems similar to that of the PLL.
One attempt at improving clock and data recovery in a plesiochronous system is illustrated in FIG. 4. A clock-data recovery (CDR) 400 includes a dual loop configuration having a PLL 402 and a DLL 406. PLL 402 is configured in a conventional manner having a phase frequency detector (PFD) 403, a loop filter 412, and a voltage controlled oscillator (VCO) 414. A local reference is detected at PFD 403 and filtered through loop filter 412. Typically, loop filter 412 is configured as a wideband loop for suppressing the VCO phase noise below the loop bandwidth. VCO 414 is configured to generate an oscillating signal at a frequency proportional to the local reference by using a frequency divider (not shown) at the input of PFD 403.
DLL 406 includes a phase detector 407, which receives the incoming data, a digital loop filter 408, and a phase shifter 409. Digital loop filter 408 may be configured as a wideband loop to track input jitter. Phase shifter 409 may be, for example, an infinite range phase shifter, typically implemented as a multi-phase selector, and provides an input to decision circuit 410. Phase shifter 409 provides a variable phase shift of the phase shifter input such that a clock may be generated having phase and frequency components that can be varied relative to the VCO output. Decision circuit 410 receives the output of phase shifter 409 and provides an output consisting of retimed data. Generally, decision circuit 410 includes a high speed comparator or D-flip-flop that allows detection of small amplitude signals and regenerates the signals to normal amplitude by reclocking the input.
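The behavior of a wrap-around phase selector tracking a frequency offset can be sketched in Python as follows. This is a simplified behavioral model under stated assumptions and is not the circuit of FIG. 4: the phase detector is modeled as bang-bang (only the sign of the error is used), the digital loop filter is reduced to an up/down vote counter, and the 32 phases, vote threshold of 8, and 200 ppm offset are hypothetical values.

```python
def wrap_ui(phase):
    """Wrap a phase expressed in unit intervals (UI) into [-0.5, 0.5)."""
    return (phase + 0.5) % 1.0 - 0.5


def track_offset(offset_ppm, n_bits, n_phases=32, threshold=8):
    """Simulate a bang-bang loop steering a wrap-around phase selector.

    Returns (net_steps, max_abs_error_ui): the signed number of selector
    steps taken and the largest phase error observed, in UI.
    """
    drift_per_bit = offset_ppm * 1e-6  # UI the data phase advances per bit
    data_phase = 0.0   # incoming data phase relative to the local clock, UI
    selector = 0       # index of the currently selected clock phase
    votes = 0          # up/down counter standing in for the digital loop filter
    net_steps = 0
    max_abs_error = 0.0

    for _ in range(n_bits):
        data_phase = (data_phase + drift_per_bit) % 1.0
        error = wrap_ui(data_phase - selector / n_phases)
        max_abs_error = max(max_abs_error, abs(error))

        # Assumed bang-bang detector: only the sign of the error is used.
        if error > 0:
            votes += 1
        elif error < 0:
            votes -= 1

        # Step the selector once enough votes agree, then clear the counter.
        if votes >= threshold:
            selector = (selector + 1) % n_phases  # modulo wrap: unlimited range
            net_steps += 1
            votes = 0
        elif votes <= -threshold:
            selector = (selector - 1) % n_phases
            net_steps -= 1
            votes = 0

    return net_steps, max_abs_error


# 20,000 bits at 200 ppm is about 4 UI of accumulated drift.
net_steps, max_error = track_offset(200.0, 20_000)
print(net_steps / 32, max_error)
```

Because the selector index wraps modulo the number of phases, the loop can keep rotating in one direction indefinitely, which is what allows the recovered clock to differ in frequency from the VCO output. In this model the selector should complete roughly four full rotations while the phase error stays within about one selector step.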
While this dual loop configuration may offer some advantages over the single loop systems, for example, individual loop optimization, the DLL bandwidth must be very wide to accommodate the input jitter. In CDR systems, it is often desirable to have a wide bandwidth; however, in other applications, this is not always the case. For example, a wide bandwidth CDR is not desirable if the output clock jitter must be kept low. The output clock jitter's relationship to the input clock jitter is the jitter transfer function. A low bandwidth jitter transfer function typically allows a lower clock jitter to be generated. This is because the high frequency jitter is reduced by the lowpass filtering properties of the jitter transfer function.
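The lowpass behavior of the jitter transfer function can be illustrated with a short Python sketch. It assumes a first-order lowpass response, and the 10 MHz jitter frequency and the two loop bandwidths are hypothetical values chosen only to contrast a narrow loop with a wide one.

```python
import math


def jitter_transfer_gain(jitter_freq_hz, loop_bw_hz):
    """Magnitude of an assumed first-order lowpass jitter transfer function."""
    return 1.0 / math.sqrt(1.0 + (jitter_freq_hz / loop_bw_hz) ** 2)


def to_db(gain):
    """Convert a magnitude ratio to decibels."""
    return 20.0 * math.log10(gain)


# How much 10 MHz input jitter reaches the output clock?
narrow = jitter_transfer_gain(10e6, 1e6)   # narrow 1 MHz loop
wide = jitter_transfer_gain(10e6, 10e6)    # wide 10 MHz loop
print(round(to_db(narrow), 1))  # about -20.0 dB
print(round(to_db(wide), 1))    # about -3.0 dB
```

Under these assumptions, the narrow loop passes about a tenth of the 10 MHz jitter to the output clock, while the wide loop passes about 70 percent of it. The wide loop, however, is the one that can follow the input jitter, which is the source of the tradeoff.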
If several retiming operations occur in the CDR system, a substantial amount of jitter can be introduced when the jitter transfer function of each retimer is such that the output jitter exceeds the input jitter. This typically occurs in a PLL-based CDR due to peaking in the jitter transfer function caused by the second order nature of the system, i.e., two integrators: one in the PLL loop filter and one in the VCO. DLL-based CDRs, such as CDR 400, generally exhibit little or no jitter peaking because they are first order systems. However, CDR 400 sustains a performance tradeoff in selection of the loop bandwidth, i.e., optimizing the jitter tolerance versus optimizing the jitter transfer.
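The difference in jitter peaking between second order and first order loops can be checked numerically with the Python sketch below. It uses the textbook closed-loop response of a second-order loop with two integrators and a stabilizing zero, which is an assumed model and is not taken from the system described above. The damping factor of 0.707 and the chain of 10 retimers are likewise hypothetical.

```python
import math


def pll_gain_sq(x, zeta):
    """|H|^2 of an assumed second-order, two-integrator loop,
    H(s) = (2*zeta*wn*s + wn^2) / (s^2 + 2*zeta*wn*s + wn^2),
    evaluated at normalized frequency x = w / wn."""
    num = 1.0 + (2.0 * zeta * x) ** 2
    den = (1.0 - x * x) ** 2 + (2.0 * zeta * x) ** 2
    return num / den


def dll_gain_sq(x):
    """|H|^2 of a first-order loop at normalized frequency x = w / w_bw."""
    return 1.0 / (1.0 + x * x)


def peaking_db(gain_sq, points=4001):
    """Largest gain in dB over a logarithmic sweep of x from 0.01 to 100."""
    peak = max(
        gain_sq(10.0 ** (-2.0 + 4.0 * i / (points - 1))) for i in range(points)
    )
    return 10.0 * math.log10(peak)


ZETA = 0.707      # assumed damping factor
N_RETIMERS = 10   # assumed number of cascaded retimers

pll_peak = peaking_db(lambda x: pll_gain_sq(x, ZETA))
dll_peak = peaking_db(dll_gain_sq)
print(round(pll_peak, 2))               # roughly 2.1 dB of peaking
print(round(dll_peak, 2))               # essentially 0 dB, no peaking
print(round(N_RETIMERS * pll_peak, 1))  # peaking in dB adds across a cascade
```

With these assumed values, each second-order retimer amplifies jitter near its natural frequency by about 2 dB, so a chain of ten identical retimers would amplify it by roughly 21 dB, while the first-order response never exceeds unity gain.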
Accordingly, an improved system and method for data synchronization in a plesiochronous system is needed. Specifically, a system and method for improved data serialization and retiming having minimum jitter generation (e.g., wide loop bandwidth) and maximum jitter suppression (e.g., narrow loop bandwidth) is desired. In addition, a plesiochronous system and method for data recovery and retiming is needed which does not require bit stuffing.