Over the last decade, engineers have steadily increased the data rates of high-speed serial I/O implementations across various industry standards to satisfy growing demands in enterprise and consumer applications. Current products target data rates of 8-10 gigabits per second (Gb/s). Receiver clock and data recovery architectures at such data rates are often complex and challenging to design. Architectures are evolving to improve both performance and power efficiency. Two categories of prior art receiver architectures are widely used in industry: clock and data recovery with Alexander phase detection and double rate sampling, as illustrated in Prior Art FIG. 1, and symbol rate timing recovery using the Mueller-Müller principle, as illustrated in Prior Art FIG. 2.
The receiver architecture utilizing Alexander type timing recovery implements double rate sampling as illustrated in FIG. 1. It typically consists of a high-gain, high-bandwidth CTLE (continuous time linear equalizer amplifier), two phase interpolators (PI), data and edge samplers, and digital CDR (clock data recovery) blocks. Differential input signals RXP and RXN are amplified by the CTLE, and its outputs (outp and outn) drive both the data samplers and the edge samplers. The two phase interpolators mix PLL input clocks and generate an in-phase clock (clki) for the data sampler to quantize the CTLE outputs at the center of the unit interval (UI), and a quadrature clock (clkq) for the edge sampler to quantize the CTLE outputs at the transition instant of the unit interval (UI). Matching clock buffers (CLKBUF) are inserted to build clock trees for clki and clkq respectively. The samplers typically use both rising and falling edges of the clock to capture the CTLE outputs, in which d0 is the even bit data sample at the clki rising edge, d1 is the odd bit data sample at the clki falling edge, e0 is the even bit edge sample at the clkq rising edge, and e1 is the odd bit edge sample at the clkq falling edge. The data samplers further generate a digital clock (clk) that is bundled with d0, d1, e0, and e1 to drive the downstream logic. The digital CDR block implements Alexander phase detection using the data and edge samples, and the phase error output drives a digital loop filter. The loop filter outputs are encoded into phase interpolator DAC (digital to analog converter) control codes (pi_dac1 and pi_dac2) to adjust the clock phases of the two phase interpolators respectively. Receiver architectures utilizing Alexander type timing recovery have been used extensively in the industry; however, they face growing design challenges as data rates keep increasing, because the 2× sampling requirement of Alexander phase detection becomes less power efficient.
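By way of illustration only, the Alexander phase detection performed on the data and edge samples can be sketched in Python as follows. This is a minimal sketch and not the circuit of FIG. 1: the function names alexander_pd and phase_votes, the convention that +1 means the sampling clock is early and -1 means it is late, and the assumption that each edge sample lies between two consecutive data samples are all hypothetical choices made for the example.

```python
def alexander_pd(d_prev, edge, d_curr):
    """Bang-bang (Alexander) phase vote from three binary samples.

    d_prev, d_curr: consecutive data samples (0 or 1).
    edge: edge sample taken between them (0 or 1).
    Returns +1 (clock early), -1 (clock late), or 0 (no usable transition).
    The +1/-1 polarity is an assumption made for this sketch.
    """
    early = edge ^ d_curr   # edge still equals the previous bit
    late = d_prev ^ edge    # edge already equals the next bit
    return early - late


def phase_votes(data, edges):
    """Apply the detector along a stream.

    edges[i] is assumed to be sampled between data[i] and data[i + 1].
    """
    return [alexander_pd(data[i], edges[i], data[i + 1])
            for i in range(len(edges))]


if __name__ == "__main__":
    # Hypothetical sample stream; the votes would feed a digital loop filter.
    print(phase_votes([0, 1, 1, 0], [0, 1, 1]))
```

Every vote requires an edge sample in addition to the data samples, which is the 2× sampling requirement discussed above.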
Symbol rate timing recovery using Mueller-Müller principles has been gaining traction in recent serial I/O receiver architectures. Typically such receivers consist of a high-gain, high-bandwidth CTLE (continuous time linear equalizer amplifier) with automatic gain control (AGC), a DFE (decision feedback equalizer, of either the voltage or the current integration type) with data and error samplers, and clock-data-recovery circuits based on a Mueller-Müller algorithm.
The Mueller-Müller CDR algorithm requires only one sample per unit interval (UI), compared to the double sampling (at the center and the edge of the UI) required by “data-edge” based CDRs, and therefore needs fewer samplers and is more area and power efficient. Such an architecture using a CTLE with AGC, a DFE, a Mueller-Müller CDR algorithm, and least-mean-square (LMS) based optimization is illustrated in FIG. 2. The differential input signals RXP and RXN are amplified by the CTLE, and its outputs (outp and outn) drive the decision feedback equalizer (DFE). A single phase interpolator mixes PLL input clocks and generates an in-phase clock output (clki). The clock clki is distributed through a clock tree network (CLKBUF) for the samplers to quantize the equalizer outputs (vxp and vxn) into data and error bits. The samplers further generate a digital clock (clk) that is bundled with the data and error bits for the downstream logic. The digital CDR block implements the Mueller-Müller algorithm using the data and error bits, and the phase error output drives a digital loop filter. The loop filter outputs are encoded into phase interpolator DAC control codes (pi_dac1) to adjust the clock phase of the phase interpolator. Meanwhile, the digital LMS block utilizes the same data and error bits to adjust the gain of the CTLE through control signal AGCCoef and the equalization through control signal DFECoef. The optimization mechanism is designed to minimize the mean squared voltage error of the DFE differential output (vxp-vxn) against the reference voltage level of the error samplers. In summary, both the Mueller-Müller CDR loop and the AGC-DFE loop converge concurrently based on equalized samples that come out of the DFE, resulting in CDR and DFE loop interactions. This architecture may be referred to as “equalized MM CDR”. The inherent loop interactions can negatively impact the performance and stability of this equalized MM CDR receiver architecture.
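By way of illustration only, the shared use of the data and error bits by the CDR loop and the LMS loop can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of FIG. 2: data and error bits are represented as +1/-1, the phase error uses a commonly described sign-sign form of the Mueller-Müller timing function, and the names quantize, mm_phase_error, agc_step, ref, and mu, as well as the polarity of the gain update, are hypothetical choices made for the example.

```python
def quantize(x, ref):
    """Model one data sampler and one error sampler (hypothetical helper).

    x: equalized differential sample (vxp - vxn).
    ref: reference level of the error sampler.
    Returns (d, e): data bit and error bit, each +1 or -1, where e is the
    sign of (x - d * ref).
    """
    d = 1 if x >= 0 else -1
    e = 1 if (x - d * ref) >= 0 else -1
    return d, e


def mm_phase_error(d_prev, e_prev, d_curr, e_curr):
    """Sign-sign Mueller-Muller timing function for two consecutive symbols.

    Returns a value in {-2, 0, +2}. Its average estimates the difference
    between the first post-cursor and the first pre-cursor; the mapping of
    its sign to early or late is implementation dependent.
    """
    return e_curr * d_prev - e_prev * d_curr


def agc_step(gain, d, e, mu):
    """Sign-sign LMS gain update (assumed polarity).

    d * e > 0 means the sample magnitude exceeds ref, so gain is reduced.
    """
    return gain - mu * d * e


if __name__ == "__main__":
    ref, mu, gain = 1.0, 0.01, 1.0
    d0, e0 = quantize(0.9, ref)
    d1, e1 = quantize(-1.2, ref)
    # The same (d, e) pairs feed both the timing loop and the gain loop.
    print(mm_phase_error(d0, e0, d1, e1), agc_step(gain, d1, e1, mu))
```

Because both update rules consume the same error bits, a change made by one loop alters the error statistics seen by the other, which is one way to picture the loop interaction described above.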