1. Technical Field
The present invention relates to equalization techniques for high-speed data communications that involve Decision Feedback Equalizers (DFE) and more specifically to implementations of speculative DFEs using CMOS circuits.
2. Description of the Related Art
High-speed data interconnects used in modern computing systems and data communication routers currently operate at data rates that usually exceed the bandwidth of a physical channel used for data transmission. Therefore, such communications require the use of channel equalization, i.e., compensation for signal distortions caused by finite channel bandwidth. These distortions are known as Inter-Symbol Interference (ISI).
The most common technique used for equalization of high-loss channels (e.g., 20-30 dB high frequency attenuation) is known as a decision-feedback equalizer (DFE). The critical advantage of DFE over regular linear filters is its ability to flatten the channel response (and hence reduce signal distortion) without amplifying noise or crosstalk.
In a DFE, the previously received bits are fed back with weighted tap coefficients and added to the received input signal using circuits known as summing amplifiers. If the magnitudes and polarities of the tap weights are properly adjusted to match the channel characteristics, the ISI from the previous bits in the data stream will be cancelled, and the bits can be detected by a data slicer (a circuit that determines whether a signal is above or below a given threshold) with a low bit error rate (BER). The adjustment of the tap weights can be performed either manually or automatically by an appropriate adaptive algorithm.
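The feedback operation described above can be sketched as a small behavioral model. This is an illustration only, not a circuit implementation: the function names (`channel`, `slicer`, `dfe`), the two-tap channel, and the noise-free signal are all assumptions made for the example, and the tap weights are assumed to match the channel exactly. Subtracting `h * decision` in the code is equivalent to adding a feedback term whose tap polarity is set to cancel the ISI.

```python
def channel(bits, taps):
    """Hypothetical channel with post-cursor ISI only:
    y[n] = x[n] + sum over k of taps[k-1] * x[n-k]."""
    out = []
    for n, x in enumerate(bits):
        y = float(x)
        for k, h in enumerate(taps, start=1):
            if n - k >= 0:
                y += h * bits[n - k]
        out.append(y)
    return out


def slicer(v, threshold=0.0):
    """Decide whether the signal is above (+1) or below (-1) the threshold."""
    return 1 if v > threshold else -1


def dfe(received, taps):
    """Direct (non-speculative) DFE: previous decisions, weighted by the
    tap coefficients, are combined with the input before slicing."""
    decisions = []
    for n, y in enumerate(received):
        z = y
        for k, h in enumerate(taps, start=1):
            if n - k >= 0:
                z -= h * decisions[n - k]  # cancel ISI from bit n-k
        decisions.append(slicer(z))
    return decisions


if __name__ == "__main__":
    bits = [1, -1, -1, 1, 1, -1, 1, -1, -1, -1, 1]
    taps = [0.7, 0.4]  # assumed h1, h2 for illustration
    rx = channel(bits, taps)
    print("slicer only:", [slicer(v) for v in rx])
    print("with DFE:   ", dfe(rx, taps))
```

With these assumed tap values the ISI is strong enough that slicing the raw received signal produces errors, while the DFE output reproduces the transmitted bits.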
A major challenge in the design of a DFE operating at very high data rates (multiple gigabits per second) is ensuring that the feedback signals have sufficiently low latency to allow the slicer input to settle accurately before the next data decision is made. If a full-rate DFE architecture is used, the feedback loop delay (including the decision-making time of the slicer and the analog settling time of the DFE summing amplifiers) needs to be less than one data unit interval (UI), i.e., less than one period of a full-rate clock. If one switches to a half-rate architecture (with the associated doubling of the clock period to 2 UI), the requirement on circuit latency is not relaxed, as there is still only one UI available to establish the feedback from the previously detected bit, weighted by the first tap coefficient (denoted as h1).
A common technique used to relax the latency requirement of a DFE is known as speculation or loop unrolling. In this approach, both +h1 and −h1 tap weights are added to the input signal using two identical summing amplifiers. Since (for binary data transmission) the previous bit can only have two different values, one of these DC offsets added to the input signal represents the correct compensation of the ISI due to the previous bit. The outputs of the two summing amplifiers are applied to two identical slicers to produce two tentative data decisions. Once the previous bit is known, the data decision corresponding to the correct polarity of h1 compensation is selected with a 2:1 multiplexer (MUX).
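The selection between the two tentative decisions can be illustrated with a one-tap behavioral sketch. The names `speculative_dfe` and `direct_dfe`, the `initial_bit` parameter, and the noise-free one-tap model are assumptions introduced for this example. It only shows that the unrolled structure produces the same decisions as direct feedback, not how either would be built in hardware.

```python
def slicer(v):
    """Binary decision against a zero threshold."""
    return 1 if v > 0.0 else -1


def speculative_dfe(received, h1, initial_bit=-1):
    """One-tap loop-unrolled DFE: both offsets are applied in parallel and
    the previous decision only steers a 2:1 selection."""
    prev = initial_bit
    out = []
    for y in received:
        d_if_prev_plus = slicer(y - h1)   # summer/slicer assuming previous bit = +1
        d_if_prev_minus = slicer(y + h1)  # summer/slicer assuming previous bit = -1
        prev = d_if_prev_plus if prev == 1 else d_if_prev_minus  # 2:1 MUX
        out.append(prev)
    return out


def direct_dfe(received, h1, initial_bit=-1):
    """Reference one-tap DFE with dynamic feedback of the previous decision."""
    prev = initial_bit
    out = []
    for y in received:
        prev = slicer(y - h1 * prev)
        out.append(prev)
    return out
```

In this model the two static offsets are computed without waiting for the previous decision, so the only operation that depends on the previous bit is the MUX selection.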
Since the h1 compensation is implemented as multiple DC offsets (static taps) instead of a dynamically changing feedback signal, analog settling time requirements for the first DFE feedback tap are eliminated, while the next tap (h2) can have 2 UIs of latency, i.e., the maximum latency limit is doubled. Note that the bit controlling the MUX must still arrive within 1 UI, but this latency requirement is “digital”, i.e., it does not involve analog settling processes that require high accuracy. Therefore, the speculative DFE technique replaces a critical analog loop with 1 UI latency (h1 loop) with a combination of an analog loop with 2 UI latency (h2 loop) and a digital loop with 1 UI latency (MUX select loop), whose timing requirements are substantially easier to satisfy.
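The latency budgets in this comparison reduce to simple arithmetic, sketched below. The function names and the delay figures in the usage note are hypothetical, chosen only to illustrate the 1 UI and 2 UI limits. They are not measurements of any circuit.

```python
def unit_interval_ps(data_rate_gbps):
    """Duration of one unit interval in picoseconds for a binary link."""
    return 1000.0 / data_rate_gbps


def loop_closes(loop_delay_ps, budget_ui, data_rate_gbps):
    """True if a feedback loop's total delay fits within its UI budget."""
    return loop_delay_ps < budget_ui * unit_interval_ps(data_rate_gbps)


def direct_dfe_timing_ok(h1_loop_ps, data_rate_gbps):
    """Direct DFE: the analog h1 loop (slicer decision plus summer settling)
    must close within 1 UI."""
    return loop_closes(h1_loop_ps, 1, data_rate_gbps)


def speculative_dfe_timing_ok(mux_select_loop_ps, h2_loop_ps, data_rate_gbps):
    """Speculative DFE: the digital MUX select loop has 1 UI and the
    analog h2 loop has 2 UI."""
    return (loop_closes(mux_select_loop_ps, 1, data_rate_gbps)
            and loop_closes(h2_loop_ps, 2, data_rate_gbps))
```

For example, at an assumed 10 Gb/s the unit interval is 100 ps, so a hypothetical analog loop delay of 120 ps would fail the direct h1 budget but would fit the 2 UI budget of the h2 loop.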
The reason for particular attention to the latency of DFE feedback loops is that in many designs this latency becomes the primary limitation on the maximum clock speed of the overall circuit. This is an important issue with DFE circuits implemented with CMOS logic as opposed to more conventional current-mode logic (CML), because while CMOS circuits operate with clock speeds comparable to CML and provide substantial savings in power and area, they have generally higher latency than CML parts.
To reduce the latency of DFE feedback loops, designers try to minimize the total number of stages within the loop. State-of-the-art CML DFE implementations use a three-stage CML circuit to convert the analog signal from the output of the summers to a valid digital DFE output. For example, in a recently proposed speculative DFE, a set of summers is followed by three stages of CML circuits: a master latch, a 2:1 multiplexer and a slave latch. However, even a three-stage topology poses substantial challenges for efficient implementation with CMOS logic, and therefore an even lower-latency two-stage implementation of this function in CMOS logic is highly desirable.