In serial transmission systems operating at high bit rates over standard PC boards or coaxial cables, data receivers may receive significantly distorted signals. Inter-symbol interference (ISI) generated by limited bandwidth, reflections due to impedance mismatches, and other limitations of the transmission media increase the probability of erroneously recognizing a received bit. For these reasons, it becomes necessary to place, at the receiver input, a circuit to recover the signal before sending it to a re-sampler. Otherwise, the signal arriving at the sampler could be affected by amplitude reduction (vertical eye closure) and/or by timing jitter (horizontal eye closure), as depicted in FIG. 1.
Inside the receiver, a clock and data recovery block (CDR) reconstructs the clock timing so that the received data is re-sampled correctly, ideally at the middle of the “eye.” However, horizontal (timing) and vertical (amplitude) degradation of the eye negatively affects the capability of the CDR to correctly recover the incoming signal (bit). In fact, as a consequence of the timing jitter and amplitude reduction suffered by the transmitted data pulse signal, the CDR is required to position the sampling clock at the center of the eye with adequately enhanced precision, while also being sensitive to small-amplitude signals.
A typical serial transmission chain is shown in FIG. 2. A linear equalizer is usually placed at the input of the receiver and implements a frequency transfer function that matches the inverse of the transfer function H(s) of the transmission channel. If such a match is achieved, the aperture of the eye is improved, both horizontally and vertically.
As the operating frequency increases, such a linear equalizer, acting as a high-pass filter that matches the inverse of the transfer function of the transmission channel, may be unable to provide sufficient compensation of the channel frequency losses.
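The matching condition can be illustrated numerically. The following is a minimal sketch, assuming a first-order low-pass channel and a one-zero, one-pole equalizer; the pole frequencies, the bit rate, and all function names are hypothetical and are not taken from FIG. 2.

```python
import math

# Illustrative values only; none of them come from the description.
CHANNEL_POLE_HZ = 1.0e9   # assumed -3 dB bandwidth of the channel
EQ_POLE_HZ = 20.0e9       # assumed parasitic pole that bounds the equalizer boost
NYQUIST_HZ = 2.5e9        # half the bit rate of an assumed 5 Gb/s link


def channel(f_hz):
    """Assumed first-order low-pass channel response H(j*2*pi*f)."""
    return 1.0 / complex(1.0, f_hz / CHANNEL_POLE_HZ)


def equalizer(f_hz):
    """Assumed equalizer: its zero is placed on the channel pole."""
    return complex(1.0, f_hz / CHANNEL_POLE_HZ) / complex(1.0, f_hz / EQ_POLE_HZ)


def to_db(h):
    """Magnitude of a complex response in decibels."""
    return 20.0 * math.log10(abs(h))


channel_loss_db = to_db(channel(NYQUIST_HZ))
equalized_loss_db = to_db(channel(NYQUIST_HZ) * equalizer(NYQUIST_HZ))
print(f"channel alone:       {channel_loss_db:.2f} dB")
print(f"channel + equalizer: {equalized_loss_db:.2f} dB")
```

Under these assumptions the channel alone attenuates the Nyquist frequency by more than 8 dB, while the cascade of channel and equalizer is nearly flat there. A real channel rarely reduces to a single pole, which is one reason the match may become inadequate at higher frequencies.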
As a result, a different technique of equalization, known as decision feedback equalization (DFE), is implemented between the linear equalizer and the re-sampler. DFE may even completely substitute traditional linear equalization.
FIG. 3a shows an example of the degradation of a unitary pulse (a pulse whose amplitude is 1 volt and whose duration is one bit unit interval (UI)) caused by a limited bandwidth and other limitations of the transmission channel. The resulting pulse has a lower peak value and a longer duration. Considering the transmission channel as a linear system, a generic received signal can be seen as the superposition of individual pulses of positive or negative polarity, each shaped as shown in FIG. 3a, depending on whether positive or negative bits are transmitted. An example superposition of pulse amplitudes at any sampling instant for a train of adjacent data pulses of the same amplitude and sign as received is shown in FIG. 3b.
If the receiver is assumed to be correctly sampling each bit of the received data pulse signal at its pulse peak (C0 or cursor value), the postcursor amplitude values of the pulse tails of the bits preceding the bit being sampled, as well as the precursor amplitude values of the successive bits as received, add to the cursor value as an ISI contribution to the sampled amplitude of the incoming signal.
The known DFE technique is based on the principle that, because the previous data bits are known, their contribution to the ISI on the incoming data bit may be determined and cancelled by subtracting from the incoming data bit a quantity equal to the ISI produced on it.
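The superposition described above can be sketched in a few lines. This is an illustrative model only: the cursor, postcursor, and precursor values and the function name `sampled_amplitude` are assumptions chosen for the example, not values read from FIG. 3a or FIG. 3b.

```python
def sampled_amplitude(bits, n, cursor, postcursors, precursors=()):
    """Amplitude at the sampling instant of bit n, for bits coded as +1/-1.

    postcursors[i-1] is the tail of the pulse i bit periods after its peak,
    so it weights the bit sent i periods BEFORE bit n.
    precursors[i-1] is the leading edge i bit periods before the peak,
    so it weights the bit sent i periods AFTER bit n.
    """
    total = cursor * bits[n]
    for i, c in enumerate(postcursors, start=1):
        if n - i >= 0:
            total += c * bits[n - i]
    for i, c in enumerate(precursors, start=1):
        if n + i < len(bits):
            total += c * bits[n + i]
    return total


# Hypothetical pulse response: C0 = 0.6, two postcursors, one precursor.
bits = [+1, +1, -1, +1]
value = sampled_amplitude(bits, 2, cursor=0.6,
                          postcursors=(0.25, 0.10), precursors=(0.05,))
print(value)
```

In this example the negative bit at index 2 would ideally be sampled at -0.6, but the tails of the two preceding positive bits and the leading edge of the following positive bit pull the sampled amplitude toward zero, which is the vertical eye closure discussed earlier.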
A DFE uses sampled values (bn) and respective sampling errors (en) to estimate channel-dependent coefficients (ci) that multiply with the corresponding previous bits, and subtracts the results from the incoming data bit. An exemplary implementation of a DFE using four coefficients is shown in FIG. 4.
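The subtraction performed by the DFE can be modeled behaviorally as follows. This is a sketch under stated assumptions: the coefficients are taken as already known, the decision levels are +vth and -vth with vth = 1, and the channel values and the name `dfe_decide` are hypothetical rather than taken from FIG. 4.

```python
def dfe_decide(samples, coeffs, vth=1.0):
    """Bit decisions after subtracting coeffs[i] * (i+1)-th previous decision."""
    history = [0.0] * len(coeffs)   # most recent decision first
    decisions = []
    for x in samples:
        corrected = x - sum(c * b for c, b in zip(coeffs, history))
        b = vth if corrected >= 0 else -vth
        decisions.append(b)
        history = [b] + history[:-1]
    return decisions


# Hypothetical channel: cursor 0.6, postcursors 0.5 and 0.2 (no precursor).
sent = [1.0, 1.0, -1.0, 1.0, -1.0, -1.0, 1.0]
received = []
for n, b in enumerate(sent):
    x = 0.6 * b
    if n >= 1:
        x += 0.5 * sent[n - 1]
    if n >= 2:
        x += 0.2 * sent[n - 2]
    received.append(x)

with_dfe = dfe_decide(received, coeffs=(0.5, 0.2))
without_dfe = dfe_decide(received, coeffs=(0.0, 0.0))
print(with_dfe == sent, without_dfe == sent)
```

With coefficients equal to the postcursor values, every corrected sample reduces to the cursor contribution alone and all bits are recovered. With the coefficients set to zero, the third bit is misread because the accumulated postcursor ISI exceeds the cursor amplitude.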
The value bn is provided by a comparator COMP1 that checks whether its input is positive or negative and produces a signal bn whose amplitude is set to +vth or −vth, according to the input signal polarity. A second comparator COMP2 compares the input and the output of the comparator COMP1 for providing error information to an estimator (LMS) of the coefficients ci. In a practical implementation, the comparator COMP1 may not be present because it can be seen as part of the sampling flip-flop FF1. In this case, for the generation of the sampling error information (en) the input and the output of the flip-flop FF1 can be directly monitored by any circuit adapted to perform the logical function of the comparator COMP2. Typically, Least Mean Squares (LMS) algorithms are employed to estimate the coefficients ci and find the best set of coefficients ci that minimizes the mean square error en between the value of the expected bits (+/− a certain threshold vth) and the received bits.
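One common form of the LMS update adds to each coefficient a small step proportional to the product of the error and the corresponding previous decision. The sketch below assumes that form, a noise-free channel whose cursor amplitude equals vth, and a hypothetical step size `mu`; the description does not specify which LMS variant the estimator uses.

```python
import random


def lms_adapt(samples, num_taps, mu=0.01, vth=1.0):
    """Estimate DFE coefficients with a basic LMS update (illustrative)."""
    coeffs = [0.0] * num_taps
    history = [0.0] * num_taps                       # most recent decision first
    for x in samples:
        corrected = x - sum(c * b for c, b in zip(coeffs, history))
        decision = vth if corrected >= 0 else -vth   # role of COMP1
        error = corrected - decision                 # role of COMP2: e_n
        # Residual ISI from a previous bit correlates with that bit,
        # so the step moves each coefficient toward the true postcursor.
        coeffs = [c + mu * error * b for c, b in zip(coeffs, history)]
        history = [decision] + history[:-1]
    return coeffs


# Hypothetical channel: cursor 1.0 (equal to vth), postcursors 0.3 and 0.1.
rng = random.Random(0)
sent = [rng.choice((-1.0, 1.0)) for _ in range(5000)]
received = [
    sent[n]
    + (0.3 * sent[n - 1] if n >= 1 else 0.0)
    + (0.1 * sent[n - 2] if n >= 2 else 0.0)
    for n in range(len(sent))
]
estimated = lms_adapt(received, num_taps=2)
print([round(c, 3) for c in estimated])
```

In this noise-free setting the estimates settle on the postcursor values used to build the test signal. With noise, or with a cursor amplitude that differs from vth, the error would not vanish and the coefficients would fluctuate around their optimum by an amount that depends on the step size.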
Whether a single estimated coefficient is used (the simplest implementation, with a single correction tap) or several coefficients are used (a more refined implementation, with several correction taps) for enhanced ISI cancellation, correct behavior of a DFE circuit in terms of data recovery requires that the first (or only) correction, by the first estimated coefficient c1, be effected before the next bit is sampled. To satisfy this requirement, the DFE feedback path for the coefficient c1 cannot have a signal propagation delay greater than the bit period (Tbit). Usually the propagation delay is smaller than the bit period. Often, receivers use a half-rate clock, meaning that the frequency of the clock, which is generally recovered from the incoming data bit stream, is half the bit rate of the transmitted data pulse signal, and both rising and falling edges are used to sample the incoming data.
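The timing requirement on the first tap amounts to a simple budget check. The sketch below uses an assumed 5 Gb/s bit rate and hypothetical delay contributions; the names and values are for illustration only and do not describe any particular circuit.

```python
def first_tap_budget(bit_rate_hz, delays_s):
    """Return (Tbit, total feedback delay, remaining margin), in seconds."""
    t_bit = 1.0 / bit_rate_hz
    total = sum(delays_s.values())
    return t_bit, total, t_bit - total


BIT_RATE_HZ = 5.0e9                        # assumed bit rate
HALF_RATE_CLOCK_HZ = BIT_RATE_HZ / 2.0     # clock frequency in a half-rate receiver

# Hypothetical contributions to the c1 feedback path.
delays = {
    "sampler_clock_to_output": 60e-12,
    "multiplier_and_summer": 70e-12,
    "sampler_setup": 30e-12,
}

t_bit, total, margin = first_tap_budget(BIT_RATE_HZ, delays)
print(f"Tbit = {t_bit * 1e12:.0f} ps, path = {total * 1e12:.0f} ps, "
      f"margin = {margin * 1e12:.0f} ps")
```

A positive margin means the c1 correction settles before the next sampling instant. Note that halving the clock frequency does not relax this budget, because a new bit is still sampled every Tbit on alternating clock edges.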
Since the DFE corrects the incoming bit on account of the ISI of a single previous bit or of several previous bits, a DFE implementation as shown in FIG. 4 would necessarily be a full-rate system.
The DFE can be adapted to a half-rate clocking scheme of the receiver by using a multiplexer that selects which of the two samples (the data sampled by the rising clock edge and the one sampled by the falling clock edge) has to be alternately used as a previous bit (precursor bit) to be multiplied by the ci coefficient before being subtracted from the input bit (cursor bit), as with the exemplary circuit of FIG. 5.
The flip-flops FF1 and FF3 provide a sampled value of their input at the rising edge of the clock, while the flip-flops FF2 and FF4 provide a sampled value of their input at the falling edge of the clock. The multiplexers (2:1) select their input 1 on the high level of the clock, and their input 2 on the low level of the clock.
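The alternating selection can be modeled for a single tap as follows. This is a behavioral sketch, not a description of FIG. 5: it assumes that even-numbered bits are sampled on rising edges by FF1, odd-numbered bits on falling edges by FF2, and that multiplexer input 1 is wired to FF1 and input 2 to FF2. The channel values are hypothetical.

```python
def half_rate_dfe(samples, c1, vth=1.0):
    """Single-tap DFE with two interleaved samplers and a 2:1 multiplexer."""
    ff1 = 0.0   # output of the rising-edge sampler
    ff2 = 0.0   # output of the falling-edge sampler
    decisions = []
    for n, x in enumerate(samples):
        rising_edge = (n % 2 == 0)
        # Just before a rising edge the clock is low, so the multiplexer
        # passes input 2 (FF2); just before a falling edge the clock is
        # high, so it passes input 1 (FF1).
        previous_bit = ff2 if rising_edge else ff1
        corrected = x - c1 * previous_bit
        bit = vth if corrected >= 0 else -vth
        if rising_edge:
            ff1 = bit
        else:
            ff2 = bit
        decisions.append(bit)
    return decisions


# Hypothetical channel: cursor 0.6, one postcursor of 0.7.
sent = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, 1.0, -1.0]
received = [0.6 * sent[n] + (0.7 * sent[n - 1] if n >= 1 else 0.0)
            for n in range(len(sent))]

recovered = half_rate_dfe(received, c1=0.7)
uncorrected = half_rate_dfe(received, c1=0.0)
print(recovered == sent, uncorrected == sent)
```

Each sampler only sees every other bit, yet the multiplexer always presents the most recently decided bit to the c1 multiplier, so the correction is the same as in a full-rate loop.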
In this description, the clock ck of the multiplexers has been depicted as being the same clock of the flip-flops. However, it is possible to have a difference between the clock of the multiplexers and the clock of the flip-flops without changing the basic concept.
To reduce the propagation delay of the first DFE correction tap c1, the circuit implementation of FIG. 5 may be modified as shown in FIG. 6. The data to be multiplexed is provided to sign_C1 from the master latches LATCH1 and LATCH2 of the flip-flops FF1 and FF2. Optionally, the data is further amplified by a buffer stage LIMITING.
Applying the same concept described above for the sign_C1, the timing path for sign_C3 can be improved according to the architecture shown in FIG. 7, where the C3 tap multiplexer input data are the outputs of the master latches (LATCH5 and LATCH6) of the flip-flops FF3 and FF4 of the traditional DFE architecture of FIG. 5.
Because the data L5out and L6out come from a cascade of three regenerative latch stages, the amplifying stages LIMITING before the multiplexer inputs are not required, though they could nevertheless be added. This implementation can be generalized to any number of DFE taps simply by adding the same number of pairs of latches in the shift register, together with the respective multiplexers.
The use of a clocked DFE, with either a full-rate or a half-rate recovered clock signal, simplifies synchronization of the previous-bit correction with the incoming bit. However, this implies that the propagation delays of the flip-flops (that is, of the latches that compose them) and possibly of the multiplexers contribute to the overall feedback delay of the first tap.
Alternative techniques for implementing FIR filters without using a synchronization clock are well known and are used in high frequency applications. For example, reference is directed to the techniques disclosed in the article by H. Wu, J. Terno et al., “Differential 4-tap and 7-tap Transverse Filters in SiGe for 10 Gb/s Multimode Fiber Optic Link Equalization”, IEEE ISSCC dig. of tech. papers, February 2003.
A DFE that includes, in the feedback path, a FIR filter not synchronized by a clock is depicted in FIG. 8, which illustrates a single-tap DFE as an example. A delay element with a nominal delay corresponding to one bit period is inserted in the feedback path. This ensures the correct timing of the correction of the ISI superposed on the bit to be sampled in the received pulse signal.
Published patent application U.S. 2006/0239341 discloses a DFE in which the feedback signal has a continuous time waveform, and is obtained using a filter in the feedback path having a transfer function representing the reciprocal of the transfer function of the transmission channel. This alternative of a DFE operating in the continuous time domain is compatible with both a DFE synchronized by a clock and a DFE not synchronized by a clock.