FIG. 1 provides a high-level abstraction of a portion of a computer server or system, where microprocessor 102 resides on board 104 and communicates with memory 106 on board 108. The communication is by way of striplines on backplane 110. Backplane 110 is connected to boards 104 and 108 by connectors 112. Not shown in FIG. 1 are other memory units and microprocessors, where the various microprocessors and memory units may communicate with one another so as to access or write data and instructions.
Communication of signals over backplane 110 may be modeled by transmission line theory. Often, the signaling is based upon differential signaling, whereby a single bit of information is represented by a differential voltage. For example, FIG. 2a shows drivers 202 and 204 driving transmission lines 206 and 208, respectively. For differential signaling, drivers 202 and 204 drive their respective transmission lines to complementary voltages. Typical curves for the node voltages at nodes n1 and n2 for a bit transition are provided in FIG. 2b, where the bit transition is indicated by a dashed vertical line crossing the time axis. The information content is provided by the difference in the two node voltages.
For short-haul communication, such as for the computer server in FIG. 1, the signal-to-noise ratio is relatively large. If the transmission lines are linear, time-invariant systems having a bandwidth significantly greater than that of the transmitted signal, then a relatively simple receiver architecture may be employed to recover the transmitted data. Such a receiver is abstracted by comparator 210, which provides a logic signal in response to the difference in the two received voltages at ports 212 and 214.
However, every transmission line has a finite bandwidth, and for signal bandwidths that are comparable to or exceed the transmission line (channel) bandwidth, intersymbol interference may present a problem. Furthermore, actual transmission lines may have dispersion, whereby different spectral portions of a signal travel at different speeds. This may result in pulse spreading, again leading to intersymbol interference. As a practical example, for high data rates such as 10 Gbps (gigabits per second), the transmission lines used with backplanes or motherboards are such that intersymbol interference is present.
Channel equalization is a method in which one or more filters are employed to equalize the channel to help mitigate intersymbol interference. These filters may be sampled-data (discrete-time) filters, where a time index t is a discrete variable, or they may be continuous-time filters, where the time index is a continuous variable. Many channel equalizers are realized by a Finite Impulse Response (FIR) filter employed at the receiver. An FIR filter is a sampled-data filter. FIG. 3 is an abstraction of an FIR filter structure, where x(t) is the input (received) signal to the FIR filter and z(t) is the filtered output. In the case of differential signaling, x(t) and z(t) may be viewed as representing differential signals. The filter impulse response is represented by an n-dimensional vector denoted as h, where the filter weights are the components of this vector and are denoted by [h]_i, i = 0, 1, . . . , n−1. Multipliers 302 and summer 304 may be realized in the analog or digital domain. For the embodiments described in these letters patent, multipliers 302 and summer 304 are realized in the analog domain.
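By way of illustration only, the multiply-and-sum structure of FIG. 3 may be sketched in software as follows. The sketch assumes the usual FIR relation z(t) = [h]_0 x(t) + [h]_1 x(t−1) + . . . + [h]_(n−1) x(t−n+1), and assumes that samples before the start of the record are zero. The function names and the numeric values are hypothetical and are not taken from any embodiment; the embodiments themselves realize the multipliers and summer in the analog domain.

```python
def fir_output(h, x, t):
    """Return z(t) = sum over i of h[i] * x(t - i).

    Assumption: samples with an index before the start of x are zero.
    """
    return sum(h[i] * x[t - i] for i in range(len(h)) if 0 <= t - i < len(x))


def fir_filter(h, x):
    """Apply the filter weights h to every sample of the received signal x."""
    return [fir_output(h, x, t) for t in range(len(x))]


# Hypothetical three-tap filter and a short received sequence.
h_example = [0.5, 0.25, -0.125]
x_example = [1.0, 0.0, 0.0, 2.0]
z_example = fir_filter(h_example, x_example)
print(z_example)
```

The first three outputs reproduce the filter weights themselves, because the first input sample acts as a unit impulse.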
In many adaptive equalization schemes, the filter vector is updated during a training time interval, and then remains fixed for some period of time. During training, a known sequence is transmitted over a communication channel to the receiver, and the filter vector is synthesized during the training time interval. Many algorithms have been developed to synthesize the filter vector. The well-known LMS (Least Mean Square) algorithm is an iterative technique based upon the method of steepest descent (gradient) to minimize a squared error.
The LMS algorithm may be written as the following iterative procedure performed during the training time interval: h(t+1) = h(t) + μ[Kd(t) − z(t)] x(t), where the vector x(t) is an n-dimensional received data vector with components given by [x(t)]_i = x(t−i) for i = 0, 1, . . . , n−1, μ is a positive weight determining the filter “memory” or “window size” and may be viewed as the step-size in the steepest descent method, d(t) represents the known transmitted data during the training time interval (the training sequence), and K is a positive scale factor. The above iteration is performed during a training time interval t = 1, . . . , T, where an initial h(0) is chosen and at the end of the training time interval, the filter vector h is set equal to h(T), i.e., h = h(T).
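The LMS iteration above may be sketched as follows, as an illustration only and not as a description of any particular embodiment. The sketch assumes a zero initial filter vector unless h0 is given, zero-valued samples before the start of the record, and a hypothetical two-tap channel x(t) = d(t) + 0.5 d(t−1) with a pseudo-random bipolar training sequence. The function name, the channel, the step size, and the sequence length are all assumptions made for the example.

```python
import random


def lms_train(x, d, n, mu, K=1.0, h0=None):
    """Run h(t+1) = h(t) + mu * [K*d(t) - z(t)] * x(t) over the training record.

    x  : received samples, d : known training sequence (same length as x)
    n  : number of filter weights, mu : step size, K : positive scale factor
    Assumption: samples before the start of x are zero.
    """
    h = list(h0) if h0 is not None else [0.0] * n
    for t in range(len(x)):
        # Received data vector with components [x(t)]_i = x(t - i).
        xv = [x[t - i] if t - i >= 0 else 0.0 for i in range(n)]
        z = sum(hi * xi for hi, xi in zip(h, xv))   # filtered output z(t)
        err = K * d[t] - z                          # error term K*d(t) - z(t)
        h = [hi + mu * err * xi for hi, xi in zip(h, xv)]
    return h


# Hypothetical channel with mild intersymbol interference.
rng = random.Random(0)
d_train = [rng.choice((-1.0, 1.0)) for _ in range(5000)]
x_recv = [d_train[t] + (0.5 * d_train[t - 1] if t > 0 else 0.0)
          for t in range(len(d_train))]

h_trained = lms_train(x_recv, d_train, n=4, mu=0.01)
print([round(w, 3) for w in h_trained])
```

For this assumed channel the leading weights are expected to settle near 1 and −0.5, which approximates the inverse of the channel response. A smaller μ would reduce weight fluctuation at the cost of a longer training time interval.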
In many analog implementations the filter weights assume discrete values limited to some fixed range, and the scale factor K takes into account this finite range of the filter weights as well as practical implementations of the filtering. For example, for bipolar differential signaling the known training sequence d(t) may take on either of the values VCC or −VCC, where VCC is a supply voltage, but in practice the filtered output z(t) is always smaller in magnitude than the supply voltage. In this case, K<1.
To simplify the computations needed to synthesize the filter vector, the so-called sign-sign-LMS algorithm has been used: h(t+1) = h(t) + μ sgn{Kd(t) − z(t)} sgn{x(t)}, where sgn{ } is the sign function, applied component by component to the vector x(t). The sign-sign-LMS algorithm, although widely used due to its simplicity of implementation, has been found to have some undesirable properties when used in adaptive high-speed equalizers with relatively short to moderate word lengths (e.g., four to six bits) for the filter weights. Because the probability of filter weight update during adaptation is high, there is a significant amount of residual noise in the filter weights, even after convergence of the algorithm. This residual noise may be reduced by choosing a longer window (smaller μ), but this increases the convergence time, or in other words, the training time interval. Another disadvantage found in many instances is that the converged filter weights are relatively sensitive to the scale factor K, whose optimum value has been found to be difficult to determine.
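The sign-sign-LMS iteration may likewise be sketched as follows, for illustration only. The sketch assumes that sgn{0} is taken as 0 (a convention not stated above; a hardware comparator would typically resolve to +1 or −1), that samples before the start of the record are zero, and that the filter weights are unquantized floating-point values rather than four-bit to six-bit words. The function names and the numeric values are hypothetical.

```python
def sgn(v):
    """Sign function: +1, -1, or 0. Treating sgn(0) as 0 is an assumption."""
    return (v > 0) - (v < 0)


def sign_sign_lms_train(x, d, n, mu, K=1.0, h0=None):
    """Run h(t+1) = h(t) + mu * sgn{K*d(t) - z(t)} * sgn{x(t)}.

    Only the signs of the error and of each received sample are used,
    so each weight changes by +mu, -mu, or 0 on every iteration.
    """
    h = list(h0) if h0 is not None else [0.0] * n
    for t in range(len(x)):
        xv = [x[t - i] if t - i >= 0 else 0.0 for i in range(n)]
        z = sum(hi * xi for hi, xi in zip(h, xv))   # filtered output z(t)
        e = sgn(K * d[t] - z)                       # sign of the error
        h = [hi + mu * e * sgn(xi) for hi, xi in zip(h, xv)]
    return h


# Two iterations on hypothetical data, traced by hand in the comments.
# t=0: xv=[2, 0],  z=0,     error sign=+1 -> h=[0.25, 0.0]
# t=1: xv=[-3, 2], z=-0.75, error sign=-1 -> h=[0.5, -0.25]
h_two_steps = sign_sign_lms_train([2.0, -3.0], [1.0, -1.0], n=2, mu=0.25)
print(h_two_steps)
```

In this sketch every weight whose input sample is nonzero moves by a full step of μ whenever the error is nonzero, regardless of how small the error is, which is consistent with the residual noise in the filter weights described above.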