Data transmission across high-speed chip-to-chip interconnects may take a number of forms. One example of a data transmission system 10 between high-speed components within a single semiconductor device or between two devices on a printed circuit board is represented in FIG. 1. In FIG. 1, a transmitter 12 (e.g., a microprocessor) sends data over one or more transmission channels 14a-14c (e.g., copper traces “on-chip” in a semiconductor device or on a printed circuit board) to a receiver 16 (e.g., another microprocessor or memory). Such transmission channels 14a-14c are referred to, for example, as “data buses,” which allow one or more data signals to be transmitted from one device to another. Ideally, when a data signal is sent from a transmitter 12 to a receiver 16 across a channel 14, all of the energy in a transmitted pulse will be contained within a single time cell, which is often referred to as a unit interval (UI).
However, for a number of reasons, data signals are not received exactly as they were transmitted. While an ideal data signal may comprise a logic ‘1’ (“high”) value or a logic ‘0’ (“low”) value, a real data signal may become altered by the time it is detected at the receiver 16. Often, this results from the effects of the channel over which the data signals are sent. Real transmitters and real transmission channels do not exhibit ideal characteristics, and the effects of transmission channels are becoming increasingly important in high-speed circuit design. Due to a number of factors, including, for example, the limited conductivity of copper traces, the dielectric medium of the printed circuit board (PCB), and the discontinuities introduced by vias, the initially well-defined digital pulse will tend to spread or disperse as it passes over the transmission path.
For example, the use of multiple channels 14a-14c as shown in FIG. 1 may cause undesirable noise to be transferred from one data signal to another in the system 10 due to capacitive or inductive coupling between the channels 14a-14c, in a phenomenon referred to as crosstalk. Even when only a single channel 14 is present in a system 10, a transmitted signal may be distorted due to capacitive or inductive effects. In multi-channel systems 10, crosstalk occurs when transitioning data induces either a voltage (inductive crosstalk) or a current (capacitive crosstalk) on a neighboring line. Crosstalk from neighboring channels may alter the amplitude and timing characteristics of a bit of interest on a given channel. Crosstalk is most often addressed with careful channel routing techniques, which may include the placement of additional traces between the channels to provide shielding and to reduce inter-channel coupling.
Another phenomenon leading to the distortion of data bits on a channel is dispersion, which results from non-uniform group delay or other bandwidth limitations on a channel 14. This phenomenon results in the spreading of the energy of a pulse beyond the boundaries of the pulse UI, which results in energy from bits preceding or following a bit of interest in the bit sequence affecting the amplitude and/or timing of the bit of interest. This phenomenon is referred to as intersymbol interference (ISI) and is typically addressed through channel equalization. By either preceding or following the transmission channel with a frequency dependent circuit, whose transfer characteristics are the inverse of the channel characteristics, the original signal behavior may be restored.
Dispersion of a pulse is shown in FIG. 2A, where a single pulse of data 15a is sent by the transmitter 12 during a given UI (e.g., UI3). However, because of the effect of the channel 14, this data pulse becomes spread 15b over multiple UIs at the receiver 16, i.e., some portion of the energy of the pulse is observed outside of the UI in which the pulse was sent (e.g., in UI2 and UI4). This residual energy outside of the UI of interest (ISI) may thus perturb a pulse otherwise occupying either of the neighboring UIs.
ISI is shown more succinctly in the simulation of FIG. 2B, where two ideal pulses, π1 and π2, each occupy their own adjacent unit intervals. The resulting dispersed pulses, P1 and P2, represent simulated received versions of the ideal pulses after transmission at 10 Gb/s through a 6-inch copper trace in a standard printed circuit board material (FR4). The dispersion in each of these pulses overlaps the other pulse, as shown by the hatched portions in the drawings, which represent ISI. The larger pulse, P3, represents the waveform that results when P1 and P2 are sent across the same channel with no intermediate delay, which is a common occurrence in the standard non-return-to-zero (NRZ) signaling format.
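The overlap of dispersed pulses can be illustrated with a minimal numerical sketch. The sketch assumes a hypothetical channel whose response to a single pulse is described by one sample per UI (a pre-cursor, a main cursor, and a post-cursor); the tap values and the function name `received_samples` are illustrative assumptions and are not taken from the simulation of FIG. 2B.

```python
from typing import List


def received_samples(bits: List[int], taps: List[float], main_index: int) -> List[float]:
    """Superimpose the dispersed response of every transmitted bit.

    bits       -- transmitted sequence, one value per UI (1 = pulse, 0 = no pulse)
    taps       -- assumed per-UI samples of the channel response to a single pulse
    main_index -- position within taps of the sample that lands in the pulse's own UI
    """
    out = [0.0] * len(bits)
    for i, bit in enumerate(bits):
        for j, tap in enumerate(taps):
            k = i + j - main_index  # UI in which this portion of the energy arrives
            if 0 <= k < len(bits):
                out[k] += bit * tap
    return out


# Hypothetical channel: 10% of the amplitude arrives one UI early, 20% one UI late.
TAPS = [0.1, 0.7, 0.2]

single_pulse = received_samples([0, 0, 1, 0, 0], TAPS, main_index=1)
adjacent_pulses = received_samples([0, 0, 1, 1, 0], TAPS, main_index=1)
```

With these assumed taps, a single pulse sent in the third UI leaves residual amplitude in the second and fourth UIs, and two adjacent pulses each raise the sample observed in the other's UI, which is the superposition that the combined waveform represents.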
From the perspective of the receiver 16, one tool for quickly analyzing the effects of ISI and other noise on the signal is the eye diagram. An eye diagram is a plot that superimposes or overlays multiple data symbols from a data sequence. This provides a clear picture of how the data signal will change over time, and it also aids in determining the available margin for correct determination of the original digital state of each transmitted bit (i.e., that each transmitted bit is properly interpreted as a logic ‘1’ value or a logic ‘0’ value). When the eye closes, for example, due to reduced signal margins, the available data capture window shrinks and the probability of incorrectly interpreting the digital value of the received bit increases.
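As a rough sketch of how such an overlay may be formed, the following assumes a received waveform that has already been sampled at a fixed number of points per UI. The function names, the waveform values, and the use of Vref to separate high traces from low traces are assumptions made for illustration only.

```python
from typing import List, Optional


def fold_into_eye(waveform: List[float], samples_per_ui: int) -> List[List[float]]:
    """Cut the waveform into UI-long traces that can be overlaid as an eye diagram."""
    n_ui = len(waveform) // samples_per_ui
    return [waveform[k * samples_per_ui:(k + 1) * samples_per_ui] for k in range(n_ui)]


def vertical_opening(traces: List[List[float]], phase: int, vref: float) -> Optional[float]:
    """Gap between the lowest 'high' trace and the highest 'low' trace at one phase."""
    highs = [t[phase] for t in traces if t[phase] > vref]
    lows = [t[phase] for t in traces if t[phase] <= vref]
    if not highs or not lows:
        return None  # only one logic level observed, so no opening can be measured
    return min(highs) - max(lows)


# Hypothetical waveform: four UIs, three samples per UI.
waveform = [0.0, 0.1, 0.0,  0.5, 0.9, 0.6,  0.4, 0.8, 0.5,  0.1, 0.2, 0.1]
traces = fold_into_eye(waveform, samples_per_ui=3)
opening = vertical_opening(traces, phase=1, vref=0.5)
```

A smaller value returned by `vertical_opening` corresponds to a more closed eye at that sample phase, and therefore to less margin for distinguishing a logic ‘1’ value from a logic ‘0’ value.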
In high-speed systems, the ISI built up across the channel may be exacerbated or amplified in the receiver, if the receiver input buffer itself is bandwidth-limited or is intolerant to process variation. Thus, the technique of capturing the incoming data immediately as it enters the receiving chip, before it is passed through any circuitry, has been shown to provide the most margin for error in terms of the data capture mechanism, and as a result is becoming more commonplace in high performance systems.
To capture the incoming data in this manner, a sense-amplifier is commonly used, which allows the data entering the receiving chip to be immediately compared with a reference voltage (Vref), at a point in time corresponding to a trigger from an associated clock edge (sample clock). Depending upon the receiver characteristics, this methodology can be extremely tolerant to amplitude noise- and timing jitter-induced data eye closure (i.e., the shrinkage of the data capture window). However, this method is sensitive to the relative position of Vref and the sample clock edge (phase relationship between clock and data transitions) with respect to the opening of the data eye.
To reduce the probability of error, systems have begun to “train” Vref and the relative phase of the sampling clock edge in order to center the sample point (intersection of Vref and sample phase) within the capture window. Such training, which typically occurs during system startup but may be repeated periodically throughout the operation of the system, may consist of interaction between the transmitter and receiver, or it may be contained within the receiver, thus simplifying the interconnect. Such training may be carried out on a channel-by-channel basis (each receiver being trained independently), or the training may take place on a single channel with the resulting settings applied to several parallel receiver circuits to reduce the area and power costs associated with instantiating several replicas of the training circuitry. In cases where only one channel is trained and the resulting settings are applied to multiple channels, some additional receiver sensing margin is lost due to the channel-to-channel differences inherent in real systems. In high-speed systems, where the margins should be maximized, independent training of each channel is becoming more common. This can be done either by replicating the training circuitry at each receiver and training all channels simultaneously, or by using a single training circuit that is time-multiplexed between the various channels to train each channel one at a time.
Trainability of Vref implies that the magnitude of Vref is controllable. This typically requires Vref to be generated from a digital-to-analog converter (DAC), which can be set to output a specified voltage level on an analog signal based on a digital input. Similarly, trainability of the sample clock timing or sample phase relative to the data edge requires control over the clock propagation delay. This is typically accomplished through the employment of variable delay-lines (VDLs), which may or may not require the additional incorporation of a delay-locked loop (DLL) or a phase-locked loop (PLL). Further resolution in the sample phase setting is accomplished through phase interpolation circuits, which are also often controlled digitally.
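The mapping from a digital setting to an analog quantity may be sketched as follows. The sketch assumes an ideal linear DAC and a sample phase divided into equal steps across one UI; the function names, the 6-bit width, the voltage range, and the 100 ps UI are hypothetical values chosen only for illustration.

```python
def dac_output(code: int, n_bits: int, v_low: float, v_high: float) -> float:
    """Ideal linear DAC: map a digital code onto the range [v_low, v_high]."""
    max_code = (1 << n_bits) - 1
    if not 0 <= code <= max_code:
        raise ValueError("code out of range for the given DAC width")
    return v_low + (v_high - v_low) * code / max_code


def phase_delay_ps(step: int, n_steps: int, ui_ps: float) -> float:
    """Delay applied to the sample clock when one UI is divided into n_steps equal steps."""
    if not 0 <= step < n_steps:
        raise ValueError("phase step out of range")
    return ui_ps * step / n_steps


# Hypothetical settings: a 6-bit Vref DAC spanning 0.4 V to 0.8 V,
# and 32 phase steps across a 100 ps UI.
vref_mid = dac_output(32, n_bits=6, v_low=0.4, v_high=0.8)
delay_mid = phase_delay_ps(16, n_steps=32, ui_ps=100.0)
```

In this idealized form, the resolution of each parameter is set by the number of DAC bits and the number of phase steps, which is where the trade-off between step size and circuit complexity arises.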
Thus, both Vref and the sample phase may be controlled digitally, and the range of each parameter may comprise several steps in voltage (Vref) or timing (sample phase), with the resolution of each step limited only by the level of complexity deemed appropriate for the system. The circuits required for these training operations, DACs, VDLs, DLLs, and PLLs, are well understood by one skilled in the art and are becoming more common in high performance systems. Thus, the circuitry itself is not considered a limiting factor when training Vref and the sample phase. Further, alternative methods for training, which may not require specific circuitry discussed here, are also possible.
One method for training Vref and the sample phase to determine an optimal sampling point is discussed with reference to FIG. 3A, which combines several cycles of data into an eye diagram, as discussed previously. It should be noted that the following descriptions are all discussed in terms of eye diagrams, which tend to imply that all of the information contained in the data eye is present at the outset of the training. On the contrary, the sampling of the data, as described throughout this specification, may be applied to real-time data and therefore information regarding the incoming signal is obtained gradually, and only by the end of the training sequence is all of the eye diagram information available. In FIG. 3A, an optimal sample point 28 is determined by maximizing the voltage margin (represented by the arrows 30) in the eye 22. This is done, in effect, by “painting” the eye, which comprises sampling the received signal as follows.
Essentially, at each available phase step, the error-free Vref range is determined by incrementing the Vref level and, at each incremented Vref level, sampling the data for a certain number of cycles. The number of errors is computed for each Vref setting at each phase step, and the error-free range at that phase step is determined by counting the number of sequential Vref settings for which no errors were detected. Once the error-free range has been computed at every phase step, the phase step that resulted in the largest error-free range is considered to provide the greatest voltage margin (i.e., a distance from Vref to an error in the eye). The corresponding phase setting is then adopted for real-time operation, and the Vref level is set to the midpoint of the corresponding error-free range (the setting for which an equal number of settings are above and below in the error-free range). At this point, the training is complete.
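The sweep just described may be sketched in software as follows. This is a minimal illustration under stated assumptions rather than a definitive implementation: the callback `count_errors(phase, vref)` is a hypothetical stand-in for sampling the data for a certain number of cycles at one setting and returning the number of errors observed, and `toy_count_errors` models an invented diamond-shaped eye solely to exercise the code.

```python
from typing import Callable, List, Tuple


def longest_error_free_run(errors: List[int]) -> Tuple[int, int]:
    """Return (start index, length) of the longest run of sequential zero-error settings."""
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, count in enumerate(errors):
        if count == 0:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


def train_max_voltage_margin(
    count_errors: Callable[[int, int], int], n_phase: int, n_vref: int
) -> Tuple[int, int, int]:
    """Return (phase step, Vref setting, error-free range) with the largest voltage margin."""
    best = (0, 0, 0)
    for phase in range(n_phase):
        # Sweep Vref at this phase step and record the error count at each level.
        errors = [count_errors(phase, vref) for vref in range(n_vref)]
        start, length = longest_error_free_run(errors)
        if length > best[2]:
            # Midpoint of the error-free range (lower middle if the range is even).
            best = (phase, start + (length - 1) // 2, length)
    return best


def toy_count_errors(phase: int, vref: int) -> int:
    """Hypothetical eye: error-free inside a diamond centered at phase 5, Vref 8."""
    half_height = 6 - 2 * abs(phase - 5)
    return 0 if abs(vref - 8) <= half_height else 1


result = train_max_voltage_margin(toy_count_errors, n_phase=11, n_vref=16)
```

Only the best phase step seen so far is retained in this sketch; an implementation that instead stores the error-free range for every phase step, or the error count at every coordinate, would carry a correspondingly larger memory cost.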
One shortcoming with this approach is that determining the optimal sampling point 28 by maximizing the voltage margin may result in offsetting the phase of the sampling point 28 from the optimal sampling phase, in this case the midpoint reference time 24. This is a common occurrence when maximizing the voltage margin, as the maximum voltage margin does not necessarily coincide with the maximum timing margin. Another shortcoming of this approach is the number of training cycles required by this method (i.e., the number of sampled cycles multiplied by the number of testable Vref/sample phase coordinates). In addition, the amount of data that must be stored throughout the training process can be problematic. At the very least, the process requires storing in memory the error-free range associated with each sample phase step. Further, in some implementations, this method may require storing the error count computed at each Vref/sample phase coordinate until the training is complete.
Another method used to determine an optimal sampling point 34 is shown in FIG. 3B. In FIG. 3B, the concept of “painting” the eye is more clearly illustrated. The inside of the eye 22 is “painted” with several identically-sized squares, with each square corresponding to an independent Vref/sample phase setting combination, which is sometimes referred to herein as a coordinate. The inner opening of the data eye is determined in the manner just described, with regard to FIG. 3A, but in this case, the optimal sampling point 34 is chosen to correspond to the most “central” coordinate (each being represented in the figure as a square). This is determined, for example, by locating a Vref/sample phase setting for which an equal number of error-free voltage settings are above and below, and for which an equal number of error-free phase settings are before and after in time (i.e., for each sample phase setting). Such a Vref/sample phase setting constitutes the optimal sampling point. After this point is determined, the training is complete.
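One way of locating the most central coordinate may be sketched as follows. The sketch assumes that a pass/fail map of the whole eye has already been stored, with `grid[phase][vref]` being true where no errors were detected. The imbalance score and the tie-breaking rule are illustrative choices and are not taken from FIG. 3B.

```python
from typing import List, Optional, Tuple


def run_length(grid: List[List[bool]], phase: int, vref: int, d_phase: int, d_vref: int) -> int:
    """Count contiguous error-free coordinates stepping away from (phase, vref)."""
    count = 0
    p, v = phase + d_phase, vref + d_vref
    while 0 <= p < len(grid) and 0 <= v < len(grid[0]) and grid[p][v]:
        count += 1
        p, v = p + d_phase, v + d_vref
    return count


def most_central_coordinate(grid: List[List[bool]]) -> Optional[Tuple[int, int]]:
    """Pick the error-free coordinate whose margins are most evenly balanced."""
    best, best_key = None, None
    for p in range(len(grid)):
        for v in range(len(grid[0])):
            if not grid[p][v]:
                continue
            above = run_length(grid, p, v, 0, 1)
            below = run_length(grid, p, v, 0, -1)
            after = run_length(grid, p, v, 1, 0)
            before = run_length(grid, p, v, -1, 0)
            imbalance = abs(above - below) + abs(after - before)
            margin = above + below + after + before
            key = (imbalance, -margin)  # balanced first, then the largest total margin
            if best_key is None or key < best_key:
                best, best_key = (p, v), key
    return best


# Hypothetical painted eye: a diamond centered at phase step 5, Vref setting 8.
grid = [[abs(v - 8) <= 6 - 2 * abs(p - 5) for v in range(16)] for p in range(11)]
center = most_central_coordinate(grid)
```

Because every coordinate of the map must be retained until the search runs, the sketch also makes the storage and computation burden of this approach apparent.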
Determination of the optimal sampling point 34 according to FIG. 3B provides a more accurate sampling point than simply taking the widest voltage margin, as it also gives consideration to the timing margin. However, this method is computationally expensive, requiring numerous calculations to paint the eye 22 and determine the central sampling point 34. It further requires storing the error count at each testable Vref/sample phase coordinate, or at least storing the Vref and sample phase settings associated with each coordinate along the inner eye boundary, until the training is completed.
As was noted previously, the implementation of the above training algorithms takes place in the presence of real time data. In other words, data transitions are not guaranteed and the state of the data is not known in advance, though a replica of the training pattern may be stored in the receiving circuitry to simplify the process. It should also be pointed out that the term “optimal” is subjective. No method identifies a single “optimal” sampling point (i.e., a combined Vref and sample phase coordinate) in terms of providing for the lowest probability of error for all instances of a received signal. Rather, each method determines the best sampling point for a given amount of information obtainable by the receiver system.
Clearly, circuit designers of multi-gigabit systems face a number of challenges as advances in technology mandate increased performance in high-speed systems. Correct detection of such high-speed signals becomes difficult as data rates and physical constraints on transmission circuits increase. Accordingly, an improved technique is needed that determines an optimal sampling point for the data capture process in a computationally efficient manner. The disclosed techniques achieve such results in a manner that can be implemented in a typical computerized system or other circuit package.