The present invention relates to device input/output delay calibration and signal alignment in a source-synchronous, delay-adjustable hardware system.
Source-synchronous clocking refers to the technique of sourcing a clock along with the data. Specifically, the timing of unidirectional data signals is referenced to a clock (often called the strobe) sourced by the same device that generates those signals. In the receiving device, the data is sampled on an edge of the accompanying clock. To sample the input data correctly, the clock edge must fall within the clear open eye of the data signal.
Some devices, such as certain Analog Devices digital-to-analog converters, assume that the parallel data signals have overlapping open-eye periods and that the data is also aligned with the clock signal. That is, the clock edge coincides with the data transition time, while the high or low level of the clock is aligned with the open eye of the data signals. The solution for such devices tunes only the clock delay so that the clock edge falls within the overlapped data window; no special data alignment procedure is needed.
In high-speed cases, the clear open eye is relatively small. Due to I/O delay variation, the clock edge may fall in the data transition period (i.e., while the data is changing from 1 to 0 or from 0 to 1), which may result in incorrect sampling or in a received signal with a high bit error ratio (BER). Moreover, differences in trace routing length add further uncertainty to this problem. Even though trace lengths can be controlled during PCB layout, imposing strict length-matching rules makes layout more difficult.
Some devices with a high-speed interface provide I/O with tunable delay, so that the data window can be adjusted to let the clock edge sample correctly. This feature makes data alignment possible, but proper alignment is still a problem. Due to I/O delay and trace length uncertainty, even if each data signal is adjusted so that the clock edge falls at the center of its open eye, one or more bits of a parallel signal may end up aligned to different windows, which is usually not acceptable.
When a high-speed I/O is connected to a deserializer that provides a bit-level slip function for word alignment, inputs that are not aligned to exactly the same window may also cause the deserializer outputs to be misaligned by one word.
FIG. 1 shows elements of an example device input interface. LVDS (low-voltage differential signaling) inputs RX_P and RX_N are coupled to LVDS receiver 102, which outputs positive signal D+ and negative signal D−. The two output signals from receiver 102 are connected to tunable input delay blocks 104 and 106, respectively. The output from each delay block is coupled to an ISerDes (input serializer/deserializer) 108 or 110, which consists essentially of flip-flops plus control logic and functions as a demultiplexer (DeMux).
The tunable delay blocks 104 and 106 delay their corresponding inputs by a configurable number of taps, so that each signal can be adjusted to move the center of its open eye to the clock edge and thereby ensure correct signal detection.
Each ISerDes has clock inputs clk and clk_div, which clock in the serial input data and latch the parallel output, respectively. A bitslip signal is provided to each ISerDes to slip the N-bit parallel output to a different alignment. For example, for input sequence “a, b, c, d, e, f, g, h, . . . ” and a 1:4 ISerDes, the possible alignment modes that can be selected by generating bitslip pulses are shown in FIG. 2.
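The four word boundaries available to a 1:4 deserializer can be illustrated with a short Python sketch. FIG. 2 is not reproduced here, so the function name `deserialize`, the `slips` parameter, and the order in which successive slips step through the alignments are assumptions made for illustration; an actual device may step through the modes in a different order.

```python
def deserialize(bits, slips, width=4):
    """Group a serial bit sequence into `width`-bit words.

    `slips` is a hypothetical count of bitslip pulses; each pulse is
    assumed to move the word boundary by one bit position.
    """
    start = slips % width
    # Drop any trailing bits that do not fill a complete word.
    usable = len(bits) - (len(bits) - start) % width
    return ["".join(bits[i:i + width]) for i in range(start, usable, width)]


if __name__ == "__main__":
    serial_input = list("abcdefghijkl")
    for slips in range(4):
        print(slips, deserialize(serial_input, slips))
```

With a 1:4 ratio there are only four distinct boundaries, so a fifth slip in this model returns to the first alignment.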
With the example input interface elements described above, one solution to the misalignment problem uses a training sequence, such as 4′b1001 in the 1:4 ISerDes case of FIG. 2. This solution includes a data window centering process and a parallel output word alignment process. Data window centering is based on the fact that, for a given training sequence, the output is expected to be stable when the clock edge falls within the open eye. The data window centering process contains three steps. First, the process looks for the first transition tap and passes through the transition period. The principle is to increase the delay by one tap and compare the current parallel output with the previous one; if they differ, the current tap is in the transition period. The process then keeps increasing the tap count until it reaches a stable window in which no bit changes during a given observation period; this marks the beginning of the open eye. Second, the process searches for the end of the open eye using the same approach. Third, the process returns to the center of the open eye, which is computed from the beginning and ending taps found in the search.
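The three centering steps can be sketched in Python against a simulated lane. Everything named here is hypothetical: `make_lane` is a toy model in which the parallel output toggles between neighboring words inside a transition region, the tap ranges and word values are invented for illustration, and `n` stands in for the unspecified observation period.

```python
from itertools import count


def make_lane(transitions, words):
    """Toy lane: `transitions` lists (start, end) tap ranges where the data
    is changing; `words` lists the stable output before, between, and after."""
    reads = count()

    def read(tap):
        for i, (lo, hi) in enumerate(transitions):
            if tap < lo:
                return words[i]
            if tap < hi:
                # Inside a transition the output toggles between neighbors.
                return words[i + next(reads) % 2]
        return words[-1]

    return read


def reads_match(read, tap, ref, n):
    """True if n consecutive reads at `tap` all equal the reference word."""
    return all(read(tap) == ref for _ in range(n))


def find_center(read, num_taps, n=8):
    ref = read(0)
    tap = 1
    # Step 1a: increase the delay until the output differs from the older one.
    while tap < num_taps and reads_match(read, tap, ref, n):
        tap += 1
    # Step 1b: pass through the transition until the output is stable again.
    while tap < num_taps and not reads_match(read, tap, read(tap), n):
        tap += 1
    if tap >= num_taps:
        raise ValueError("no open eye found within the tap range")
    start = tap
    # Step 2: search for the end of the open eye with the same comparison.
    ref = read(start)
    while tap + 1 < num_taps and reads_match(read, tap + 1, ref, n):
        tap += 1
    end = tap  # equals the last tap if the eye extends past the tap range
    # Step 3: return to the center of the open eye.
    return start, end, (start + end) // 2


if __name__ == "__main__":
    lane = make_lane([(5, 8), (21, 24)], [0b1100, 0b1001, 0b0011])
    print(find_center(lane, num_taps=32))
```

The sketch assumes the sweep starts at a stable tap; a sweep that begins inside a transition region would need an extra check before step 1.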
Parallel output word alignment is achieved by pulsing the bitslip signal (FIG. 1) until the expected output pattern is obtained, for example 4′b1001 for the above-mentioned training sequence. Each signal in the parallel input group is tuned using the above-mentioned procedure with the same expected output, so that the signals are aligned at both the bit level and the word level.
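A minimal sketch of the word alignment loop follows, assuming a hypothetical `Lane` object whose `bitslip()` method moves the word boundary by one bit per pulse and whose `word()` method returns the current parallel output for the repeating training sequence. Neither name comes from the source, and a real deserializer may need several clk_div cycles after each pulse before its output is valid.

```python
EXPECTED = "1001"  # training pattern 4'b1001, written as a bit string


class Lane:
    """Hypothetical lane receiving the repeating training sequence."""

    def __init__(self, phase, width=4):
        self.phase = phase % width
        self.width = width

    def bitslip(self):
        # Assumed behavior: one pulse moves the word boundary by one bit.
        self.phase = (self.phase + 1) % self.width

    def word(self):
        stream = EXPECTED * 3
        return stream[self.phase:self.phase + self.width]


def align_word(lane, expected=EXPECTED, width=4):
    """Pulse bitslip until the expected pattern appears; return pulse count."""
    for pulses in range(width):
        if lane.word() == expected:
            return pulses
        lane.bitslip()
    raise RuntimeError("expected pattern not found in any alignment")


if __name__ == "__main__":
    for phase in range(4):
        lane = Lane(phase)
        print(phase, align_word(lane), lane.word())
```

The pattern 1001 works as a training word in this model because its four rotations (1001, 0011, 0110, 1100) are all distinct, so exactly one alignment produces the expected output.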
One problem with the method described above is that it may result in a delay of one or more bits between signals. One example is the timing diagram shown in FIG. 3. Two input signals a and b in 310 are accompanied by a DDR clock. Data signals a and b have transition periods, such as periods 304, 306, and 308, and open-eye periods, such as period 312. Conventional systems would first search for the first transition period. When the search starts from edge 302, the first detected transition period for signal a will be period 306, while that for signal b will be period 308. The delay tuning therefore results in the misalignment shown in 320, where the i-th bit of signal a is aligned with the (i−1)-th bit of signal b. Bit-level misalignment may result in word-level misalignment in the subsequent step, where the signals after the ISerDes may differ by one word even though the output pattern for the training sequence input is the same.
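The failure mode can be reproduced with a toy model in Python. The model is an assumption, not the actual ISerDes circuit: lane b is taken to lag lane a by one bit period (as in 320), bitslip is modeled as choosing the start phase of each 4-bit group, and at every word-clock tick the deserializer is assumed to output the newest group that has completely arrived. The names `lane_samples`, `word_at_tick`, and `find_slip` are hypothetical.

```python
WORD = 4
TRAINING = [1, 0, 0, 1]


def lane_samples(stream, lag):
    """Samples seen by the deserializer when the lane lags by `lag` bits."""
    return [None] * lag + list(stream)


def word_at_tick(samples, tick, slip):
    """Newest complete group whose start index is congruent to `slip`
    modulo WORD and that has fully arrived by word-clock tick `tick`."""
    start = WORD * tick - ((WORD - slip) % WORD)
    return samples[start:start + WORD]


def find_slip(samples, tick=2):
    """Word alignment: pick the slip that reproduces the training word."""
    for slip in range(WORD):
        if word_at_tick(samples, tick, slip) == TRAINING:
            return slip
    raise RuntimeError("training pattern not found")


if __name__ == "__main__":
    training = TRAINING * 8
    slip_a = find_slip(lane_samples(training, lag=0))
    slip_b = find_slip(lane_samples(training, lag=1))

    # Payload bits are labeled by their index so misalignment is visible.
    payload = list(range(32))
    word_a = word_at_tick(lane_samples(payload, lag=0), tick=3, slip=slip_a)
    word_b = word_at_tick(lane_samples(payload, lag=1), tick=3, slip=slip_b)
    print(slip_a, slip_b, word_a, word_b)
```

In this model both lanes report the training word 1001 after alignment, yet at the same word-clock tick lane a delivers payload bits 12 to 15 while lane b delivers bits 8 to 11, that is, the previous word. The training pattern repeats every word, so it cannot reveal the offset.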