1. Technical Field
The present teaching relates to methods and systems for analog circuits. More specifically, the present teaching relates to methods and systems for clock edge synchronization and to systems incorporating the same.
2. Discussion of Technical Background
With the improved performance of CMOS processes over the past decade, digital circuitry is increasingly prevalent in modern electronics. Most electrical systems rely on clocks for different tasks such as conversion of analog signals to digital data, accessing and processing digital data, and converting digital data back to analog signals. Because of increasing circuit complexity and higher operating frequencies, stringent requirements are placed on the precision of clock signals in such systems to ensure both accurate data conversion and proper data access and transmission.
Requirements on clocks used to access and process digital data differ from requirements on clocks used for data conversion. Synchronization of clocks for the former situation is referred to as “data window synchronization” and clock synchronization for the latter situation is termed “clock edge synchronization” in the present teaching. In data window synchronization, where the data is already in digital format, ensuring adequate set-up/hold times across multiple flip-flops, so that the data can be accessed simultaneously, often dominates the design challenges. An example of data window synchronization is described in U.S. Pat. No. 6,774,823, entitled “Clock Synchronization Logic,” assigned to Analog Devices, Incorporated. At the chip level, where propagation delays can often be kept to less than the clock period, careful design of the clock tree and digital data path delays can achieve data window synchronization. However, with ever-increasing circuit size and, hence, increasing propagation delays, careful design may no longer be sufficient. At the system level, where data may be transmitted over optical networks, other techniques must be employed to ensure data integrity.
Clocks used to sample the analog inputs of multiple analog-to-digital converters (ADCs) may require sampling edges that are precisely synchronized so that all ADC inputs are sampled simultaneously. In this case, the absolute time at which the sampling operation takes place is often not critical. What is essential is that the first and all subsequent clock edges used for sampling be precisely synchronized. That is, to achieve simultaneous sampling at all ADC inputs, the first and all subsequent clock edges must be time aligned, or clock edge synchronized. In the following disclosure, unless specifically noted, further references to clock synchronization refer to clock edge synchronization.
Just as careful clock tree and data path design usually allows data window synchronization at the chip level, the same is also true for achieving clock edge synchronization of a single clock distribution chip. A conventional circuit 100 that allows, at the chip level, clock edge synchronization of multiple clocks is shown in FIG. 1 (Prior Art). The circuit 100 takes a synchronization pulse SYNC (105) and a clock signal CLKI (110) as inputs and produces two synchronized output clock signals, CLKO_A (170) and CLKO_B (180). Circuit 100 includes serially connected flip-flops, FF1 (120), FF2 (130) and FF3 (140), as well as two clock distribution blocks 150 and 160, which generate output clock signals CLKO_A (170) and CLKO_B (180), respectively.
In operation, when the synchronization pulse SYNC (105) asserts, CLKO_A (170) and CLKO_B (180) are both forced to a known state. After SYNC (105) de-asserts, both clocks resume transitioning, but with their rising edges synchronized. This is illustrated in the timing diagram shown in FIG. 2 (Prior Art), in which signal 200 represents the clock signal 110, signal 210 represents the synchronization pulse 105, signal 220 represents the output signal from the last flip-flop 140, and signals 230 and 240 represent the synchronized clock outputs CLKO_A (170) and CLKO_B (180), respectively. Because the SYNC pulse (105) is asynchronous with respect to CLKI (110), flip-flops FF1-FF3 (120, 130 and 140) are included to retime SYNC (105) with respect to CLKI (110) to avoid metastability. As shown in FIG. 2, the retimed output of the three flip-flops, SYNC_INT (220), forces both CLKO_A (230) and CLKO_B (240) low when SYNC_INT (220) transitions high. When SYNC_INT (220) transitions low, output clock signals CLKO_A (230) and CLKO_B (240) resume clocking with edges aligned.
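The retiming and gating behavior described above can be illustrated with a simplified, cycle-level Python sketch. It is not taken from FIG. 1 or FIG. 2. The function names, the sample waveform, and the modeling of each clock distribution block as a divide-by-4 counter held in reset while SYNC_INT is high are assumptions made purely for illustration, and metastability itself is not modeled.

```python
def retime_sync(sync_samples):
    """Pass SYNC through three serially connected flip-flops clocked by CLKI.

    sync_samples[n] is the SYNC level seen at the n-th rising edge of CLKI.
    Returns the FF3 output (SYNC_INT) just after each edge.
    """
    ff1 = ff2 = ff3 = 0
    sync_int = []
    for s in sync_samples:
        # All three flip-flops update on the same edge; each stage takes
        # the value its predecessor held before the edge.
        ff1, ff2, ff3 = s, ff1, ff2
        sync_int.append(ff3)
    return sync_int


def clkdist(sync_int, divide_by, start_count):
    """Hypothetical clock distribution block (an assumption, not from FIG. 1).

    Modeled as a divide-by-N counter whose output is forced low, and whose
    count is cleared, while SYNC_INT is high.
    """
    count = start_count
    out = []
    for s in sync_int:
        if s:
            count = 0          # held in reset
            out.append(0)      # output forced low
        else:
            out.append(1 if count < divide_by // 2 else 0)
            count = (count + 1) % divide_by
    return out


# SYNC is high for two CLKI edges; the two blocks start at different phases.
sync = [0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
sync_int = retime_sync(sync)
clko_a = clkdist(sync_int, divide_by=4, start_count=1)
clko_b = clkdist(sync_int, divide_by=4, start_count=3)
```

In this run the two outputs differ before SYNC_INT asserts and are identical from the edge on which it asserts onward, which corresponds to the edge alignment shown for CLKO_A and CLKO_B in FIG. 2.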
For this prior art approach to work, matching the delays of the parallel connections of the CLKI (110) and SYNC_INT inputs to the two CLKDIST blocks (150 and 160) is critical. In addition, ensuring appropriate set-up/hold time margins within the CLKDIST blocks (150 and 160) is required, so that CLKO_A (170) and CLKO_B (180) can both resume clocking synchronously.
If an application requires more than two synchronized clock outputs, a prior art solution connects multiple chips, constructed the same way as circuit 100, in parallel. For example, in an application requiring four synchronized clock outputs, FIG. 3 (Prior Art) shows an implementation where two chips, CHIPA (310) and CHIPB (350), connect in parallel with respect to input signals SYNC (300) and CLKI (305). CHIPA (310) includes three serially connected flip-flops, FF1 (315), FF2 (320), and FF3 (325), which retime input signal SYNC (300) with respect to CLKI (305). The retimed output of the three flip-flops, SYNC_INT_A, is then fed to two clock distribution blocks, 330 and 335, which generate output clock signals CLKO_AA (340) and CLKO_BA (345) of CHIPA (310), respectively.
Similarly, CHIPB (350) includes three serially connected flip-flops, FF1 (355), FF2 (360), and FF3 (365), which also retime input signal SYNC (300) with respect to CLKI (305). The retimed output of the three flip-flops, SYNC_INT_B, is then fed to two clock distribution blocks, 370 and 375, which generate output clock signals CLKO_AB (380) and CLKO_BB (385) of CHIPB (350), respectively. Although a straightforward arrangement, the implementation shown in FIG. 3 may not produce four clock edge synchronized outputs. For instance, even though the clock outputs CLKO_AA (340) and CLKO_BA (345) of CHIPA (310) are synchronized, they may not be clock edge synchronized with the clock outputs CLKO_AB (380) and CLKO_BB (385) of CHIPB (350).
The timing diagram shown in FIG. 4 (Prior Art) helps clarify the operation of the circuit in FIG. 3. In FIG. 4, signal 410 represents the clock input 305 and signal 420 represents the synchronization input pulse 300. Furthermore, signal 430 represents an internal signal in CHIPA, the output of flip-flop 325, and similarly, signal 460 represents an internal signal in CHIPB, the output of flip-flop 365. In addition, signals 440 and 450 represent the synchronized clock outputs of CHIPA, CLKO_AA (340) and CLKO_BA (345), respectively. Similarly, signals 470 and 480 represent the synchronized clock outputs of CHIPB, CLKO_AB (380) and CLKO_BB (385), respectively.
FIG. 4 illustrates the problem of using the implementation in FIG. 3 to achieve clock edge synchronization of four outputs. Specifically, the internal synchronization signal for CHIPA, SYNC_INT_A (430), transitions one CLKI (410) cycle earlier than the similar signal for CHIPB, SYNC_INT_B (460). As a result, the output clock signals controlled by SYNC_INT_A (430), CLKO_AA (440) and CLKO_BA (450), are not clock edge synchronized with those controlled by SYNC_INT_B (460), CLKO_AB (470) and CLKO_BB (480). The one cycle difference in transition time between SYNC_INT_A (430) and SYNC_INT_B (460) may have several causes. One possibility is trace length mismatches of the SYNC (300) input lines to CHIPA (310) and CHIPB (350) on the printed circuit board (PCB). A mismatch where the SYNC (300) line to CHIPB (350) is slightly longer than the line to CHIPA (310) would cause the SYNC (300) input to arrive at CHIPB (350) slightly later than it arrives at CHIPA (310), and therefore SYNC_INT_B (460) would transition one cycle later than SYNC_INT_A (430). When the challenge of precisely matching PCB trace lengths is coupled with the fact that the SYNC signal (420) is asynchronous with respect to CLKI (410), it becomes virtually impossible to ensure that the SYNC (420) pulse will be latched on the same clock edge by both CHIPA (310) and CHIPB (350), unless an additional sub-circuit is added.
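The one-cycle ambiguity described above can be made concrete with a short Python sketch. It assumes ideal flip-flops (no set-up/hold window), a CLKI that reaches both chips at the same instant, and times expressed in integer picoseconds. The function names and the numeric values are hypothetical and do not come from FIG. 3 or FIG. 4.

```python
def latch_edge(arrival_ps, period_ps):
    """Index of the first rising CLKI edge at or after SYNC arrives.

    Assumes an ideal flip-flop and CLKI edges at 0, period, 2*period, ...
    """
    return -(-arrival_ps // period_ps)  # integer ceiling division


def mismatch_count(period_ps, skew_ps):
    """Count SYNC arrival phases, in 1 ps steps over one CLKI period, for
    which two chips whose SYNC inputs differ by skew_ps latch SYNC on
    different CLKI edges."""
    return sum(
        latch_edge(t, period_ps) != latch_edge(t + skew_ps, period_ps)
        for t in range(period_ps)
    )


# Hypothetical numbers: 1 ns CLKI period, SYNC reaches CHIPB 50 ps later.
edge_a = latch_edge(3980, 1000)           # SYNC reaches CHIPA at 3.98 ns
edge_b = latch_edge(3980 + 50, 1000)      # SYNC reaches CHIPB at 4.03 ns
bad_phases = mismatch_count(1000, 50)
```

With these assumed numbers CHIPA latches SYNC on edge 4 and CHIPB on edge 5, and 50 of the 1000 possible arrival phases produce such a mismatch. Because SYNC is asynchronous with respect to CLKI, its arrival phase is uncontrolled, so under this model any nonzero skew leaves some phases at which the two chips disagree.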
FIG. 5 (Prior Art) shows another prior art circuit in which a retiming circuit 507 has been added to the circuit in FIG. 3. In FIG. 5, circuits in CHIPA (510) and CHIPB (550) are constructed similarly to those in FIG. 3. Circuit 507 retimes the SYNC pulse (500) with respect to the clock input, CLKI (505), and produces the synchronization inputs to both CHIPA (510) and CHIPB (550). The addition of circuit 507 ensures that SYNCA (508) and SYNCB (509) are synchronous with respect to CLKI (505). Therefore, the addition of circuit 507 solves the problem of the synchronization inputs to CHIPA (510) and CHIPB (550) being asynchronous with respect to CLKI (505); however, extreme care still must be taken to match the traces going to CHIPA (510) and CHIPB (550). For example, the traces SYNCA (508) and SYNCB (509), which go from the output of circuit 507 to the inputs of CHIPA (510) and CHIPB (550), respectively, need to be of nearly identical length so that flip-flops 515 and 555 latch the synchronization input on the same cycle. In addition, matching the lengths of the CLKI signals (505) to both CHIPA (510) and CHIPB (550) is equally critical.
U.S. Pat. No. 7,382,844, entitled “Methods to Self-Synchronize Clocks on Multiple Chips in a System,” discloses another prior art solution for clock edge synchronization of multiple integrated circuits. This is shown in FIG. 6 (Prior Art). In circuit 600, one chip, Chip A (610), is designated as the master chip and the others as slave chip(s). The master chip stores a calibration macro 620, which enables a calibration sequence to measure the roundtrip delay between the master and each slave chip.
From the delay measurement(s) acquired during the calibration, the master chip determines an appropriate delay, denoted as D (630) in FIG. 6, for each slave chip. Through such calibration, a delay is programmed on the master chip with respect to each slave chip so that future synchronization pulses sent to a slave chip are appropriately delayed to ensure that the “time zero” clock edge of all chips is synchronized. That is, for every slave device, the master chip must determine and store a delay. Thus, the master chip requires a circuit to measure the roundtrip delay, a variable delay element, and a bidirectional tri-state driver. In addition, the master also needs to include digital circuitry to perform the calibration sequence. Furthermore, each slave chip must have circuitry to participate in the calibration sequence and communicate with the master chip in order for the master to measure the delay. Therefore, this prior art approach requires significant additional circuitry in both the master and the slave chips.
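One plausible way to turn roundtrip measurements into per-slave delays is sketched below in Python. The rule used here is an assumption made for illustration and is not stated in the cited patent: each path is taken to be symmetric, so the one-way delay is half the roundtrip, and the master delays each pulse so that all pulses arrive together with the slowest one. The function name, the chip labels, and the numeric values are hypothetical.

```python
def sync_delays(roundtrip_ps):
    """Compute a hypothetical programmed delay D for each slave chip.

    roundtrip_ps maps a slave name to its measured roundtrip delay in ps.
    Assumes symmetric paths (one-way delay = roundtrip / 2) and pads every
    slave so its pulse arrives when the slowest slave's pulse arrives.
    """
    one_way = {chip: rt / 2 for chip, rt in roundtrip_ps.items()}
    latest = max(one_way.values())
    return {chip: latest - d for chip, d in one_way.items()}


# Hypothetical measurements for two slave chips.
measured = {"B": 400, "C": 1000}
delays = sync_delays(measured)

# Arrival time at each slave = programmed delay + one-way path delay.
arrivals = {chip: delays[chip] + measured[chip] / 2 for chip in measured}
```

With these assumed measurements the pulse to slave B is delayed by 300 ps and the pulse to slave C is not delayed, so both pulses arrive 500 ps after the master issues them. Even this simple rule needs a delay measurement and a stored delay value for every slave, which is consistent with the circuit overhead noted above.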