In several of today's large-scale Integrated Circuits (ICs), a single clock signal is required at numerous nodes that are physically separated over large distances. The parameters used in measuring the clock signal quality when a clock signal is transmitted over large distances are:
a. Maximum frequency of operation;
b. Duty cycle variation;
c. Noise injection into the substrate;
d. Sensitivity to substrate and VDD/GND noise;
e. Matching (skew, etc.) between the clock signals at several “leaf” nodes of the clock signals; and
f. Jitter.
A “clock tree” can be defined as a circuit that distributes a single clock source to multiple destinations or “loads”. In addition, the clock tree may be required to multiply or divide the frequency of the reference clock source, for example by multiplying a high-quality, low-frequency signal to generate a very high-frequency clock signal. FIG. 1 shows a typical clock tree. A Phase-Locked Loop (PLL), which is well known in the art, is employed in such applications as part of the clock tree.
The conventional method of building a clock tree is to derive a high-quality clock signal from an external crystal and use it as a reference signal for the PLL, as shown in FIG. 1. This reference signal is multiplied by the PLL 110, resulting in a higher frequency CMOS signal (buffered, as required, by buffer 120). The clock tree (shown as a sequence of N loads 130, 131, 132 . . . 13N, the PLL 110 and BUF 120) is employed to transport this signal over large distances using at least N CMOS buffers 140, 141, 142 . . . 14N, respectively.
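The frequency relationship described above can be sketched as follows. This is a minimal illustration only: the function name, the 25 MHz crystal frequency, and the multiplication factor of 80 are hypothetical values that do not come from FIG. 1.

```python
def pll_output_hz(f_ref_hz, multiply, divide=1):
    """Ideal PLL output frequency: the reference scaled by multiply/divide.

    All parameter names are hypothetical; a real PLL also adds jitter and
    has a limited lock range, which this sketch ignores.
    """
    if multiply <= 0 or divide <= 0:
        raise ValueError("multiply and divide must be positive")
    return f_ref_hz * multiply / divide


# Hypothetical example: a 25 MHz crystal reference multiplied by 80.
f_out = pll_output_hz(25_000_000, multiply=80)
print(f"PLL output: {f_out / 1e9:.3f} GHz")
```

With these assumed values the PLL output falls in the range above 1 GHz that the following paragraphs identify as difficult to distribute.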
This technique is quite adequate if the clock is distributed over relatively small areas and the frequency of the clock signal is relatively low (less than, say, 1 GHz). The limiting factors and problems associated with this technique, when applied at very high frequencies (greater than about 1 GHz) and/or over large distances, are as follows:
Effect of routing inductance/T-line effect;
Timing jitter due to VDD/GND and substrate noise;
Noise coupling to/from other routed nets;
Noise injection due to high-frequency and high-power drivers;
Sensitivity to VDD/GND/Substrate noise;
Duty-cycle degradation; and
Power consumption.
Referring to the scheme 100 of FIG. 1, it can be observed that a single node is driving the clock signal to all of the loading cells (131, 132, . . . 13N). As a result, this single driver must be capable of driving a very large load at a very high frequency. This is problematic because such a driver would introduce significant noise into the power supply and into the silicon substrate, which will corrupt the signals of any adjacent circuits.
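To give a rough sense of the burden on such a driver, the sketch below estimates the switching power and average supply current of a single full-swing clock node using the standard CMOS relation P = C·V²·f. The per-load capacitance, load count, supply voltage, and clock frequency are hypothetical values chosen for illustration; none of them are taken from FIG. 1.

```python
def total_load_farads(per_load_farads, n_loads, wire_farads=0.0):
    """Lumped capacitance seen by the single driving node."""
    return per_load_farads * n_loads + wire_farads


def switching_power_watts(c_farads, vdd_volts, freq_hz):
    """Dynamic power of a full-swing clock node: P = C * V^2 * f.

    A clock node charges and discharges once per cycle, so no additional
    activity factor is applied.
    """
    return c_farads * vdd_volts ** 2 * freq_hz


def average_supply_current_amps(c_farads, vdd_volts, freq_hz):
    """Average current drawn from VDD to recharge the node: I = C * V * f."""
    return c_farads * vdd_volts * freq_hz


# Hypothetical numbers: 1000 loads of 50 fF each, 1.2 V supply, 2 GHz clock.
c_total = total_load_farads(per_load_farads=50e-15, n_loads=1000)
power = switching_power_watts(c_total, vdd_volts=1.2, freq_hz=2e9)
current = average_supply_current_amps(c_total, vdd_volts=1.2, freq_hz=2e9)
print(f"load = {c_total * 1e12:.1f} pF, power = {power * 1e3:.1f} mW, "
      f"average current = {current * 1e3:.1f} mA")
```

Because both quantities scale linearly with load capacitance and frequency, concentrating the entire load on one node concentrates the supply current, and therefore the injected noise, at one location.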
Another issue in integration of high speed circuitry is illustrated by the circuit 200 of FIG. 2. This circuit 200 combines three different functions into one functional block, listed as follows:
a) Two-to-One multiplexer 210;
b) Level-shifter 220; and
c) Buffer 230.
Circuit 200 is a typical implementation of all these functions. One of the two CMOS-level data signals (D1 and D2) is output by the MUX 210, depending on the selector control signal SEL. The level shifter 220 converts the CMOS signal to a low-voltage analog signal. Finally, the BUF cell (analog buffer) 230 generates a low-voltage differential signal, OUTP and OUTN, capable of driving a large load. A circuit such as circuit 200 requires many CMOS transistors and could introduce additional noise into the power supply and silicon substrate. This noise can propagate to other circuits in the vicinity of this circuit.
FIG. 3 shows a generic flip-flop very commonly used in the industry. This flip-flop 300 uses transmission gates, two for each stage of the flip-flop. The total number of clocked transistors in this scheme is eight, and these transistors are relatively large. The two clock inverters 393 and 394 driving these eight transistors must be large enough to drive them with acceptably short rise and fall times.
Circuit 300 shows Data (D) and Scan Data (SD) inputs coupled to inverters 301 and 302, respectively. Components 301, 302, 310 and 315 form a multiplexer circuit (MUX). Depending on the logical value of input SE (logic 1 or 0), either input D with inverter 301 and transmission gate 310 is selected, or input SD with inverter 302 and transmission gate 315 is selected. Inverters 301 and 302 feed the transmission gates 310 and 315, respectively, which are triggered by clocks CKB and CK. These clocks are derived from the reference clock signal CLK as the outputs of inverters 393 and 394, respectively: CKB is an inverted version of the reference clock CLK, and CK is the same as CLK but with a steeper rise time and a delay equal to the delay through the two clock inverters 393 and 394. Each of the transmission gates 310 and 315 is constructed with a pair of CMOS transistors coupled source to source and drain to drain. Transmission gates 310 and 315 are in the ON state, i.e., current can pass through them, when the reference clock CLK is low (logic level 0), and are in the OFF state when CLK is high (logic level 1). The output of these transmission gates is sent to the first of two latches.
The first latch consists of inverter 320, inverter 340, and a transmission gate 325. Inverters 320 and 340 are in back-to-back configuration through the transmission gate 325. When the clock CLK is low (or at logic level 0), transmission gate 325 is in OFF state and the latch is in “load” mode. When the clock is high (or at logic level 1), transmission gate 325 is in ON state and the latch “stores” data.
Latch one feeds inverter 330 which acts as a driver for latch two through the transmission gate 350. Transmission gate 350 is in ON state when clock CLK is high.
The output of the transmission gate 350 feeds into latch two and the final inverter driver 360 for output Q. Latch two consists of inverters 370 and 390, and a transmission gate 380. This transmission gate 380 is in ON state when clock CLK is low (logic level 0). Hence latch two is in store mode when clock CLK is low (logic level 0), and in load mode when clock CLK is high (logic level 1).
The overall operation of the flop 300 is as follows: Data from input D or SD is selected depending on the value of SE. If SE is logic level 1, input SD is selected; if SE is logic level 0, input D is selected. When the clock CLK is low (logic level 0), transmission gates 310, 315, and 380 are ON, and transmission gates 325 and 350 are OFF. In this state, data is loaded into the flop through inverters 301, 320, and 330. When the clock CLK goes high (logic level 1), transmission gates 310, 315, and 380 are OFF, and transmission gates 325 and 350 are ON. Data is then stored in latch one and is also captured at the output through inverter 360.
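The operation just described can be summarized with a small behavioral model. This is a sketch under stated assumptions, not a transistor-level description of circuit 300: the class and method names are hypothetical, the inverters are not modeled individually, and the net polarity from the selected input to Q is assumed to be non-inverting.

```python
class ScanFlipFlop:
    """Behavioral sketch of a scan-muxed master-slave flip-flop.

    Assumptions (not taken from FIG. 3): individual inverters are ignored
    and Q is treated as a non-inverted copy of the captured input.
    """

    def __init__(self):
        self.latch_one = 0  # value held by the first latch
        self.q = 0          # value held by the second latch, seen at Q

    def step(self, clk, d, sd, se):
        """Apply one set of input levels and return the resulting Q."""
        selected = sd if se else d  # SE = 1 selects SD, SE = 0 selects D
        if clk == 0:
            # CLK low: latch one is in load mode and follows the selected
            # input; latch two is in store mode and Q holds.
            self.latch_one = selected
        else:
            # CLK high: latch one stores, latch two loads from latch one,
            # so Q takes the value captured while CLK was low.
            self.q = self.latch_one
        return self.q


ff = ScanFlipFlop()
trace = [
    ff.step(clk=0, d=1, sd=0, se=0),  # D = 1 loaded into latch one
    ff.step(clk=1, d=0, sd=0, se=0),  # rising edge: Q takes captured D
    ff.step(clk=0, d=0, sd=1, se=1),  # scan mode: SD = 1 loaded
    ff.step(clk=1, d=0, sd=1, se=1),  # rising edge: Q takes captured SD
]
print(trace)
```

In this model a change on D or SD while CLK is high has no effect on Q, which corresponds to transmission gates 310 and 315 being OFF during that phase.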
Such circuits utilize many individual component devices. When used in high-frequency, high-speed signal applications, they are noisy and consume considerable power, and thus are not suitable for such applications.