The present invention relates to managing clock skew when separate clock domains are employed to process data through a critical path.
In conventional systems, a system clock signal is often used by digital circuitry, such as digital circuitry implemented using a large-scale integration (LSI) circuit, to synchronously execute certain logic functions. For example, ultra-deep sub-micron (UDSM) microprocessors employ digital circuitry that uses system clock signals to synchronously execute logic functions. These microprocessors operate at system clock frequencies of 1 GHz and higher. The system clock signal of a given LSI circuit is often split into many paths to service many different portions of the digital circuitry.
As digital circuits are becoming more complicated, the use of multiple clock frequencies within the circuit is becoming more common. This permits, for example, one portion of the circuit to operate at a lower frequency, thereby reducing power dissipation in the circuit. This may require, however, that data be transferred through a clock domain boundary, i.e., from one portion of circuitry operating at a first frequency to another portion of circuitry operating at a second frequency.
Ideally, the system clock signals at different portions of the digital circuitry exhibit exactly the same timing characteristics so that the different portions of the digital circuitry operate in exact synchronization, even when different clock frequencies are employed in different portions of the circuit. In practice, however, the system clock signals at various points throughout the digital circuitry exhibit differing timing characteristics, such as differing rising and/or falling edges (i.e., transitions), differing duty cycles, and/or differing frequencies. These non-ideal characteristics are often referred to as clock jitter and clock skew.
Clock jitter relates to the inaccuracies inherent in generating the system clock signal. The non-ideal characteristics of the system clock signals due to clock jitter affect all portions of the LSI circuit in the same way, irrespective of how the system clock signals are distributed to those portions of the circuit. Clock skew relates to the inaccuracies introduced into the system clock signals by the distribution technique employed to split the system clock into many paths and deliver the clock signals to different portions of the digital circuit.
Sources of clock skew may be classified as being statically occurring or dynamically occurring. Statically occurring sources of clock skew are caused by the LSI design or manufacturing process irrespective of the operating conditions of the LSI circuit. Dynamically occurring sources of clock skew are caused by the operating conditions of the LSI circuit, which may also be functions of the LSI circuit design or manufacturing process.
Statically occurring sources of clock skew include (i) variations in transistor load capacitance (e.g., gate load capacitance); (ii) RC delay of circuit interconnections (e.g., the asymmetry of wire lengths and widths); (iii) variations and/or asymmetries in cross-coupling capacitance between wires (e.g., inter-wiring capacitance); and (iv) semiconductor process variations (e.g., transistor threshold voltage variations, transistor ON resistance variations, wiring variations, and via and contact RC variations).
Dynamically occurring sources of clock skew include (i) cross-coupling between wire lengths due to inter-wiring capacitance; (ii) cross-coupling between wire lengths due to inductive coupling; (iii) cross-coupling due to return path current; (iv) temperature variations; and (v) variations in VDD and VSS (e.g., DC operating voltage variations).
Unfortunately, the variations in the timing characteristics of the system clock signals due to clock skew result in undesirable errors in the operation of the digital circuitry of the LSI circuit. The problem is exacerbated when transitions through clock domain boundaries are encountered.
The above difficulties due to clock skew will now be discussed in more detail with reference to FIGS. 1 and 2A-B. FIG. 1 is a block diagram of a digital system 10 employing clock domain boundaries between respective stages of combinational logic. The system 10 includes a plurality of full latch circuits 12, 16, 20 and a plurality of combinational logic circuits 14, 18. (For the purposes of the present discussion, the delay circuits 22, 24 are assumed not to be within the system 10.) The full latch circuit 12 is operable to transfer data into a first stage of combinational logic 14, while the full latch circuit 16 is operable to transfer data into a second stage of combinational logic 18. The full latch circuit 12 is clocked utilizing a clock A operating at a first frequency (e.g., 4 GHz). The full latch circuit 16 is clocked utilizing clock B operating at a second frequency (e.g., 2 GHz). Thus, the full latch circuit 16 establishes a clock domain boundary between the first stage of combinational logic 14 and the second stage of combinational logic 18, which stages operate at different frequencies.
It is understood that the system 10 may include further stages of combinational logic that are not shown for the purposes of brevity and clarity.
FIG. 2A is a graph illustrating the timing characteristics of the clock A signals and the clock B signals as they relate to the availability and transfer of valid data between the first and second stages of combinational logic 14, 18. Initial reference will be made to the clock A signal and the clock B (synch) signal, where the clock B (synch) signal represents the ideal case where the rising edges of the clock A signal and the clock B (synch) signal are exactly aligned. On the rising edge 50 of the clock A signal, the full latch circuit 12 clocks data into the combinational logic 14. The shortest propagation delay Tp exists from the rising edge 50 of the clock A signal to the onset of valid output data A0 from the combinational logic 14. The data A0 is valid until the shortest propagation delay Tp following a next rising edge 52 of the clock A signal.
At the rising edge 54 of the clock B (synch) signal, the full latch circuit 16 will transfer the data at its input (which may include data A0) to the combinational logic 18. The set-up time Ts for the full latch circuit 16 represents an amount of time prior to the rising edge 54 of the clock B (synch) signal during which the data A0 must be valid at the input of the full latch circuit 16 in order for proper transfer to occur. The hold time Th represents an amount of time following the rising edge 54 of the clock B (synch) signal during which the output data A0 must remain valid at the input of the full latch circuit 16 for proper transfer to occur.
Since there is always some minimal propagation delay Tp following the rising edges 50, 52 of the clock A signal before the validity of the data in the first stage of combinational logic 14 changes, there is typically no hold time violation when the clock B signal is synchronized with the clock A signal.
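The set-up and hold conditions for the synchronized case can be illustrated with a minimal sketch. All numeric values below are hypothetical and are not taken from FIG. 2A; the sketch also assumes that the rising edge 54 coincides with the rising edge 52, and it follows the simplified model of the text in which a single delay Tp governs both the onset and the expiry of valid data. The function name timing_slack is likewise an illustrative assumption.

```python
# Illustrative values only (picoseconds); none come from the figures.
T_A = 250   # clock A period at 4 GHz
TP = 40     # shortest propagation delay Tp through combinational logic 14
TS = 30     # set-up time Ts of full latch circuit 16
TH = 20     # hold time Th of full latch circuit 16


def timing_slack(t_launch, t_capture, valid_span, tp=TP, ts=TS, th=TH):
    """Return (setup_slack, hold_slack) for one launch/capture edge pair."""
    valid_start = t_launch + tp               # data becomes valid Tp after launch
    valid_end = t_launch + valid_span + tp    # data expires Tp after the next launch
    setup_slack = (t_capture - ts) - valid_start
    hold_slack = valid_end - (t_capture + th)
    return setup_slack, hold_slack


# Edge 50 at t = 0; edge 54 assumed aligned with the next clock A edge 52.
setup_slack, hold_slack = timing_slack(t_launch=0, t_capture=T_A, valid_span=T_A)
print(setup_slack, hold_slack)  # 180 20
```

A negative slack would indicate a violation. In this model the hold slack of the synchronized case reduces to Tp minus Th, which is why a non-zero Tp tends to protect against hold time violations when the clocks are aligned.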
A similar analysis obtains when considering the data transfer from the second stage of combinational logic 18 to a next stage represented by latch 20. In particular, the rising edge 56 of the clock B (synch) signal clocks the data at the input of the full latch circuit 16 into the combinational logic 18. After a shortest propagation delay Tp following the rising edge 56, the output data B0 of the combinational logic 18 becomes valid at the input of the full latch circuit 20.
At the rising edge 58 of the clock A signal, the full latch circuit 20 will transfer the data at its input (which may include data B0) to the combinational logic of the next stage (not shown). The set-up time Ts for the full latch circuit 20 represents an amount of time prior to the rising edge 58 of the clock A signal during which the output data B0 must be valid at the input of the full latch circuit 20 in order for proper transfer to occur. The hold time Th represents an amount of time following the rising edge 58 of the clock A signal during which the output data B0 must remain valid at the input of the full latch circuit 20 for proper transfer to occur.
Referring now to the clock B (lag) signal, it is assumed that the clock B signal lags the clock A signal due to clock skew. Under this scenario, the lagging rising edge 54A of the clock B (lag) signal occurs a significant time later than the rising edge 54 of the clock B (synch) signal. When this lag is of significant magnitude, the output data A0 may not remain valid for a sufficient length of time to meet the hold time Th following the lagging rising edge 54A. Indeed, although the set-up time Ts of the full latch circuit 16 prior to the rising edge 54A may be satisfied, the validity of the output data A0 expires before the hold time Th following the rising edge 54A has elapsed. Consequently, the full latch circuit 16 cannot transfer the valid output data A0 from the combinational logic 14 to the combinational logic 18.
With respect to the transfer of the output data B0 from the combinational logic 18 to a next stage, the clock B (lag) signal actually assists in providing an additional amount of time in which the output data B0 is valid following the expiration of the hold time Th following the rising edge 58 of the clock A signal. Notably, however, the lagging nature of the clock B (lag) signal reduces the set-up time, i.e., the amount of time in which the output data B0 is valid prior to the rising edge 58. In this example, this reduction will likely not violate the set-up time Ts requirement prior to the rising edge 58 for the full latch circuit 20.
Turning now to the details of the clock B (lead) signal, it is assumed that the rising edges of the clock B (lead) signal occur before the rising edges of the clock A signal. As to the output data A0, the leading nature of the clock B (lead) signal assists in providing hold time, i.e., a period of time in which the output data A0 is valid following the rising edge 54B of the clock B (lead) signal. Thus, the leading nature of the clock B (lead) signal would not appear to cause hold time problems as to the transfer of the output data A0 from the first stage of combinational logic 14 to the second stage of combinational logic 18. Notably, however, the leading nature of the clock B (lead) signal reduces the set-up time, i.e., the amount of time in which the output data A0 is valid prior to the rising edge 54B of the clock B (lead) signal. Thus, it is possible that set-up time problems might occur depending on the severity of the leading characteristics of the clock B (lead) signal.
The leading rising edge 56B of the clock B (lead) signal occurs a significant time before the rising edge 56 of the clock B (synch) signal. When this lead is of significant magnitude, the output data B0 may not remain valid for a sufficient length of time to meet the hold time Th following the rising edge 58 of the clock A signal. Indeed, although the set-up time Ts of the full latch circuit 20 prior to the rising edge 58 may be satisfied, the validity of the output data B0 expires before the hold time Th following the rising edge 58 of the clock A signal has elapsed. Consequently, the full latch circuit 20 cannot transfer the valid output data B0 from the combinational logic 18 to the next stage.
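The opposite effects of a lagging and a leading clock B signal on the two transfers may be illustrated by the following sketch. The values of Tp, Ts, Th and the 30 ps skew are hypothetical, and the sketch assumes that, without skew, the capturing edge 54 falls one clock A period after the edge 50 and the capturing edge 58 falls one clock B period after the edge 56. Neither assumption is taken from FIG. 2A, and the helper names are illustrative only.

```python
T_A, T_B = 250, 500      # periods of clock A (4 GHz) and clock B (2 GHz), in ps
TP, TS, TH = 40, 30, 20  # hypothetical Tp, Ts, Th, in ps


def slack(t_launch, t_capture, valid_span):
    """Return (setup_slack, hold_slack); a negative value is a violation."""
    valid_start = t_launch + TP
    valid_end = t_launch + valid_span + TP
    return (t_capture - TS) - valid_start, valid_end - (t_capture + TH)


def both_transfers(skew):
    """skew > 0: clock B lags clock A; skew < 0: clock B leads."""
    # Logic 14 -> latch 16: launched by clock A at t = 0, captured by clock B.
    into_16 = slack(0, T_A + skew, T_A)
    # Logic 18 -> latch 20: launched by (skewed) clock B, captured by clock A.
    into_20 = slack(skew, T_B, T_B)
    return into_16, into_20


for name, skew in (("synch", 0), ("lag", 30), ("lead", -30)):
    (s16, h16), (s20, h20) = both_transfers(skew)
    print(f"{name:5s} latch 16: setup {s16:4d} hold {h16:4d} | "
          f"latch 20: setup {s20:4d} hold {h20:4d}")
```

With these numbers the lag case yields a negative hold slack at the full latch circuit 16 and the lead case yields a negative hold slack at the full latch circuit 20, while the set-up slack moves in the opposite direction in each case.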
A conventional solution to the hold time violation problems resulting from the clock skew phenomenon as between the clock A and clock B signals is to employ the delay circuits 22, 24 somewhere in the respective data paths of the combinational logic 14 and the combinational logic 18. For the purposes of discussion, the delay circuits 22, 24 are illustrated as being inserted just prior to the respective full latch circuits 16, 20. With reference to FIG. 2B, the introduction of the delay circuit 22 delays the propagation of the output data A0 by an amount Td following the propagation delay Tp from the rising edge 50 of the clock A signal. This ensures that the output data A0 is valid for an additional time period, Td, following the rising edge 54A of the clock B (lag) signal. Thus, the hold time Th for the full latch circuit 16 may be met and the output data A0 may be transferred to the second stage of combinational logic 18.
The introduction of the delay circuit 24 prior to the full latch circuit 20 introduces an amount of delay Td following the propagation delay Tp measured from the rising edge 56A of the clock B (lag) signal before the output data B0 from the combinational logic 18 is valid. Consequently, the amount of time margin as between the onset of the valid output data B0 to the beginning of the set-up time Ts before the rising edge 58 of the clock A signal is significantly reduced.
When the clock B signal leads the clock A signal, the effect of the delay circuit 22 on the validity of the output data A0 tends to cause set-up time Ts violations as to the full latch circuit 16 measured with respect to the rising edge 54B of the clock B (lead) signal. The leading characteristics of the clock B (lead) signal may tend to improve the set-up time characteristics as to the validity of the output data B0 with respect to the full latch circuit 20 and the rising edge 58 of the clock A signal, but the delay introduced by the delay circuit 24 negates such improvements. The introduction of the delay circuit 24 may increase the amount of time that the output data B0 is valid following the rising edge 58 of the clock A signal, thereby satisfying the hold time Th for the full latch circuit 20.
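The trade-off introduced by the delay circuit 22 may be sketched as follows for the transfer into the full latch circuit 16. The values of Tp, Ts, Th, Td and the skew are hypothetical, the function name latch16_slack is an illustrative assumption, and the unskewed capturing edge is assumed to fall one clock A period after the edge 50.

```python
T_A = 250                # clock A period in ps (4 GHz)
TP, TS, TH = 40, 30, 20  # hypothetical Tp, Ts, Th in ps


def latch16_slack(skew, td=0):
    """Slack at full latch circuit 16; skew > 0 means clock B lags."""
    valid_start = TP + td        # A0 becomes valid Tp + Td after edge 50
    valid_end = T_A + TP + td    # A0 expires Tp + Td after edge 52
    capture = T_A + skew         # skewed capturing edge (54A or 54B)
    return (capture - TS) - valid_start, valid_end - (capture + TH)


for skew in (30, -30):
    for td in (0, 20):
        setup, hold = latch16_slack(skew, td)
        print(f"skew {skew:+d} ps, Td {td:2d} ps: setup {setup}, hold {hold}")
```

Under these assumptions a Td of 20 ps converts the negative hold slack of the lag case into a positive one, but it removes the same 20 ps from the set-up slack in both the lag and the lead cases.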
Although the use of delay circuits to address hold time problems may be useful, there is a corresponding increase in the likelihood of set-up time violations. Further, the delay circuits consume a significant amount of the overall timing budget of the critical paths through the respective stages of combinational logic. Theoretically, the total amount of delay introduced into a particular stage due to the delay circuits and the clock skew should be two times the maximum clock skew. This may significantly increase the occurrence of set-up time violations and also significantly reduce the timing margins of the overall system.
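One possible reading of the factor of two is sketched below. It assumes a simplified model in which the delay Td is sized to restore the hold margin at the worst-case lag, and the set-up margin is then evaluated at the worst-case lead of the same magnitude. The function delay_budget and all numeric values are hypothetical and serve only to illustrate the arithmetic.

```python
def delay_budget(max_skew, tp, th):
    """Return (td, setup_loss) in the same time unit as the inputs."""
    # Hold at worst-case lag: tp + td - max_skew - th >= 0.
    td = max(0, max_skew + th - tp)
    # Set-up at worst-case lead: the capturing edge arrives max_skew early
    # while the data arrives td late, so both subtract from the margin.
    setup_loss = td + max_skew
    return td, setup_loss


# Hypothetical values in ps; tp == th means no intrinsic hold margin.
td, setup_loss = delay_budget(max_skew=30, tp=20, th=20)
print(td, setup_loss)  # 30 60, i.e. setup_loss == 2 * max_skew
```

When the shortest propagation delay Tp exceeds the hold time Th, the required Td in this model is smaller and the set-up loss falls below two times the maximum clock skew.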