As the clock frequency at which integrated circuits operate increases, the clock period decreases, leaving less time available to accommodate integrated circuit trace propagation delays in the clock signal. A high frequency clock signal is typically generated by a clock generation circuit using a low frequency crystal as a reference clock signal. The clock generation circuit includes a frequency synthesizer to produce the high frequency clock signal output. The high frequency clock signal is routed through traces on an integrated circuit to devices such as a cache controller, processors, and random access memories. It is desirable to have clock signals arrive at all devices at precisely controlled times, which may or may not be simultaneous. The devices receiving the clock signal are located at various distances from the clock generation circuit, resulting in traces of different lengths over which the clock signal must propagate.
Differences in clock signal arrival time at various devices due to propagation delays are often referred to as clock skew. Excessive clock skew among clocked gates can cause asynchronous data transfers and produce unpredictable results, leading to the failure of a device. While clock skew can be reduced, though typically not eliminated, by integrated circuit layout, it is more desirable to lay out an integrated circuit efficiently so as to package as many components as possible into a given area. Thus, concerns over clock signal propagation delays must be addressed in another manner.
The clock skew in an integrated circuit device is usually composed of two parts, namely, mismatch in resistive-capacitive (RC) delays along the various paths of the clock distribution wires and mismatch in the clock buffer delays along the paths. Generally, it is relatively easy to separately match either the clock buffer delays or the RC delays. However, since the wire resistance and capacitance (RC delay components) vary differently from the gate transconductance and the parasitic diode capacitance (clock buffer delay components) under various processing technologies and operating conditions, matching both components together is not an easy task. Furthermore, since the RC delay values depend on the physical layout of the device, an integrated circuit designer can only guarantee the minimum clock skew requirement by tuning the RC delay along the clock tree once the physical design (layout) stage is essentially complete. In fact, in spite of all the tuning work, the minimum clock skew is at best guaranteed for only a narrow operating range.
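The difficulty of matching both components together can be illustrated with a minimal numerical sketch. The model below is an assumption made purely for illustration: each branch delay is taken as the sum of an RC wire delay and a buffer delay, and the function names, delay values and scale factors are hypothetical rather than measured data.

```python
# Hypothetical illustration: per-branch delay = RC wire delay + buffer delay.
# All names, delay values and scale factors are assumptions, not measured data.

def branch_delay(rc_ns, buf_ns, rc_scale=1.0, buf_scale=1.0):
    """Total clock arrival delay of one branch at a given operating corner."""
    return rc_ns * rc_scale + buf_ns * buf_scale


def skew(branches, rc_scale=1.0, buf_scale=1.0):
    """Clock skew: spread between the latest and earliest arrival times."""
    delays = [branch_delay(rc, buf, rc_scale, buf_scale) for rc, buf in branches]
    return max(delays) - min(delays)


# Two branches tuned to the same total delay at the nominal corner:
# a long wire with a fast buffer, and a short wire with a slow buffer.
branches = [(0.75, 0.25), (0.25, 0.75)]

# Both branches total 1.0 ns, so the skew vanishes at the nominal corner.
nominal_skew = skew(branches)

# At a corner where only the buffer delay slows by 50%, the branches diverge.
corner_skew = skew(branches, rc_scale=1.0, buf_scale=1.5)

print(nominal_skew, corner_skew)
```

In this sketch the two branches are balanced only because their RC and buffer contributions happen to sum to the same value at one corner; once the two components scale differently, the balance is lost, which is consistent with the narrow operating range noted above.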
Recently, integrated circuit (IC) manufacturers have begun producing single chips containing multiple device cores, such as multiple memory devices, micro-controllers, microprocessors and digital signal processors (DSPs), that were traditionally mounted on a printed circuit board (PCB) and interconnected by one or more busses on the PCB. Such a single chip is commonly referred to as a system-on-a-chip (SoC). SoCs incorporate one or more busses to provide data paths to interconnect the multiple core devices on the chip, often referred to as “nodes,” and utilize a global clock to synchronize the operations of the various nodes. The clock skew problem is more prominent in the case of an SoC device, where the RC delays on different clock branches can differ by more than an order of magnitude due to a wide range of clock wire lengths.
A number of techniques have been proposed or suggested for controlling clock signal arrival time at various devices on a chip. FIG. 1 illustrates a first conventional technique where the clock skew is minimized by physically matching the clock wire length of each branch 110-1, 110-2 of the distribution network 120 for a global clock 105. While the wire length matching technique illustrated in FIG. 1 effectively reduces the clock skew, the technique only balances the delays attributed to RC components among the different clock branches 110-1, 110-2. In addition, whenever there is a modification to the layout, there must be a corresponding modification to the layout of the clock tree 120, thereby extending the design time.
FIG. 2 illustrates another conventional technique for reducing the clock skew by balancing the clock buffer delay. A reference clock (REF-CK) signal generated by a reference block 205 is applied to the phase locked loop/delay locked loop (PLL/DLL) 220-n of each block 210-n along with the feedback clock (FB-CK) to control the PLL clock (PCK) delay through the PLL/DLL 220-n. The clock signal produced by the PLL/DLL 220-n synchronizes the data output from Block-1 210-1 through the data buffer 230-n with the data output from the Reference-block 205. Clock skew is minimized by matching the clock buffer delay in each block 210-n using clock buffers 240-n. The size of each buffer 240-n is fixed once the layout is established. For a more detailed description of the clock buffer delay matching technique, see, for example, Mark Johnson and Edwin Hudson, “A Variable Delay Line PLL for CPU-Coprocessor Synchronization,” IEEE J. of Solid State Circuits, Vol. 23, No. 5 (October 1988). While the clock buffer matching technique illustrated in FIG. 2 effectively reduces the clock skew, the technique only balances the delays attributed to clock buffer delay components and ignores the RC components. If there is a substantial RC delay on the REF-CK signal line in FIG. 2 from the reference-block 205 to block-1 (210-1), the I/O signals from these two blocks would not synchronize.
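The limitation noted for FIG. 2 can be sketched with a simple behavioral model. This is not the circuit of the cited reference; it is a hypothetical delay-locked loop that steps a variable delay until the buffered clock aligns with the REF-CK edge seen at the block, and every name and value in it is an assumption.

```python
# Hypothetical behavioral model of a delay-locked loop.
# Function names, delay values and the step size are illustrative assumptions.

def lock_delay_line(buffer_delay_ns, period_ns, step_ns=0.125):
    """Step a variable delay until delay + buffer delay spans one full period,
    so the buffered clock (FB-CK) lines up with the REF-CK edge seen locally."""
    delay = 0.0
    while delay + buffer_delay_ns < period_ns:
        delay += step_ns
    return delay


period = 4.0        # clock period in ns (assumed)
buffer_delay = 1.5  # clock buffer delay inside the block in ns (assumed)
ref_wire_rc = 0.5   # RC delay of the REF-CK wire to the block; unseen by the loop

delay = lock_delay_line(buffer_delay, period)

# Phase of the block's clock relative to REF-CK as it arrives at the block.
local_error = (delay + buffer_delay) % period

# Phase relative to REF-CK at its source: the wire delay remains uncorrected.
residual_skew = local_error + ref_wire_rc

print(delay, local_error, residual_skew)
```

Under these assumptions the loop cancels the buffer delay completely, yet the block remains offset from the reference block by the full RC delay of the REF-CK wire, because that delay lies outside the feedback path.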
FIG. 3 discloses another clock skew reduction technique that assigns a particular phase A, B, C of a multi-phase ring oscillator 300 to the input of each clock driver 310-n based on the estimated clock wire RC delay from each clock driver 310-n to the destination module (not shown). The assignment of a particular phase A, B, C to each clock driver 310-n is done such that the phase differences among the clock drivers 310-n are equal to the differences among the RC delays on the clock wires driven by the same group of clock drivers. For a more detailed discussion of this clock skew reduction technique, see U.S. Pat. No. 5,268,656 issued to Muscavage, incorporated by reference herein. FIG. 4 illustrates a timing diagram of an implementation of the circuit shown in FIG. 3. While the clock skew reduction technique illustrated in FIG. 3 effectively reduces the clock skew, the technique only balances the delays attributed to RC components.
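The phase assignment described for FIG. 3 may be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the cited patent: the oscillator phases are assumed to be equally spaced over one clock period, each driver is given the phase nearest to the offset that would compensate its estimated RC delay, and the function name and delay values are hypothetical.

```python
# Hypothetical sketch of phase assignment by estimated RC delay.
# Equally spaced phases and all numeric values are illustrative assumptions.

def assign_phases(rc_delays_ns, period_ns, num_phases):
    """Return one phase index per clock driver.

    The driver with the longest wire delay launches on phase 0; drivers with
    shorter wires are assigned later phases so that arrival times line up.
    """
    step = period_ns / num_phases  # time offset between adjacent phases
    longest = max(rc_delays_ns)
    assignment = []
    for rc in rc_delays_ns:
        ideal_offset = longest - rc  # extra launch delay this driver needs
        # Pick the nearest available phase; wrap around since the clock is periodic.
        assignment.append(round(ideal_offset / step) % num_phases)
    return assignment


period = 3.0                 # clock period in ns (assumed)
rc_delays = [2.5, 1.5, 0.5]  # estimated RC delay per driver in ns (assumed)

phases = assign_phases(rc_delays, period, num_phases=3)  # indices 0, 1, 2 ~ A, B, C
step = period / 3

# Arrival time at each destination = launch offset of the phase + wire RC delay.
arrivals = [p * step + rc for p, rc in zip(phases, rc_delays)]

print(phases, arrivals)
```

In this sketch the arrival times coincide only because the RC delay differences happen to be whole multiples of the phase spacing; with a finite number of phases the compensation is otherwise quantized, and the buffer delays are not modeled at all.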
A need therefore exists for improved techniques for reducing clock skew that address both the wire RC delays and the clock buffer delays. A further need exists for a self-synchronized clock distribution network that uses a remote clock feedback. Yet another need exists for an automatic clock skew control scheme that inserts an appropriate delay on the output of a clock generator such that the arrival times of the clock signal at each node may be coordinated.