In modern integrated circuit (IC) chip designs, multiple clocks may be desired or required to provide various operating clock rates within an IC chip. Allowing the IC chip to operate at various clock rates can achieve better performance and compatibility when working with a variety of peripheral devices. A typical personal computer chipset provides at least four types of clock inputs, for example, a CPU clock, a DRAM clock, a PCI clock, and an AGP clock, to control the respective circuit blocks. These clocks allow the chipset to communicate with the host bus, the DRAM bus, the PCI bus, the AGP bus, and others. Moreover, depending on its functionality, each bus can work at a different clock rate to achieve better performance.
When control information and data must be transferred between different circuit blocks, signals that cross clock domains must be handled before they are used if the clock rates of these circuit blocks differ. In other words, signals from a source clock domain must be resynchronized to the clock of a destination clock domain by a synchronizer. However, a typical synchronizer adds a latency of one or two cycles, depending on the relationship between the source and destination clock domains, so it is difficult to attain good performance. On the other hand, the signals may bypass the synchronizer if the source and destination clock domains have the same clock rate and phase. In that case, the extra one or two cycles are saved and better performance can be achieved. For instance, in a typical personal computer, the clock rate of the host bus is close to that of the DRAM bus, and the two clock rates may even be the same in some configurations. In such configurations, bypassing the synchronizer avoids the latency it would otherwise introduce.
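The latency cost of a synchronizer, and the saving obtained by bypassing it, can be illustrated with a minimal behavioral sketch. The model below is an assumption made for illustration only: it treats the synchronizer as a chain of flip-flops clocked by the destination clock, with the hypothetical parameter `stages` set to 1 or 2 to match the one or two cycles of latency mentioned above. It does not model metastability or any particular circuit.

```python
def synchronize(samples, stages=2, bypass=False):
    """Return the value the destination logic captures at each destination clock edge.

    samples: value of the source-domain signal at each destination clock edge.
    stages:  number of flip-flops in the (assumed) synchronizer chain.
    bypass:  True when both domains share the same clock rate and phase,
             so the signal is used directly.
    """
    if bypass:
        return list(samples)
    chain = [0] * stages          # flip-flop outputs, all assumed to reset to 0
    captured = []
    for value in samples:
        # The destination logic sees the last flip-flop's output before this edge.
        captured.append(chain[-1])
        # On the edge, each flip-flop takes the value of the stage before it.
        chain = [value] + chain[:-1]
    return captured


def added_latency(samples, **kwargs):
    """Destination clock cycles between the signal rising and being captured as 1."""
    return synchronize(samples, **kwargs).index(1) - samples.index(1)


if __name__ == "__main__":
    signal = [0, 1, 1, 1, 1, 1]
    print("two-stage synchronizer:", added_latency(signal, stages=2), "cycles")
    print("one-stage synchronizer:", added_latency(signal, stages=1), "cycles")
    print("bypassed:", added_latency(signal, bypass=True), "cycles")
```

In this sketch the bypass path is only valid under the condition stated above, namely that both clock domains have the same clock rate and phase.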
In a traditional computer system, different clocks come from different output pins of a clock generator, which is a separate IC chip on the computer motherboard. Hence, there is considerable skew between these clocks. As depicted in FIG. 1, clock generator 10 provides three source clocks CLK1, CLK2 and CLK3, by way of buffers 20, to three different circuit blocks. For example, circuit block 30a is responsible for functions associated with the host bus, circuit block 30b is responsible for functions associated with the DRAM bus, and circuit block 30c provides functions associated with the PCI bus. As the clock signals pass through the respective phase locked loop (PLL) unit 34 of each circuit block and then propagate to flip-flops 38 via clock trees 36, clock skews occur. In designing an IC chip, it is desirable to minimize such clock skews as much as possible. The clock skew caused by clock trees 36 is generally constant; therefore, it can be minimized by adjusting the phase of the clock signal. Moreover, for current computer chipsets, the clock rates of the host bus and the DRAM bus are 100 or 133 MHz. As a result, there are four combinations for the system clock configurations: two in synchronous operation mode (100 MHz/100 MHz and 133 MHz/133 MHz) and two in asynchronous operation mode (100 MHz/133 MHz and 133 MHz/100 MHz). Most computer chipsets support both operation modes. As described above, a computer system working in synchronous operation mode may achieve better performance. Nevertheless, as the clock rates of computer systems increase dramatically, data may no longer be valid if there are clock skews within the IC chips and systems. With reference to FIG. 1, the clocks of circuit block 30a and circuit block 30b are derived from CLK1 and CLK2, respectively. Because of the intrinsic clock skew between CLK1 and CLK2, circuit blocks 30a and 30b cannot work at the highest clock rates of the synchronous operation mode if data errors caused by the clock skew are to be avoided.
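The four system clock configurations listed above can be enumerated with a short sketch. The function names are hypothetical, and two simplifications are assumed: each pair is read as host bus rate/DRAM bus rate, and a pair is classified by clock rate alone, whereas true synchronous operation also requires matching phase.

```python
from itertools import product

# Clock rates in MHz named in the text for the host bus and the DRAM bus.
RATES_MHZ = (100, 133)


def operation_mode(host_mhz, dram_mhz):
    """Classify a pair by rate only: equal rates are treated as synchronous."""
    return "synchronous" if host_mhz == dram_mhz else "asynchronous"


def clock_configurations(rates=RATES_MHZ):
    """Return every (host, DRAM, mode) combination for the given rates."""
    return [(h, d, operation_mode(h, d)) for h, d in product(rates, repeat=2)]


if __name__ == "__main__":
    for host, dram, mode in clock_configurations():
        print(f"{host} MHz/{dram} MHz: {mode}")
```

Enumerating the pairs this way yields two synchronous and two asynchronous configurations, matching the count given in the text.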
Accordingly, what is desired is a method and apparatus for reducing clock skew between different circuit blocks of an IC chip when the chip operates in a synchronous operation mode.