1. Field of the Invention
The present invention relates to the timing of clock and data signals in integrated circuits. More specifically, the invention relates to simultaneous transmission of digital data and clock signals to eliminate skewing of the data and clock signals with respect to each other.
2. State of the Art
Digital integrated circuits typically include multiple logic elements, with the timing of operation of each logic element controlled by a clock signal. It is common for an integrated circuit chip to have one central clock generator, with the signal from the clock generator being distributed around the integrated circuit via clock-line interconnects. An important consideration in the design of digital integrated circuits is the timing of the arrival of clock and data signals at various logic elements.
Variation in clock signal arrival time is referred to as clock skew. A variety of techniques have been used to provide clock connections that are symmetrical and all of the same length in order to minimize clock skew at the various logic elements, including, for example, the methods of Yip and Carrig. See, K. Yip, “Clock tree distribution: balance is essential for a deep-submicron ASIC design to flourish,” IEEE Potentials, vol. 16, no. 2, pp. 11-14, April-May 1997; and K. M. Carrig et al., “Clock methodology for high-performance microprocessors,” Proc. Custom Integrated Circuits Conference, Santa Clara, Calif., May 5-8, pp. 119-122, 1997. A number of prior art approaches are illustrated in FIGS. 1A-1D.
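By way of illustration, clock skew across a set of logic elements may be computed as the spread between the earliest and latest clock arrival times. The following is a minimal sketch only; the element names and arrival times are hypothetical values chosen for illustration and are not taken from any of the cited designs.

```python
def clock_skew(arrival_times_ps):
    """Return the spread (latest - earliest) of clock arrival times, in ps."""
    times = list(arrival_times_ps.values())
    if not times:
        raise ValueError("at least one arrival time is required")
    return max(times) - min(times)


# Hypothetical clock arrival times at four logic elements, in picoseconds.
arrivals = {"reg_a": 120, "reg_b": 135, "reg_c": 118, "reg_d": 142}

skew = clock_skew(arrivals)
print(f"clock skew: {skew} ps")  # 142 - 118 = 24 ps
```

A perfectly balanced distribution network would, under this definition, yield a skew of zero.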
FIG. 1A illustrates an H-tree clock-distribution structure, which is used primarily in custom layouts and varies the widths of the tree interconnect segments to balance skew throughout the chip.
FIG. 1B shows a clock grid clock-distribution structure. The clock grid is the simplest clock-distribution structure and has the advantage of being easy to design for low skew. However, it is area-inefficient and power-hungry because of the large amount of clock interconnect required. Nevertheless, some chip vendors use this clock structure for microprocessors.
FIG. 1C depicts a balanced tree clock-distribution structure. The balanced tree is the clock-distribution structure most commonly used in high performance chips. See, J. L. Neves et al., “Automated synthesis of skew-based clock-distribution networks,” VLSI Design, vol. 7, no. 1, pp. 31-57, 1998. In order to carry current to the branching segments, the clock line is widest at the root of the tree and becomes progressively narrower at each branch. As a result, the clock line capacitance increases exponentially with distance from the leaf cell (clocked element) in the direction of the root of the tree (clock input). Moreover, additional chip area is required to accommodate the extra clock line width in the regions closer to the root of the tree.
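The capacitance growth described above may be sketched numerically. The model below is an assumption made purely for illustration: a binary tree in which the clock line width doubles at each level toward the root, all segments have equal length, and segment capacitance is proportional to width. The function name and parameter values are hypothetical and do not come from the cited references.

```python
def segment_capacitance(level_from_leaf, leaf_cap_ff=1, branching=2):
    """Capacitance of one segment `level_from_leaf` levels above the leaf.

    Assumed model: the line width, and therefore the capacitance, is
    multiplied by `branching` at each level toward the root, because the
    segment must carry current for every leaf beneath it.
    """
    return leaf_cap_ff * branching ** level_from_leaf


# Per-segment capacitance from the leaf (level 0) to the root (level 4), in fF.
caps = [segment_capacitance(level) for level in range(5)]
print(caps)  # [1, 2, 4, 8, 16]
```

Under these assumptions, each step toward the root doubles the capacitance of the segment, which is the exponential growth noted above.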
As shown in FIG. 1D, buffers may be added at the branching points of the balanced tree structure. Adding buffers at the branching points of the tree significantly lowers clock interconnect capacitance, because it reduces the clock line width required toward the root.
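The effect of buffering on interconnect capacitance may be sketched with a simple comparison. The model is an illustrative assumption only: a binary tree with equal-length segments, in which an unbuffered line doubles in width at each level toward the root, while a buffered line keeps the leaf-level width at every level. Buffer input capacitance is ignored, and all names and values are hypothetical.

```python
def total_interconnect_cap(depth, buffered, leaf_cap_ff=1):
    """Total clock-line capacitance (fF) of a binary tree with `depth` levels."""
    total = 0
    for level in range(depth):  # level 0 = segments nearest the leaves
        segments = 2 ** (depth - 1 - level)  # one segment at the root level
        if buffered:
            # Assumed: buffers at each branch point allow leaf-level width.
            cap_per_segment = leaf_cap_ff
        else:
            # Assumed: width doubles at each level toward the root.
            cap_per_segment = leaf_cap_ff * 2 ** level
        total += segments * cap_per_segment
    return total


unbuffered = total_interconnect_cap(depth=5, buffered=False)
with_buffers = total_interconnect_cap(depth=5, buffered=True)
print(unbuffered, with_buffers)  # 80 31
```

In this simplified model the buffered tree carries less than half the interconnect capacitance of the unbuffered tree; an actual design would also have to account for the capacitance and delay of the buffers themselves.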
One prior art alternative to generating clock signals centrally and distributing them about the chip is to partition the chip design into blocks, as shown in FIG. 2. A synchronous clock signal is used only within a single block, while communication between different blocks is performed on an asynchronous basis. See, T. Meincke et al., “Globally asynchronous locally synchronous architecture for large high-performance ASICs,” IEEE Symposium On Circuits and Systems, Orlando, Fla., 30 May-2 June, Vol. 2, pp. 512-515, 1999.
In the past, clock design has not typically been considered within the context of full-chip timing. Existing design methodologies typically treat clock skew as a problem to be eliminated, and most designers strive to achieve zero skew. However, producing clock signals with zero skew may not be the optimum way to achieve either the safest or the highest-performance clock design. It is often the case that, even after zero skew is attained, chip failures are caused by simultaneous switching current or other timing-related problems.
There remains a need for a method of coordinating the timing of clock and data signals on a chip that can be achieved with a simple design and a minimum number of critical paths on the chip. It would be desirable to reduce the power consumption associated with clock-distribution lines or other chip timing circuitry. It would also be desirable to reduce the sensitivity of chip timing to process variations and to various sources of intermittent noise. Finally, there is an ongoing need for the development of higher-speed methods for clocking data to provide enhanced chip performance.