Very large integrated circuits often operate synchronously with a clocking signal that acts as a timing reference. A great variety of devices operate in this manner. Perhaps most notable within this class of circuits are microprocessors and other data processing devices, which can operate at frequencies up to 100 MHz. Future generations of processors are expected to approach astonishing speeds, e.g., 500 MHz to greater than 1 GHz.
In such circuits the clocking signal must be coupled to each of the functional blocks distributed about the semiconductor chip. This means that integrated circuits operating synchronously, such as microprocessors, need a network that distributes the clock signal across the chip. In a typical microprocessor, for example, the clock signal is often generated internally on the chip from an external signal that provides a reference frequency input. The external clock signal is commonly derived from a crystal resonator circuit. The internally generated reference clock signal is then coupled to the various functional units or logic clusters of the microprocessor. Synchronous logic functions obviously imply the need for some sort of clock distribution network.
As operating frequencies for very large integrated circuits such as microprocessors increase, the problem of how to effectively synchronize the clock signal across the chip becomes more difficult to solve. The reason is that a normal clock signal distribution network introduces different delays to the clock signal in different branches of the network (i.e., clock skew). The factors causing clock skew include electromagnetic propagation delays (RCL), buffer delays in the distribution network, and resistive-capacitive (RC) delays associated with the various distribution lines that make up the entire distribution network. In addition, clock skew can vary across the surface of the semiconductor die due to variations in the manufacturing process, temperature gradients, power supply variations, and differing load capacitance.
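As a rough illustration of how these delay contributions add up to skew, the following minimal sketch estimates the clock arrival time on two branches and takes the difference. It is not taken from any particular design: the branch names, buffer delays, resistances, and capacitances are all hypothetical values chosen for illustration, and the wire delay uses the common Elmore approximation for a distributed RC line driving a lumped load, which ignores the inductive (RCL) effects mentioned above.

```python
def wire_delay(r_wire, c_wire, c_load):
    """Elmore delay estimate (seconds) for a distributed RC line
    with total resistance r_wire (ohms) and total capacitance c_wire
    (farads), driving a lumped load capacitance c_load (farads)."""
    return 0.5 * r_wire * c_wire + r_wire * c_load


def branch_delay(buffer_delays, r_wire, c_wire, c_load):
    """Total delay of one branch: buffer delays plus the RC wire delay."""
    return sum(buffer_delays) + wire_delay(r_wire, c_wire, c_load)


def clock_skew(arrival_times):
    """Skew is the spread between the latest and earliest clock arrival."""
    return max(arrival_times) - min(arrival_times)


# Hypothetical branches: two 100 ps buffers each, but different
# interconnect resistance, interconnect capacitance, and gate load.
branches = {
    "branch_a": branch_delay([100e-12, 100e-12], 50.0, 1.0e-12, 0.5e-12),
    "branch_b": branch_delay([100e-12, 100e-12], 120.0, 2.4e-12, 0.8e-12),
}

skew = clock_skew(branches.values())
period = 1.0 / 500e6  # clock period at 500 MHz, in seconds

print(f"skew   = {skew * 1e12:.1f} ps")
print(f"period = {period * 1e12:.1f} ps")
print(f"skew / period = {skew / period:.1%}")
```

Under these assumed values the two branches share identical buffers, so the entire skew comes from the differing interconnect and load. This is the kind of mismatch that consumes a growing fraction of the clock period as the operating frequency rises.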
To give a better idea of the monumental task facing circuit designers and computer architects, future generations of microprocessors are targeted to operate at frequencies of 500 MHz and greater. At these extraordinary frequencies, the clock signal must still be coupled to more than ten million transistors distributed about a semiconductor die having an area of approximately 650 mils².
One of the major difficulties in distributing a high-speed clock signal globally across a spatially huge microprocessor chip is the problem of logic gate loading. In the past, various techniques have been proposed to eliminate clock skew within a clock signal distribution network. These approaches have generally included the use of a chain of isolation buffers that try to drive the load capacitance of the logic gates without delay. The prior art includes numerous examples of different clock distribution networks designed to achieve low clock skew across a large chip. For example, U.S. Pat. Nos. 5,289,866; 5,307,381; 5,339,253; 5,361,277; 5,376,842; 5,397,943; and 5,398,262 describe clock distribution networks and circuitry all sharing the common goal of reduced clock skew in a very large scale integrated circuit such as a microprocessor.
As will be seen, the present invention provides a method and apparatus for clock signal distribution that is ideally suited for a high-performance, high-frequency data processing device. The invention enables a high-frequency clock (e.g., 500 MHz or higher) to be distributed in a high-performance circuit such as a microprocessor with a minimum amount of skew relative to a global system clock. The invention also minimizes the amount of skew variability in the clock distribution network arising from die interconnect resistance, interconnect capacitance, interconnect inductance, and transistor parameter variability across the die. Furthermore, the invention reduces sources of phase jitter at the clock distribution end points.