1. Field of the Invention
The present invention relates to circuitry for generating clock signals for a digital system. More specifically, the present invention relates to a method and an apparatus for generating and distributing a clock signal between components within an integrated circuit with substantially minimal clock skew.
2. Related Art
Synchronous systems, such as computers, rely upon a clock signal to maintain control of data transfers between system components. Typically, the clock signal is generated at a single source and is distributed through chains of inverters of equal length to the individual latches. It is important that the clock signal arrives at each data latch at nearly the same time, so that operations that take place in one part of a circuit are properly synchronized with operations in other parts of the circuit.
However, it is impossible to match exactly the delay of all paths from the source of the clock signal to the individual latches. Cross-die processing variations and imprecision in the alignment of the fabrication equipment make this impossible. To complicate matters, die sizes are becoming larger, resulting in greater die variations and longer inverter chains, which result in greater path disparities.
As clock speeds increase, these disparities consume an increasingly large fraction of the clock period. The disparity in the arrival time of a clock signal at a latch is called "skew". Note that skew causes uncertainty as to the time at which data is latched. Furthermore, note that calculations cannot be performed during periods when it is not certain that the data is valid. As clock speeds increase, the latch skew remains approximately constant. Hence, a smaller fraction of the clock period can be used for calculations. Note that as processor clock speeds increase, clock skews are beginning to approach the size of clock periods.
Clock skew can be compensated for by adding a timing margin to the clock cycle time. However, this added timing margin can become a significant fraction of the clock period, and can hence limit system performance.
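The effect of a roughly constant skew on a shrinking clock period can be illustrated with simple arithmetic. The following is a minimal sketch; the function name `usable_fraction` and the 100 ps skew value are assumptions chosen for illustration only and are not taken from any measured design.

```python
def usable_fraction(clock_hz, skew_s):
    """Fraction of one clock period left after reserving a skew margin."""
    period_s = 1.0 / clock_hz
    # If the skew margin exceeds the period, no time remains for computation.
    return max(0.0, (period_s - skew_s) / period_s)


# Assumed, illustrative skew that stays constant as the clock speeds up.
ASSUMED_SKEW_S = 100e-12

for clock_hz in (100e6, 1e9, 5e9):
    fraction = usable_fraction(clock_hz, ASSUMED_SKEW_S)
    print(f"{clock_hz / 1e6:8.0f} MHz: usable fraction = {fraction:.2f}")
```

Under these assumed numbers, the same skew margin that is negligible at 100 MHz takes up half of the period at 5 GHz.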
One way to deal with this problem is to divide an integrated circuit into multiple clock domains that operate somewhat independently from each other. However, dividing an integrated circuit into multiple clock domains creates problems in synchronizing communications between the different clock domains.
What is needed is a method and an apparatus for generating and distributing a clock signal between components within a semiconductor chip so that circuit elements at different locations on the semiconductor chip remain properly synchronized at high clock speeds.
One embodiment of the present invention provides a system that generates a clock signal within an integrated circuit. This system includes four clocking elements, wherein each clocking element includes at least one input and at least one output, and wherein a signal at an input is complemented at a corresponding output. These clocking elements are spatially distributed throughout the integrated circuit, so that each clocking element provides the clock signal to a different region of the integrated circuit. These clocking elements are also coupled together through a plurality of interconnections, so that each output of each clocking element is coupled to at least one input of a neighboring clocking element. Furthermore, a given signal is inverted an odd number of times in traversing a closed path beginning and ending at any output of any of the four clocking elements and passing through a neighboring clocking element.
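The significance of an odd number of inversions around a closed path can be illustrated with a generic ring of ideal inverters. The sketch below is not a model of the four-element topology described above; it is a simplified, assumed abstraction (the function name `has_stable_state` is hypothetical) that enumerates every logic state of a ring and checks whether any state is self-consistent. A ring with no self-consistent state cannot settle and therefore must keep switching.

```python
from itertools import product


def has_stable_state(num_inversions):
    """Return True if a closed ring of ideal inverters can rest in some state.

    Stage i drives stage i + 1 (wrapping around), so a resting state requires
    every stage output to be the complement of the output that drives it.
    """
    n = num_inversions
    for state in product((0, 1), repeat=n):
        if all(state[(i + 1) % n] == 1 - state[i] for i in range(n)):
            return True
    return False


for n in range(1, 7):
    verdict = "can settle" if has_stable_state(n) else "must oscillate"
    print(f"{n} inversions around the loop: {verdict}")
```

In this simplified model, only loops with an even inversion count admit a resting state, which is consistent with requiring an odd inversion count on the closed paths of an oscillator.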
In one embodiment of the present invention, each of the four clocking elements includes an even number of inverters and a keeper circuit.
In one embodiment of the present invention, each of the four clocking elements contains a first node and a second node that are coupled together by a keeper circuit. Each clocking element also includes a first pair of inverters, each of which has an output coupled to the first node, and a second pair of inverters, each of which has an output coupled to the second node.
In one embodiment of the present invention, the keeper circuit includes a pair of cross-tied inverters coupled to the first node and the second node.
In one embodiment of the present invention, the first node and the second node of each of the four clocking elements comprise eight nodes that oscillate at the same frequency and are grouped into four synchronized pairs that are offset from each other in phase by 90 degrees, whereby the eight nodes provide a multi-phase clock signal.
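The 90 degree spacing between the four synchronized pairs corresponds to a quarter of the clock period per pair. The following minimal sketch tabulates the ideal phase and time offset of each pair; the function name `pair_offsets`, the pair numbering, and the 1 ns period are assumptions for illustration, and the sketch does not state which physical nodes belong to which pair.

```python
def pair_offsets(period_s, num_pairs=4):
    """Ideal (phase in degrees, delay in seconds) for evenly spaced pairs."""
    step_deg = 360.0 / num_pairs
    return [(k * step_deg, period_s * k / num_pairs) for k in range(num_pairs)]


# Assumed, illustrative clock period of 1 ns.
for index, (phase_deg, delay_s) in enumerate(pair_offsets(1e-9)):
    print(f"pair {index}: {phase_deg:5.1f} degrees, {delay_s * 1e12:6.1f} ps")
```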
In one embodiment of the present invention, the system also includes a controllable voltage source coupled to the four clocking elements, whereby varying a voltage provided by the controllable voltage source varies an oscillation frequency of the four clocking elements.
In one embodiment of the present invention, the system also includes a plurality of clock distribution networks within the integrated circuit, wherein each clock distribution network distributes the clock signal from a clocking element to circuit elements within an associated region of the integrated circuit.
In one embodiment of the present invention, an output of each clocking element is coupled to at least one input of each of two neighboring clocking elements. In a variation on this embodiment, the four clocking elements comprise a given ring within a higher level ring of rings, wherein at least one input of a first clocking element in the given ring is coupled to an output of a second clocking element in a first neighboring ring, and wherein at least one output of a third clocking element in the given ring is coupled to an input of a fourth clocking element in a second neighboring ring.
Note that a clock signal in a conventional clock distribution system is generated from a single source, whereas the present invention generates a clock signal through the interaction of a large number of clocking elements distributed across the semiconductor die. Furthermore, note that a conventional clock distribution scheme is an open loop system. Hence, once the clock signal is generated, it is propagated to the latches without compensation for die variations or transistor variations along the chain of inverters to the individual latches. In contrast, the present invention provides a closed loop system that adjusts to the actual conditions on the semiconductor die.
Furthermore, the maximum phase error in the present invention scales in proportion to the deviation from the average of the fastest and slowest transistors on the chip, whereas in a traditional clock tree, the error scales with the difference of speed between the fastest and slowest transistors.
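The two scaling measures compared above can be contrasted numerically. The sketch below assumes one reading of the comparison: the first measure is the largest deviation of any element from the average, and the second is the full spread between the fastest and slowest elements. The function names and the delay values (in picoseconds) are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean


def max_deviation_from_average(delays):
    """Largest distance of any delay from the average delay."""
    average = mean(delays)
    return max(abs(d - average) for d in delays)


def fastest_to_slowest_spread(delays):
    """Difference between the slowest and the fastest delay."""
    return max(delays) - min(delays)


# Assumed, illustrative per-element delays in picoseconds.
delays_ps = [90, 100, 100, 110]
print("max deviation from average:", max_deviation_from_average(delays_ps))
print("fastest-to-slowest spread: ", fastest_to_slowest_spread(delays_ps))
```

For any set of delays, the largest deviation from the average cannot exceed the fastest-to-slowest spread, and for a symmetric distribution such as the assumed one it is half of the spread.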
Moreover, unexpected transistor strengths and loads cause extra or reduced voltage swings, but have little effect on phase. Hence, these unexpected delays have a second order effect in the present invention, whereas in a traditional clock tree, these unexpected delays have a first order effect on delay.