Virtually every circuit design in modern electronic systems involves some type of data transfer between integrated circuit (IC) chips, possibly located on separate printed circuit boards. For digital systems, the data is typically transferred at the transitions of a clock signal from a register or flipflop on a sending IC to a similar device on a receiving IC. For error-free transmissions, the clock signals appearing at each register should operate in-phase with each other and have equal edge transition rates. Such clock signals may originate from a common source which is buffered through one or more clock trees for distributing like clock signals to the flipflops and registers disposed throughout the ICs. The clock distribution tree may comprise a plurality of buffers wherein the output of one buffer drives the input of several others such that the clock signal is distributed evenly throughout. In a clock distribution tree, each buffer drives a maximum of ten to twelve other buffers, flipflops or registers, thereby avoiding the stress and poor performance of overloading the source circuit. In addition, each signal path through the clock distribution tree is made the same length in terms of propagation delay for providing the plurality of output clock signals operating in-phase with equal edge transition rates. The sending and receiving ICs each typically house a dedicated internal clock distribution tree for supplying the clock signals to its respective flipflops. Therefore, the output clock signals of the clock distribution tree of the sending IC should also operate in-phase with, and have substantially the same edge transition rates as, the output clock signals of the clock distribution tree of the receiving IC. Unfortunately, there is often timing skew between the output clock signals of the clock distribution trees of the sending and receiving ICs.
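By way of illustration only, the relationship between the number of loads, the fan-out limit and the depth of a clock distribution tree may be sketched as follows. The function names, the uniform per-buffer delay of 0.5 nanoseconds and the load counts are assumptions chosen for the example and are not taken from any particular design.

```python
def buffer_layers(num_loads: int, max_fanout: int = 12) -> int:
    """Smallest number of buffer layers such that no buffer drives more
    than max_fanout loads (buffers, flipflops or registers)."""
    if num_loads < 1 or max_fanout < 2:
        raise ValueError("need at least one load and a fan-out of two or more")
    layers = 1                 # a single root buffer
    reachable = max_fanout     # loads that one layer can drive
    while reachable < num_loads:
        reachable *= max_fanout  # each added layer multiplies the reach
        layers += 1
    return layers


def insertion_delay_ns(layers: int, buffer_delay_ns: float) -> float:
    """Delay through the tree when every path crosses the same number of
    layers (assumes one uniform, hypothetical delay per buffer)."""
    return layers * buffer_delay_ns


if __name__ == "__main__":
    small_tree = buffer_layers(100)    # e.g. a sending IC with few registers
    large_tree = buffer_layers(2000)   # e.g. a receiving IC with many registers
    skew_ns = (insertion_delay_ns(large_tree, 0.5)
               - insertion_delay_ns(small_tree, 0.5))
    print(small_tree, large_tree, skew_ns)
```

Under these assumed numbers the larger tree needs two more buffer layers than the smaller one, and the difference in insertion delay appears directly as skew between the two sets of output clock signals.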
In order to latch the incoming data, it is important to maintain the correct timing relationship between the arrival of the data signal at the D-input of the receiving flipflop and the transition of the clock signal through the clock distribution tree. If the clock transition occurs before the data becomes valid, a setup time problem exists. Alternatively, a hold time problem occurs if the data is no longer valid when the clock signal arrives. Such timing considerations are especially important for systems operating at a high data rate, say 50 megahertz and above. One of the principal causes of the noted setup and hold time problems is the timing skew between the clock signals from a first clock distribution tree on the sending IC and the clock signals from a second clock distribution tree on the receiving IC. Depending upon the number of buffer layers within the first and second clock distribution trees and the associated insertion delay, it is possible to introduce appreciable skew between the output clock signals thereof. For example, the first clock distribution tree in the sending IC may have only two or three buffer layers for driving a small number of registers, while the second clock distribution tree in the receiving IC may include four or five buffer layers for driving many more flipflops and registers. The delay through the second clock distribution tree is therefore longer than that through the first clock distribution tree, producing the aforedescribed timing skew and possibly causing data transmission errors from the phase difference between the output clock signals of the clock distribution trees of the sending and receiving ICs.
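The setup and hold conditions described above may be expressed as slack calculations. The following is a minimal sketch under a simplified timing model; the class name, the field names and all numeric values are hypothetical and serve only to show how a longer receiving clock tree can turn a positive hold margin into a violation.

```python
from dataclasses import dataclass


@dataclass
class TransferTiming:
    """Simplified chip-to-chip transfer; all times in nanoseconds."""
    period_ns: float            # 20.0 corresponds to 50 megahertz
    send_tree_delay_ns: float   # insertion delay of the sending clock tree
    recv_tree_delay_ns: float   # insertion delay of the receiving clock tree
    clk_to_q_ns: float          # clock-to-output delay of the sending flipflop
    path_delay_ns: float        # interconnect delay between the ICs
    setup_ns: float             # setup requirement of the receiving flipflop
    hold_ns: float              # hold requirement of the receiving flipflop

    def data_arrival_ns(self) -> float:
        # Time at which newly launched data reaches the receiving D-input.
        return self.send_tree_delay_ns + self.clk_to_q_ns + self.path_delay_ns

    def setup_slack_ns(self) -> float:
        # Data must be valid setup_ns before the NEXT receiving clock edge.
        capture_edge = self.recv_tree_delay_ns + self.period_ns
        return capture_edge - self.setup_ns - self.data_arrival_ns()

    def hold_slack_ns(self) -> float:
        # New data must not arrive until hold_ns after the CURRENT
        # receiving clock edge, or the previous data is lost too early.
        current_edge = self.recv_tree_delay_ns
        return self.data_arrival_ns() - (current_edge + self.hold_ns)


if __name__ == "__main__":
    matched = TransferTiming(20.0, 1.0, 1.0, 1.0, 0.5, 1.0, 0.5)
    skewed = TransferTiming(20.0, 1.0, 3.0, 1.0, 0.5, 1.0, 0.5)
    for case in (matched, skewed):
        print(case.setup_slack_ns(), case.hold_slack_ns())
```

A negative slack indicates a violation. In the assumed numbers, the only difference between the two cases is the extra two nanoseconds of insertion delay in the receiving clock tree.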
One known solution to the problem of timing skew between the clock distribution trees of sending and receiving ICs is the use of a phase lock loop to compensate for the differences in respective delays. In general, a primary clock signal is applied through a selectable delay line to the input of the first clock distribution tree of the sending IC. The primary clock signal is also applied at the first input of a phase detector while the second input of the phase detector is coupled to one output of the clock distribution tree. The output of the phase detector controls the selectable delay line such that a known phase relationship is established between the primary clock signal and the output clock signals of the clock distribution tree. A similar phase lock loop is provided for the clock distribution tree of the receiving IC for establishing a similar phase relationship between the primary clock signal and its output clock signals thereby maintaining the clock signals in the sending and receiving ICs substantially in-phase.
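The loop described above may be modeled behaviorally as follows. This is only a sketch under stated assumptions: times are integer picoseconds, the delay line is assumed to have 64 taps of 400 picoseconds each, the loop is assumed to start from a mid-range tap, and the loop is declared locked once the phase error is within half a tap step. None of these values or names comes from an actual circuit.

```python
def phase_error_ps(total_delay_ps: int, period_ps: int) -> int:
    """Signed phase of the tree output relative to the primary clock,
    folded into [-period/2, period/2). Positive means the output is late."""
    half = period_ps // 2
    return (total_delay_ps + half) % period_ps - half


def lock_delay_line(period_ps: int, tree_delay_ps: int, step_ps: int,
                    num_taps: int, start_tap: int) -> tuple[int, int]:
    """Step the selectable delay line one tap per comparison until the
    tree output is within half a tap step of the primary clock.
    Returns (selected tap, residual phase error in picoseconds)."""
    tap = start_tap
    for _ in range(num_taps):
        total = tap * step_ps + tree_delay_ps   # delay line plus clock tree
        err = phase_error_ps(total, period_ps)
        if 2 * abs(err) <= step_ps:             # phase detector sees "in phase"
            return tap, err
        tap += -1 if err > 0 else 1             # late: less delay; early: more
        if not 0 <= tap < num_taps:
            raise RuntimeError("delay line range exhausted before lock")
    raise RuntimeError("loop did not lock")


if __name__ == "__main__":
    # Hypothetical sending and receiving ICs sharing a 50 megahertz primary clock.
    sending = lock_delay_line(20_000, 3_400, 400, 64, 32)
    receiving = lock_delay_line(20_000, 5_000, 400, 64, 32)
    print(sending, receiving)
```

In this example the two loops settle on different taps because the trees have different insertion delays, yet both tree outputs end up with the same small residual error relative to the primary clock signal, which is the sense in which the sending and receiving ICs are held substantially in-phase.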
The phase comparison and correction of most if not all such phase lock loops is completed in one period of the primary clock signal; hence, the timing of the signal propagations becomes very important. Indeed, the maximum operating frequency of the phase lock loop is restricted by the time required to perform the phase comparison and correction in one period of the primary clock signal. Consequently, the phase lock loop as described above is typically implemented in known environments wherein the transistor sizes are variable and may be selected according to predetermined design parameters. For such dedicated uses, the size of the clock tree and the length of the delay lines are also known ahead of time, simplifying the design. Furthermore, the components of the conventional phase lock loop are often grouped together and centrally disposed on the IC for minimizing propagation delay and timing skew, although at the expense of consuming large portions of prime area on the IC.
In the world of gate arrays as used on ASIC (Application Specific Integrated Circuit) type integrated circuits, the designer must deal with fixed transistor geometries of say 48/4 microns width and length as part of standard core cells. The advantage of gate arrays in simplifying the design process is well known in the art. Yet, the fixed transistor sizes associated therewith make the implementation of the phase lock loop much more difficult in that the designer may no longer manipulate delay and drive parameters by adjusting individual transistor geometries. Indeed, most if not all known gate array libraries fail to provide a phase lock loop macro with fixed geometry transistors for maintaining synchronized clock signals throughout the system.
A phase detector is one of the components of the phase lock loop. Again, conventional phase detectors often rely upon customization of the transistor geometries to achieve the desired drive levels and propagation delays therethrough. Such transistor personalizing is impractical in a gate array since the phase detector is constructed with standard transistor cells having fixed geometries. Furthermore, many phase detectors compare the phase differential of the input signals and issue one or more control signals for performing the phase correction during one period of the primary clock signal. The time required to update the selectable delay line in response to the control signals in preparation for the next phase comparison, all during one period of the primary clock signal, limits the maximum operating frequency. Another problem for conventional phase detectors is their tendency toward providing phase correction regardless of the input phase differential.
A selectable delay line is another component of the aforedescribed phase lock loop. The conventional delay line may comprise a string of serially coupled inverters with tap points at every other inverter for providing an output signal operating in-phase with the primary clock signal yet delayed by a selectable amount. The resolution is thus two inverter delays. Since the minimum phase differential achievable with the phase lock loop is directly related to the resolution of the delay line, a higher resolution (lower incremental delays) allows the output signals of the clock distribution tree to be positioned closer to the phase of the primary clock signal. This increases the resolution of the phase lock loop and reduces the phase error between the output clock signals of the clock distribution trees on the sending and receiving ICs. In the previous data transfer example, a smaller nominal phase difference between the output signals of clock distribution trees of the sending and receiving ICs reduces the chance of data transmission errors and allows higher data rates. Thus, it is desirable to provide smaller incremental delays through the selectable delay line than the conventional two-inverter steps.
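The dependence of the minimum phase error on the delay line resolution may be illustrated numerically. The sketch below assumes a hypothetical inverter delay of 200 picoseconds, so that the conventional two-inverter tap spacing is 400 picoseconds, and compares it against an assumed finer spacing of 200 picoseconds; the function names are likewise illustrative.

```python
def nearest_tap(target_ps: int, step_ps: int) -> int:
    """Tap whose delay is closest to the wanted delay (integer rounding)."""
    return (target_ps + step_ps // 2) // step_ps


def residual_ps(target_ps: int, step_ps: int) -> int:
    """Phase error left over after selecting the nearest tap."""
    return abs(target_ps - nearest_tap(target_ps, step_ps) * step_ps)


def worst_case_residual_ps(step_ps: int, span_ps: int) -> int:
    """Largest residual over every wanted delay from 0 to span_ps."""
    return max(residual_ps(t, step_ps) for t in range(span_ps + 1))


if __name__ == "__main__":
    two_inverter_step = 400   # assumed: 2 inverters x 200 ps each
    finer_step = 200          # assumed finer increment, for comparison
    print(worst_case_residual_ps(two_inverter_step, 20_000))
    print(worst_case_residual_ps(finer_step, 20_000))
```

In this model the worst-case residual is half of the tap spacing, so halving the incremental delay halves the largest phase error that the loop may leave uncorrected.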
In many circuit designs such as the selectable delay line of the phase lock loop, it is desirable to generate opposite phase clock signals operating at the same frequency with substantially 180 degrees phase difference. The common approach is to split an input clock between first and second conduction paths; one having an even number of inverters and the other having an odd number of inverters. The transistors through the first and second conduction paths are custom sized for providing equal propagation delays through the first and second conduction paths. Yet in gate arrays, the luxury of personalizing transistor geometries is impractical because the available standard gate array cells have fixed transistor sizes. Another approach is needed for generating the opposite phase clock signals with equal edge transition rates from fixed geometry transistors.
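The difficulty with fixed transistor sizes may be seen from a simple calculation. The sketch below assumes that every inverter has the same hypothetical delay of 200 picoseconds, as would be the case when no individual sizing is possible, and computes how far the two paths depart from an ideal 180 degree relationship; the function name and values are illustrative only.

```python
def opposite_phase_error(even_inverters: int, odd_inverters: int,
                         inverter_delay_ps: int, period_ps: int):
    """Return (skew in picoseconds, skew in degrees) between the inverted
    and non-inverted clock paths when every inverter has the same delay.
    Zero skew would mean exactly 180 degrees between the two outputs."""
    if even_inverters % 2 != 0 or odd_inverters % 2 != 1:
        raise ValueError("paths must have even and odd inverter counts")
    skew_ps = (odd_inverters - even_inverters) * inverter_delay_ps
    skew_deg = 360.0 * skew_ps / period_ps
    return skew_ps, skew_deg


if __name__ == "__main__":
    # Two inverters in one path, three in the other, 50 megahertz clock.
    print(opposite_phase_error(2, 3, 200, 20_000))
```

Because an even count and an odd count can never be equal, identical fixed-delay inverters always leave at least one inverter delay of mismatch between the two paths, which is why the conventional approach depends on custom sizing.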
Hence, what is needed is an improved phase lock loop on a gate array using macros with standard geometry transistors for providing a known phase relationship between the primary clock signal and the output clock signals of the clock distribution tree on the ASIC chip.