1. Field of the Invention
The present invention relates to a CMOS clock recovery circuit for integrated circuit applications. More particularly, the invention concerns a CMOS clock recovery circuit that establishes and maintains proper alignment between data and clock signals at high frequencies.
2. Description of the Related Art
Metal-oxide semiconductor (MOS) technology is virtually the standard for digital circuits used in telecommunications, data communications, and computers. Increasingly, CMOS (complementary MOS) technology is used in these applications. CMOS technology incorporates both n-channel MOS and p-channel MOS transistors in the same monolithic structure. High speed and low power consumption are desired in many CMOS circuit applications.
In high speed CMOS circuits, there is often a need to maintain proper alignment between input data and the clock, which are also referred to as input data signals and clock signals. It is known that proper alignment between data and the clock can be achieved by temporarily either speeding up or slowing down the clock. "A Self Correcting Clock Recovery Circuit", Journal of Lightwave Technology, Vol. LT-3, No. 6, December 1985, by Charles R. Hogge, Jr., which is incorporated herein by reference, discloses circuits referred to as "clock recovery circuits". These circuits automatically maintain a desired alignment between input data and the clock by automatically adjusting the voltage connected to the input of a voltage controlled oscillator, which outputs the clock signal.
FIG. 1 illustrates pertinent portions of a clock recovery circuit 100 disclosed by Hogge. FIG. 7 is an idealized timing diagram that illustrates the operation of the circuit of FIG. 1, but without indicating the effects of propagation delays and other similar delays. Data di and clock cky are inputted into a first flip flop 103. The output of the first flip flop is a1y, which is a delayed version of di. a1y and not-clock ckby are inputted into a second flip flop 105. The output of the second flip flop is a2x, which is a further delayed version of di. Data di and the output of the first flip flop a1y are inputted into a first exclusive-or (x-or) gate 110, and the output of the first flip flop a1y and the output of the second flip flop a2x are inputted into a second x-or gate 115.
The output 1x from the first x-or gate is a stream of variable width pulses, with the leading edges of the pulses coinciding with each positive and negative transition of the data di. The width of each output pulse 1x is determined by the position of the clock pulses cky relative to the data pulses di. The output from the second x-or gate is a stream of fixed width pulses 2x having 50% duty cycles, with the leading edges of these pulses coinciding with the trailing edges of the output pulses 1x from the output of the first x-or gate.
It is desired to have the transitions of the data pulses di coincide with the trailing edges of the clock pulses cky and the leading edges of the not-clock pulses ckby. If the trailing edges of the clock pulses cky are to the left of the transitions of the data, then the pulses cky and ckby are advanced relative to the data, as is the case in FIG. 7. When the clock pulses are advanced, the width of the pulses 1x from the output of the first x-or gate is reduced to less than a 50% duty cycle, with the extent of the reduction of the width of the pulses being proportional to the extent that the clock is advanced relative to the data. If the trailing edges of the clock pulses cky are to the right of the transitions of the data, then the pulses cky and ckby are retarded relative to the data. When the clock is retarded, the width of the pulses 1x from the output of the first x-or gate is increased to greater than a 50% duty cycle, with the extent of the increase in the width of the pulses being proportional to the extent that the clock is retarded relative to the data. When the clock has the desired alignment with the data, the width of the variable width pulses 1x from the output of the first x-or gate is identical to the width of the fixed width pulses 2x from the output of the second x-or gate, except for jitter errors.
With additional circuitry not shown in FIG. 1, the average of the output of the second x-or gate is subtracted from the average of the output of the first x-or gate to produce a difference signal that is used to control the frequency of a voltage controlled oscillator (VCO). The output of the oscillator is the clock cky. Thus, the circuit automatically varies the frequency of the VCO in order to establish and maintain the desired alignment between the clock and the data.
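The behavior described above can be illustrated with a short idealized simulation. The following is only a sketch and is not taken from Hogge or from FIG. 1: it assumes zero propagation delays, a clock period equal to one bit period, a 50% duty cycle clock, and an initial state in which both flip flops hold the first data bit. The names hogge_phase_detector, average_difference, samples_per_bit, and clock_offset are hypothetical and chosen for illustration only.

```python
def hogge_phase_detector(bits, samples_per_bit=100, clock_offset=0):
    """Idealized model of the two flip flops and two x-or gates.

    clock_offset is in samples: 0 puts the trailing (falling) edge of cky
    on the data transitions, negative values advance the clock, positive
    values retard it. samples_per_bit is assumed to be even, and
    abs(clock_offset) is assumed to be smaller than samples_per_bit // 2.
    Returns the sampled outputs (x1, x2) of the first and second x-or gates.
    """
    half = samples_per_bit // 2

    def cky(t):
        # Clock is low for the first half of its period, high for the second,
        # so it falls at k*N + offset and rises half a bit period later.
        return 1 if (t - clock_offset) % samples_per_bit >= half else 0

    a1y = a2x = bits[0]  # assumed initial flip flop states
    x1, x2 = [], []
    for t in range(len(bits) * samples_per_bit):
        di = bits[t // samples_per_bit]
        prev, now = cky(t - 1), cky(t)
        if prev == 0 and now == 1:
            a1y = di    # first flip flop samples di on the rising edge of cky
        elif prev == 1 and now == 0:
            a2x = a1y   # second flip flop samples a1y on the rising edge of ckby
        x1.append(di ^ a1y)    # variable width pulses
        x2.append(a1y ^ a2x)   # fixed width pulses
    return x1, x2


def average_difference(x1, x2):
    """Average of the first x-or output minus average of the second."""
    return (sum(x1) - sum(x2)) / len(x1)


if __name__ == "__main__":
    pattern = [0, 1, 0, 0, 1, 1, 0, 1, 1, 1]  # arbitrary example data
    for offset in (-10, 0, 10):
        x1, x2 = hogge_phase_detector(pattern, 100, offset)
        print(offset, sum(x1), sum(x2), average_difference(x1, x2))
```

In this model the fixed width pulses always last half a bit period, while the variable width pulses shrink when the clock is advanced and grow when it is retarded, so the sign of the averaged difference indicates the direction in which the VCO frequency should be adjusted.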
FIG. 3 is a schematic diagram of a CMOS differential latch 300. FIG. 4 is a schematic diagram of a CMOS differential flip flop 400. This discussion applies to single-ended, as well as differential, latches and flip flops. It can be seen that the flip flop is essentially two interconnected latches, which can be referred to as a master latch and a slave latch. Because this type of flip flop consists of a master latch and a slave latch, this flip flop can be referred to as a master-slave flip flop.
As illustrated in FIG. 2, the first flip flop 103 of the circuit 100 of FIG. 1 consists of first master latch 200 and first slave latch 205, and the second flip flop 105 consists of second master latch 210 and second slave latch 215. As can be seen in FIG. 8, when clock cky goes from low to high, the output a1y of the first slave latch, which is the output of the first flip flop, does not change until after the "clock to Q" delay through the first master latch and the first slave latch, referred to as ck→Q. FIG. 8 also shows that when the clock cky is high, the output w2bx2 of the second master latch 210 does not change when the input a1y to the second master latch changes until after the propagation delay through the second master latch, referred to as pd3. When the not-clock pulse ckby goes from low to high, the output of the second master latch w2bx2 is transferred to the output of the second slave latch. The signal output of the second slave latch is designated a2x. In order for the signal w2bx2 at the input of the second slave latch to be accurately transferred to the output of the second slave latch when ckby goes from low to high, the signal w2bx2 must have been present at the input to the second slave latch for a minimum period of time, referred to as the required setup time Rsetup for the second slave latch. The actual setup time, which is referred to as setup4, is the time between the end of the propagation delay pd3 and the instant that ckby switches from low to high. If setup4 is less than Rsetup, w2bx2 may not be accurately transferred to the output of the second slave latch.
To summarize, when cky goes from low to high, the signal at the input to the first master latch, which is the input to the first flip flop, will be transferred to the output of the first slave latch, which is the output of the first flip flop, after a delay of ck→Q, which is the delay through the first flip flop. Also, when the clock cky is high, the signal a1y at the input to the second master latch will be transferred to the output of the second master latch, after the propagation delay through the second master latch pd3. The signal w2bx2 at the output of the second master latch must be present for at least the required setup time Rsetup, in order for the signal w2bx2 to be accurately transferred to the output of the second slave latch when ckby goes from low to high. Thus, in order for the data to be accurately transferred from the input of the first flip flop to the output of the second flip flop so that the circuit will function properly, the time between when the clock cky goes from low to high and when the not-clock ckby goes from low to high must be a minimum of ck→Q+pd3+Rsetup. It follows that the maximum frequency of operation for this circuit is approximately:
$$\frac{1}{2\,(\text{ck}{\rightarrow}\text{Q} + \text{pd3} + \text{Rsetup})}$$
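The timing budget summarized above can be expressed as a small calculation. The following is only a sketch: it assumes a 50% duty cycle clock with ckby as the exact complement of cky, so that the interval from a rising edge of cky to the next rising edge of ckby is one half of the clock period. The function names and the delay values used in the example (200 ps clock-to-Q delay, 200 ps pd3, 100 ps Rsetup) are hypothetical and are not taken from any particular fabrication process.

```python
def max_frequency_ghz(ck_to_q_ps, pd3_ps, rsetup_ps):
    """Approximate maximum clock frequency in GHz.

    The half period must cover the clock-to-Q delay of the first flip flop,
    the propagation delay pd3 of the second master latch, and the required
    setup time Rsetup of the second slave latch (all in picoseconds).
    """
    half_period_ps = ck_to_q_ps + pd3_ps + rsetup_ps
    return 1000.0 / (2.0 * half_period_ps)  # 1000 ps per ns, so result is in GHz


def actual_setup_ps(clock_ghz, ck_to_q_ps, pd3_ps):
    """Actual setup time (setup4) left after the two delays, in picoseconds."""
    half_period_ps = 1000.0 / (2.0 * clock_ghz)
    return half_period_ps - ck_to_q_ps - pd3_ps


def meets_setup(clock_ghz, ck_to_q_ps, pd3_ps, rsetup_ps):
    """True if setup4 is at least Rsetup at the given clock frequency."""
    return actual_setup_ps(clock_ghz, ck_to_q_ps, pd3_ps) >= rsetup_ps


if __name__ == "__main__":
    # Hypothetical delays, for illustration only.
    print(max_frequency_ghz(200, 200, 100))
    print(meets_setup(1.0, 200, 200, 100))
    print(meets_setup(1.25, 200, 200, 100))
```

With these hypothetical values the half period must be at least 500 ps, so the sketch gives a limit of 1.0 GHz; at 1.25 GHz the half period is only 400 ps, the two delays consume all of it, and no setup time remains.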
The values of ck→Q, pd3, and Rsetup are functions of the integrated circuit (IC) fabrication process used, and consequently, the maximum frequency of operation of the circuit of FIG. 1 is also a function of the fabrication process. For example, with a 0.35 micron process, the maximum frequency of operation of the circuit of FIG. 1 is approximately 2.0 gigahertz (GHz).
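As a rough check on the 0.35 micron example, the approximate frequency limit can be inverted to estimate the combined timing budget, where ck→Q denotes the clock-to-Q delay of the first flip flop. The individual delay values are not given here, so only their sum can be inferred, and only to the accuracy of the approximation:

```latex
\text{ck}{\rightarrow}\text{Q} + \text{pd3} + \text{Rsetup}
  \;\approx\; \frac{1}{2 \times 2.0\ \text{GHz}}
  \;=\; 0.25\ \text{ns}
  \;=\; 250\ \text{ps}
```

In other words, at approximately 2.0 GHz the two delays and the required setup time must together fit within roughly one half of the 500 ps clock period.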
Due to the increased costs of manufacturing ICs with smaller geometry processes that allow higher frequency operation, there is a desire to operate circuits manufactured with a particular process at higher frequencies. However, for any given fabrication process, the maximum frequency of operation of the circuit of FIG. 1 is limited due to the fact that there are two delays, ck→Q and pd3, in addition to the required setup time Rsetup. Another shortcoming of the circuit of FIG. 1 is that there are four latches loading the clock and consuming power.
Hogge discloses an additional clock recovery circuit that uses a delay line instead of the second flip flop 105. The delay line cannot be implemented in a monolithic integrated circuit (IC), and is typically implemented as a microstrip circuit on a ceramic substrate. Because the delay line cannot be included in the IC, use of a delay line is undesirable for several reasons. First, using a delay line is more expensive than using a circuit that is implemented entirely on a monolithic IC, due to the cost of the delay line, which is an additional component. Second, the delay line requires additional space because it is not part of the IC. Third, the delay line can be subject to manufacturing variations independent of the IC.
Consequently, there is a need for a low power CMOS circuit that can be implemented on a monolithic IC, that establishes and maintains proper alignment between data and the clock at frequencies higher than the maximum frequency of operation of the circuit of FIG. 1 for any given IC manufacturing process.