As the operating frequency of complex digital communication and data transfer systems increases, a major technical challenge has been to operate an entire system in a synchronous manner. A typical complex digital system consists of various subsystems where the exchange of information among the subsystems is synchronized by a fixed frequency and fixed phase global reference signal (i.e., "reference clock") available to all subsystems. When the operating frequency extends above several hundred megahertz (MHz), however, it is difficult within reasonable hardware costs to distribute the reference clock to the various subsystems and still maintain an acceptable phase difference (i.e., "clock skew"). This is mainly due to the inherent electrical propagation delay along the various physical signal paths among the subsystems.
To alleviate such stringent synchronization requirements, it has been a growing trend to limit the synchronous operations to within each subsystem, while providing interactions among subsystems of the complex network asynchronously (i.e., "asynchronous interconnections"). Typically, a subsystem is a collection of VLSI modules placed close together such that their electrical signal path delays are small enough to perform their operations in a synchronous manner at a given clock frequency above several hundred megahertz (MHz), without paying the high cost penalty otherwise associated with full network synchronous clock distribution. Each localized synchronous subsystem only receives a frequency-locked reference clock instead of a clock which is both frequency-locked and phase-locked. As a result, elaborate and costly distribution of a high speed clock signal throughout a network is avoided.
To exchange information asynchronously among synchronous subsystems, each synchronous subsystem should have suitable transmitter and receiver modules to transmit data to and retrieve transmitted data from other subsystems of the network. Again, this assumes the absence of a traditional frequency-locked and phase-locked global reference clock. Ideally, the receiver module should retrieve data within a few bit-time periods in the presence of frequent phase discontinuities on the input bit-stream. Furthermore, it is highly desirable to combine the transmitter and receiver modules in a single transceiver macro, and to have multiple transceiver macros integrated in a single chip without the need for analog or high-precision digital circuits. This is because the chip is used repeatedly in the system as a building block.
Prior-art approaches to data retrieval from a serialized bit stream can be grouped into two categories: (1) phase-locked loop approaches (e.g., W. C. Lindsey and C. M. Chie, "A Survey of Digital Phase-Locked Loops," IEEE Proc. Vol. 69, pp. 410-431, April 1981; R. M. Hickling, "A Single Chip 2 Gbit/s Clock Recovery Subsystem for Digital Communications," Proc. of RF Technology Expo. 88, published by Cardiff Publishing, Anaheim, Calif., pp. 493-497, Feb. 10-12, 1988; J. D. Crow, et al., "A GaAs MESFET IC for Optical Multiprocessor Networks," IEEE Trans. Electron Devices, Vol. 36, No. 2, pp. 263-268, February 1989; S. Hao and Y. Puqiang, "A High Lock-In Speed Digital Phase-Locked Loop," IEEE Trans. of Comm., Vol. 39, No. 3, pp. 365-368, March 1991); and (2) phase-alignment approaches (e.g., R. R. Cordell, "A 45 M-bit/s CMOS VLSI Digital Phase Aligner," IEEE J. of Solid-State Circuits, Vol. 23, No. 2, pp. 323-328, April 1988; and B. Kim, D. N. Helman, P. R. Gray, "A 30-MHz Hybrid Analog/Digital Clock Recovery Circuit in 2-μm CMOS," IEEE J. of Solid-State Circuits, Vol. 25, No. 6, pp. 1385-1394, December 1990).
In a phase-locked loop approach, the goal is to generate a frequency-locked and phase-locked timing signal (i.e., reference clock) locally by adjusting the frequency and phase of an internal oscillator (VCO) to that of a received data bit stream. Use of a phase-locked loop approach in a transceiver for high speed interconnection applications is undesirable since such circuits inherently require a very long re-synchronization time (e.g., on the order of a few hundred bit-times) and require high precision analog components, such as digital-to-analog or analog-to-digital converters, along with a voltage-controlled or current-controlled oscillator. Furthermore, existing data recovery schemes based on phase-locked loop approaches prohibit the integration of a transmitter module and a receiver module in a single chip due to the interference and noise experienced with two voltage-controlled oscillators (VCOs) (one for the receiver and another for the transmitter) on the same chip.
In general, phase-locked loops are employed mainly in applications where signal quality is poor and re-synchronization time is unimportant, such as telecommunication applications. Unlike the telecommunication environment, however, an asynchronous interface environment among subsystems within a complex digital system provides better signal quality at the receiving end. Also, only a frequency-locked reference clock signal, having a lower frequency than the frequency of data transfer, is generally available within a digital system, i.e., without additional hardware cost.
To achieve fast re-synchronization based on the characteristics of asynchronous interconnects within a digital system, several approaches employing phase-alignment have been proposed. These approaches retrieve data from an input bit-stream using an externally supplied, frequency-locked clock signal. By way of example, see W. M. Cox and M. A. Fischer, "Metastable-free Digital Synchronizer With Low Phase Error," U.S. Pat. No. 5,034,967, Jul. 23, 1991; R. D. Henderson and R. K. Yin, "Method and Structure for Digital Phase Synchronization," U.S. Pat. No. 5,022,056, Jun. 4, 1991; C. G. Melrose and J. D. Rose, "Digital Phase-Locked Device and Method," U.S. Pat. No. 4,972,444, Nov. 20, 1990; and A. J. Boudewijns, "Phase Detection Circuit for Stepwise Measurement of a Phase Relation," U.S. Pat. No. 4,965,815, Oct. 23, 1990.
Referring to FIG. 1, these approaches typically have in common the following layout 10:
1. Acquisition of a set of lead-lag, binary phase state variables (`n` variables), which represent the binary phase relationship (lead or lag) with respect to the reference clock, using an input delay module 12 and an array of flip-flops 14 (phase sampling flip-flops),
2. Processing of the lead-lag phase state information to select an optimally delayed input signal among the delayed input replicas, such that both the reference clock and the selected input replica are in phase (output selection logic 16), and
3. Selection of an output among the delayed input replicas provided by input delay module 12 using an `N-to-1` multiplexer (N-to-1 Mux 18).
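The three functions listed above can be illustrated with a small behavioral model. The following is only a sketch under stated assumptions, not the circuit of any cited patent: the names `square_wave`, `sample_taps`, `select_tap` and `mux` are hypothetical, the input is assumed to be an ideal alternating bit pattern, and the selection rule (pick the first tap after a change between adjacent phase samples) is one simple choice among many possible output selection algorithms.

```python
def square_wave(offset, bit_time=1.0):
    """Hypothetical test input: ideal 1010... data with a transition at `offset`."""
    def wave(t):
        return int((t - offset) // bit_time) % 2
    return wave

def sample_taps(wave, n_taps, tap_delay, clock_edge):
    """Delay module plus phase sampling flip-flops.

    Tap k carries the input delayed by k * tap_delay; every flip-flop
    latches its tap at the same reference clock edge.
    """
    return [wave(clock_edge - k * tap_delay) for k in range(n_taps)]

def select_tap(samples):
    """Output selection logic (assumed rule).

    A change between adjacent samples marks where the input transition
    sits relative to the clock edge; the tap just after it is chosen.
    """
    for k in range(len(samples) - 1):
        if samples[k] != samples[k + 1]:
            return k + 1
    return 0  # no transition observed across the delay line

def mux(wave, tap, tap_delay):
    """N-to-1 multiplexer: pass through the selected delayed replica."""
    return lambda t: wave(t - tap * tap_delay)

# Eight taps spanning one bit-time (bit-time normalized to 1.0).
wave = square_wave(offset=0.3)
samples = sample_taps(wave, n_taps=8, tap_delay=0.125, clock_edge=1.0)
tap = select_tap(samples)
aligned = mux(wave, tap, tap_delay=0.125)
print(samples, tap)  # prints: [0, 0, 0, 0, 0, 0, 1, 1] 6
```

In this model the selected replica has its transition within one tap delay of the reference clock edge, which is the sense in which the two are "in phase". Real flip-flops sampling near a transition can go metastable, which this idealized sketch does not model.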
The acquisition of a set of phase state variables, processing of the lead-lag phase state variables, and selection of an optimal output with respect to the reference clock signal are each well known functions. Further, a number of output selection algorithms are available in the open literature, as represented by the above-referenced United States patents.
Unfortunately, all known phase-alignment approaches result in significant technical difficulties and fail to produce an error-free output data-bit stream under real operating conditions, i.e., in the presence of random phase jitter and/or noise on the input signal. This is due to the approaches' inherent phase sampling mechanism (`modulo 2π phase error measurement`), where phase error between the input signal and reference clock is only measured within an interval of 0 to 2π. Note that `one bit-time` in the time domain is equivalent to 2π in the phase domain. For example, when an input signal transition randomly occurs and input signal noise increases, the phase error from a previously adjusted cycle may result in prior art phase aligners slipping a cycle, i.e., slipping one bit-time. The net effect of a slipped cycle on a phase aligner output is a data bit error, wherein either a data bit is dropped (`negative slip` if the phase error changes by -2π) or a data bit is duplicated (`positive slip` if the phase error changes by +2π). Output bit error due to slipped cycles using prior art phase-alignment approaches becomes worse when the phase difference between the input signal and the sampling clock signal is close to the boundary of the reference bit-time, i.e., 0 or 2π.
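The effect of measuring phase error only modulo 2π can be shown numerically. The sketch below is illustrative only and makes several assumptions: the helper names `measured_phase` and `count_slips` are hypothetical, the sequence of true phase errors is invented for the example, and a slip is counted whenever the true phase error crosses a multiple of 2π between consecutive observations.

```python
import math

TWO_PI = 2 * math.pi

def measured_phase(true_phase):
    """What a modulo-2π phase sampler can observe: the phase folded into [0, 2π)."""
    return true_phase % TWO_PI

def count_slips(true_phases):
    """Count boundary crossings of the true phase error.

    An upward crossing of a 2π boundary is counted as a positive slip
    (duplicated bit), a downward crossing as a negative slip (dropped bit).
    """
    positive = negative = 0
    prev_cycle = math.floor(true_phases[0] / TWO_PI)
    for phase in true_phases[1:]:
        cycle = math.floor(phase / TWO_PI)
        if cycle > prev_cycle:
            positive += cycle - prev_cycle
        elif cycle < prev_cycle:
            negative += prev_cycle - cycle
        prev_cycle = cycle
    return positive, negative

# Assumed jittery phase error hovering near the 2π boundary,
# expressed as fractions of one bit-time.
phases = [f * TWO_PI for f in (0.90, 0.98, 1.03, 0.97)]
print([round(measured_phase(p) / TWO_PI, 2) for p in phases])
print(count_slips(phases))  # prints: (1, 1)
```

The folded measurement jumps from about 0.98 to about 0.03 of a bit-time even though the true phase error moved by only 0.05 of a bit-time. A sampler that sees only the folded value cannot distinguish this small jitter from a near-full-cycle change, which is why operation close to the 0 or 2π boundary is the worst case.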
In addition, most known phase-alignment approaches are sensitive to circuit element variations when adjusting the phase of a reference clock signal to sample an input signal, primarily due to the art's difficulty in processing a large number of phase state variables. Ideally, an optimally delayed input replica should be selected by decoding all 2^N possible phase states (arising from N phase state samples) during a bit-time to ensure against output bit error due to flip-flop metastability. Since decoding of all possible phase state combinations is impractical using heretofore known approaches, a small number of phase states of interest are usually decoded. To guarantee the selection of only one signal from `n` phase shifted signals, the prior art thus assumes that there are no more than one or two incorrect phase samplings caused by metastability of the phase sampling flip-flops. However, since there is typically a high possibility of having multiple incorrect phase samplings around transitions of the input signal at higher data rates, it would be desirable to consider all `n` phase state variables to select an appropriate phase shifted variable, while still providing fast phase synchronization at high speed operation. Further, the prior phase alignment approaches typically require precision analog or digital circuit components, and provide no slip-cycle compensation. The synchronizing technique disclosed herein addresses all of these problems, limitations and omissions of known phase sampling approaches.
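The gap between the full state space and the states a simple decoder handles can be quantified by enumeration. This is a sketch under an assumption that is not taken from the cited art: a phase state is treated as "clean" if the N samples contain at most one change between adjacent samples, and as "noisy" otherwise (for instance when metastability produces isolated wrong samples near the transition). The names `transitions` and `classify_states` are hypothetical.

```python
from itertools import product

def transitions(state):
    """Number of changes between adjacent phase samples."""
    return sum(a != b for a, b in zip(state, state[1:]))

def classify_states(n):
    """Split all 2**n phase states into clean (<= 1 transition) and noisy."""
    clean = noisy = 0
    for state in product((0, 1), repeat=n):
        if transitions(state) <= 1:
            clean += 1
        else:
            noisy += 1
    return clean, noisy

print(classify_states(8))  # prints: (16, 240)
```

Under this assumption only 2N of the 2^N states are clean, so for eight samples a decoder limited to clean patterns covers 16 of 256 states, and the uncovered share grows rapidly as more phase samples are added.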