High-bandwidth chip-to-chip interconnection, also referred to herein as a “link,” is a crucial part of many systems today. It is to be understood that the term “chip” is used herein to generally refer to an integrated circuit. High-speed inputs/outputs (I/Os) are extensively used in server processors, memory-to-central processing unit (CPU) interfaces, multiprocessor systems, and gaming applications. As the speed of on-chip data processing increases, there is an increasing demand for higher data rates and a greater number of I/O pins per chip.
However, limitations on power consumption, area per I/O, and channel bandwidth, as well as the characteristics of advanced submicron complementary metal oxide semiconductor (CMOS) technologies, make the design of such links extremely challenging. Among the most important requirements of these systems are reduced power consumption, a technology-friendly design, and the ability to monitor the channel and to test and diagnose problems in the link.
In particular, data recovery and synchronization at the receiver side is very important, but can consume a significant amount of power. For example, in a source synchronous application, a clock signal is sent along with the data from a source (e.g., a first chip) to a destination (e.g., a second chip). In such an application, the clock may be recovered at the receiver and then used to recover the data, by adjusting the phase of the clock so that it is synchronized with the data.
A number of different receiver architectures are used for such applications. A widely used synchronization technique involves sampling the input waveform more than once per bit time, see, e.g., R. Farjad-Rad et al., “A 0.3-μm CMOS 8-Gb/s 4-PAM Serial Link Transceiver,” IEEE Symposium on VLSI Circuits, June 1999. Such sampling typically includes one sample in the middle of the bit and one extra sample at the edge of the bit, where the transitions take place. The edge sample provides phase information for phase recovery as part of a Delay Locked Loop (DLL) or Phase Locked Loop (PLL), which generates a clock that is in phase with the incoming data. One way to design the PLL is to have a local DLL generate multiple clock phases and then to use interpolators between those phases to build a phase rotator system, see, e.g., S. Sidiropoulos et al., “A Semidigital Dual Delay-Locked Loop,” IEEE Journal of Solid-State Circuits, November 1997.
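By way of illustration only, the manner in which an edge sample yields phase information may be sketched as follows. The sketch assumes a conventional early/late (bang-bang) decision rule and a phase rotator with 64 discrete steps; the function names, the step count, and the majority-vote update are hypothetical and are not taken from the cited references.

```python
def early_late_vote(prev_data, edge, next_data):
    """Compare an edge sample with the data samples on either side of it.

    Returns +1 if the sampling clock appears late, -1 if it appears
    early, and 0 if there was no transition (no phase information).
    """
    if prev_data == next_data:
        return 0  # no transition between the two bits
    # If the edge sample already equals the next bit, the edge clock
    # fired after the transition, so the clock is late.
    return 1 if edge == next_data else -1


def update_phase(phase_step, votes, n_steps=64):
    """Move the sampling phase by one rotator step against the majority vote.

    n_steps is an assumed number of phase positions per clock period.
    """
    total = sum(votes)
    if total > 0:        # mostly late: sample earlier
        phase_step -= 1
    elif total < 0:      # mostly early: sample later
        phase_step += 1
    return phase_step % n_steps  # wrap around like a phase rotator
```

In this sketch, a run of bits without transitions produces only zero votes and leaves the phase unchanged, which reflects the fact that edge samples carry phase information only where the data actually toggles.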
However, there are several drawbacks to existing solutions. For example, the analog content of DLLs and PLLs makes the design challenging and less technology-friendly due to errors associated with phase detectors and leakage in the filter capacitors. Further, in order to monitor the link, extra samples are required, which adds to power consumption and area. Still further, the static phase offset between the sampling phases reduces the timing margin of the link.
Accordingly, a need exists for improved clock synchronization and data recovery techniques.