Modern data processing systems require the rapid transfer of data between integrated circuits (“chips”). For example, a central processing unit (CPU) transfers data to the memory system, which may include a memory controller and off-chip cache. In a multi-CPU system, data may be transferred between CPUs. As CPU speeds increase, the speed of the interface between chips (bus cycle time) becomes a limiting constraint because latencies across the interfaces may exceed the system clock period.
When data is launched from one chip to another chip, it can be launched simultaneously within numerous clock/data groups. Each clock/data group consists of multiple data bits and a clock signal, each of which travels over an individual conductor. Due to process variations and varying conductor lengths, the individual bits within a clock/data group may arrive at the receiving chip at different times. Therefore, the individual bits of data and the clock within a clock/data group must be realigned upon arrival at the receiving chip. At the receiving end, the clock/data signals can be delayed to align the signals with respect to a sampling edge of the received clock. While aligning the individual data bits within a clock/data group at the receiving end is necessary, such delays can cause jitter and other forms of distortion. In addition, delaying data signals can require extensive administrative overhead and additional circuitry.
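The receive-side alignment described above can be illustrated with a minimal sketch. It is not taken from any particular design: the function name `deskew_delays`, the 10 ps delay-line step, and the arrival times are all hypothetical values chosen for illustration. The sketch assumes each bit passes through a tapped delay line and that every bit is delayed to match the latest-arriving bit in its clock/data group.

```python
def deskew_delays(arrival_ps, step_ps=10):
    """Return, per bit, the number of delay-line taps needed to line the bit
    up with the latest-arriving bit in the group.

    arrival_ps: mapping of bit name -> arrival time in picoseconds (hypothetical data)
    step_ps:    delay added by one tap, in picoseconds (assumed value)
    """
    latest = max(arrival_ps.values())
    # Each bit is delayed by its gap to the latest bit, rounded to whole taps.
    return {name: round((latest - t) / step_ps) for name, t in arrival_ps.items()}


# Hypothetical arrival times for a four-bit clock/data group.
arrivals = {"d0": 100, "d1": 130, "d2": 90, "d3": 120}
taps = deskew_delays(arrivals)

# Arrival time of each bit after the added delay.
aligned = {name: arrivals[name] + taps[name] * 10 for name in arrivals}
```

In this example the earliest bit (`d2`) receives the most added delay, which is consistent with the concern raised above: the more delay a signal is given, the more opportunity there is for jitter and distortion to accumulate.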
Thus, there is a need in the art for apparatus and methods to accommodate high-speed data transfers between chips in data processing systems. In particular, there is a need for mechanisms to ensure data synchronization at a receiving chip while limiting the associated jitter and distortion that are often created during such synchronization.