In many electronic applications, for example digital clock and data recovery (CDR) units, it is necessary to generate a clock signal with a programmable phase shift with respect to a reference clock.
In many situations where data is transferred between different chips, boards or devices, the associated clock is not distributed, mainly to reduce pin count and save power. The receiving end must then recover the associated clock in order to sample and process the incoming data stream. Even when the clock signal is distributed along with the data signal, phase alignment often cannot be avoided.
It is possible to design a clock recovery circuit that works without a reference clock, under precise assumptions on the data pattern and the local VCO frequency tuning range. Since these assumptions are often not met in practice, the known solutions mainly require a reference clock whose frequency lies within a well-defined tolerance range.
Several techniques are already known for generating a clock signal with a programmable phase shift, namely delay locked loops (DLL), phase locked loops (PLL), open loop delay lines and digital phase aligners (DPA).
PLL based solutions require considerable power and chip area and are generally not able to cope with a wide range of data transition densities or with long CID (consecutive identical digits) sequences, as applications often require. Often a PLL is used to generate N phases of the reference clock. All N phases are distributed to each receiving macro, where one of them is selected to sample the incoming data. This solution requires a lot of area for the wiring. Besides, switching noise, variations in the phase difference between the clock phases and duty cycle distortion become challenging issues when the phases travel over a long path; in addition, the minimum distance in degrees between two adjacent phases is limited by the technology used for the chip.
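The phase selection described above can be illustrated with a minimal sketch. It assumes N ideal, equally spaced phases, so adjacent phases are 360/N degrees apart; the function names and the example values (8 phases, a 1000 ps clock period) are hypothetical and are not taken from any of the known solutions.

```python
def phase_spacing_deg(n_phases: int) -> float:
    """Nominal angular spacing between adjacent phases of an N-phase clock."""
    if n_phases < 1:
        raise ValueError("n_phases must be at least 1")
    return 360.0 / n_phases


def phase_spacing_ps(n_phases: int, clock_period_ps: float) -> float:
    """The same spacing expressed as a time step, in picoseconds."""
    if n_phases < 1:
        raise ValueError("n_phases must be at least 1")
    return clock_period_ps / n_phases


def select_phase(target_deg: float, n_phases: int) -> int:
    """Index of the available phase nearest to the requested phase shift.

    Assumes ideal, equally spaced phases; real phases deviate from this
    because of mismatch, noise and duty cycle distortion.
    """
    step = phase_spacing_deg(n_phases)
    # Wrap the request into [0, 360) and round to the nearest phase index.
    return round((target_deg % 360.0) / step) % n_phases


if __name__ == "__main__":
    n = 8                # hypothetical number of distributed phases
    period_ps = 1000.0   # hypothetical clock period
    print(phase_spacing_deg(n), phase_spacing_ps(n, period_ps))
    print(select_phase(100.0, n))
```

With these assumed values each step corresponds to 125 ps. Halving the step doubles the number of phases to be generated and routed, which is where the wiring area and the technology limit on the minimum phase spacing come into play.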
In other proposed schemes, one PLL is used to generate a single filtered clock phase, which is then distributed to all the receiving macros. All the phases are then generated locally by means of a DLL. Power consumption and occupied area remain severe issues. In these cases too, the minimum distance in degrees between two adjacent phases is limited by the technology.
Cases in which the multi-phase clock is generated by means of an open loop delay line are also known. In these schemes, power consumption is an issue, since all the phases are generated even if they are not used. Moreover, the whole algorithm is complicated because the phases do not necessarily cover 360° and the phase spacing is limited and PTV (process, temperature and supply voltage) dependent.
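The PTV dependence can be made concrete with a rough sketch. It models an open loop delay line as a number of identical taps and expresses the total delay as a phase range in degrees of the clock period. The per-tap delays for the three corners are purely hypothetical, chosen only to show how the covered range moves with operating conditions.

```python
# Hypothetical per-tap delays in picoseconds for three PTV corners.
CORNERS_PS = {"fast": 40.0, "typical": 50.0, "slow": 70.0}


def coverage_deg(n_taps: int, tap_delay_ps: float, clock_period_ps: float) -> float:
    """Phase range spanned by the delay line, in degrees of the clock period."""
    if clock_period_ps <= 0:
        raise ValueError("clock_period_ps must be positive")
    return 360.0 * n_taps * tap_delay_ps / clock_period_ps


def coverage_by_corner(n_taps: int, clock_period_ps: float, corners=CORNERS_PS) -> dict:
    """Covered phase range for each corner, keyed by corner name."""
    return {
        name: coverage_deg(n_taps, delay_ps, clock_period_ps)
        for name, delay_ps in corners.items()
    }


if __name__ == "__main__":
    # 16 taps against a hypothetical 1000 ps clock period.
    for corner, degrees in coverage_by_corner(16, 1000.0).items():
        status = "covers 360" if degrees >= 360.0 else "falls short of 360"
        print(f"{corner}: {degrees:.1f} degrees ({status})")
```

Under these assumed numbers the same line falls short of a full period in the fast and typical corners and overshoots it in the slow corner, so the selection logic cannot rely on a fixed tap-to-phase mapping.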
Solutions that delay the data (digital phase aligners, DPA) are also known. The main drawback is that the delay chain must cover the jitter tolerance amplitude and not only the clock period, which results in longer delay chains. This implies more eye closure and, again, a limited and PTV dependent phase spacing. Moreover, an architecture that delays the data requires the exact transmitter clock frequency to be available locally.
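The effect on the chain length can be estimated with a short sketch, assuming identical taps and integer picosecond values. The helper name and the example figures (a 50 ps tap, a 1000 ps clock period and a 2500 ps jitter tolerance amplitude) are hypothetical.

```python
def taps_needed(span_ps: int, tap_delay_ps: int) -> int:
    """Smallest number of identical taps whose total delay is at least span_ps."""
    if tap_delay_ps <= 0:
        raise ValueError("tap_delay_ps must be positive")
    if span_ps < 0:
        raise ValueError("span_ps must be non-negative")
    # Integer ceiling division avoids floating point rounding surprises.
    return -(-span_ps // tap_delay_ps)


if __name__ == "__main__":
    tap_ps = 50                  # hypothetical per-tap delay
    clock_period_ps = 1000       # hypothetical clock period
    jitter_tolerance_ps = 2500   # hypothetical jitter tolerance amplitude

    print("taps to cover one clock period:", taps_needed(clock_period_ps, tap_ps))
    print("taps to cover jitter tolerance:", taps_needed(jitter_tolerance_ps, tap_ps))
```

In this illustration the data-delaying chain needs two and a half times as many taps as a chain that only has to span one clock period, and every additional tap in the data path adds its own contribution to eye closure.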