This application relates to integrated circuits, particularly to timing of data transfer between logic elements.
“Set-up time” and “hold time” together describe the timing requirements on the data input of a sequential logic element, such as a flip-flop or register, with respect to a clock input. The set-up and hold times define a temporal window during which data must be stable to guarantee predictable performance over a full range of operating conditions and manufacturing tolerances. The set-up time SUT is the length of time that data must be available and stable on the input terminal of a storage element before arrival of a clock edge for the data to be captured by the storage element; the hold time HT is the length of time that the data must remain stable after the arrival of the clock edge.
FIG. 1 (prior art) depicts three clock-to-data timing scenarios that illustrate the relationships between set-up time, hold time, and clock edges for a given flip-flop. The depicted waveforms include sharp signal transitions for ease of illustration; in practice, many variables, including process, temperature, and supply voltage, impact precise edge placement for data and clock signals. The set-up and hold times for a given storage element must meet the requirements for the storage element and account for relative timing variations between the clock and data.
Referring to the first example, a data pulse 100 arrives too late with respect to a clock edge 105 to meet the set-up time requirement, so the flip-flop (not shown) does not capture the data; consequently, the Q output signal is indeterminate.
In the second example, a second data pulse 110 arrives early enough to meet the set-up time requirement, but does not remain high long enough with respect to clock edge 115 to meet the flip-flop's hold time requirement; consequently, the Q output signal is again indeterminate. In the final example, a third data pulse 120 remains stable and valid with respect to a clock edge 125 over a time window that meets both the set-up and hold time requirements. The flip-flop therefore captures the data, causing the output signal Q to transition to a level representative of a logic one.
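By way of illustration only, the three scenarios of FIG. 1 can be modeled with a short Python sketch. The function name `captures` and all numeric values (set-up time, hold time, clock-edge time, and pulse timings) are hypothetical values chosen for the example; none is taken from the figure.

```python
def captures(data_start, data_end, clock_edge, setup, hold):
    """Return True if the data is stable over the whole window
    [clock_edge - setup, clock_edge + hold]; otherwise the output
    of the storage element is treated as indeterminate."""
    meets_setup = data_start <= clock_edge - setup
    meets_hold = data_end >= clock_edge + hold
    return meets_setup and meets_hold


# Hypothetical timing values in nanoseconds (assumptions, not from FIG. 1).
SETUP, HOLD = 2.0, 1.0
EDGE = 10.0

# First example: the pulse arrives too late to meet the set-up time.
late_pulse = captures(9.0, 14.0, EDGE, SETUP, HOLD)

# Second example: set-up time is met, but the pulse ends before the hold time elapses.
short_pulse = captures(7.0, 10.5, EDGE, SETUP, HOLD)

# Third example: the pulse is stable across the full set-up and hold window.
good_pulse = captures(7.0, 12.0, EDGE, SETUP, HOLD)

print(late_pulse, short_pulse, good_pulse)
```

Only the third call reports a successful capture, mirroring the behavior described for data pulse 120.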
Set-up and hold-time requirements between flip-flops or registers on the same chip can be met by careful design of the on-chip clock distribution network. It can be difficult, however, to avoid set-up and hold-time problems for sequential storage elements that communicate with data sources external to the chip.
FIG. 2 (prior art) is a simplified diagram of the input portion of a conventional programmable input/output block (IOB) 200 that addresses potential hold-time problems. Input block 200 includes an input buffer 205, a programmable delay circuit 210, a sequential storage element 215, and three programmable multiplexers 220, 225, and 230. A programmable multiplexer 240 can be programmed to insert one or both of delay elements 235 into the incoming data path to compensate for clock delays induced by relatively long signal paths in the clock distribution network.
The delays through clock and data paths can vary considerably. The input delay imposed by input block 200 for a given data signal is therefore selected to be relatively large to account for extreme cases. The resulting set-up times work well for relatively low-frequency signals, but unnecessarily limit the maximum operating frequency of IOB 200. This problem is illustrated below in connection with FIGS. 3, 4A, and 4B.
FIG. 3 (prior art) depicts an integrated circuit 300 connected to a simple three-bit bus 303. Three lines D0, D1, and D2 provide parallel data to three respective input blocks 305, 310, and 315 of integrated circuit 300. The data signals D0, D1, and D2 are synchronized to a clock signal CLK on a like-named terminal. (Throughout the present disclosure, signal nodes, such as lines, terminals, or pads, and the signals they carry are referred to using like designations; in each case, whether a given reference is to a signal or the corresponding node will be clear from the context.) Input blocks 305, 310, and 315 supply the synchronized data from bus 303 to some core logic 320, which performs some logic operation on the received data.
FIG. 4A is a waveform diagram 400 depicting an example in which the data provided on terminals D0, D1, and D2 to integrated circuit 300 of FIG. 3 are timed slightly differently with respect to clock signal CLK. Despite the timing differences, each data stream satisfies the set-up and hold time requirements for input blocks 305, 310, and 315, and is consequently captured without error.
FIG. 4B is a waveform diagram 450 depicting an example in which timing differences between the data provided on terminals D0, D1, and D2 introduce data errors. The timing differences between the respective data and clock signals are the same as in FIG. 4A, but the shorter period of the clock and resultant reduced data windows cause circuit 300 to latch incorrect data. At time T1, for example, only input block 310 is likely to latch the correct data DT1. As is apparent from this illustration, the effects of timing errors grow more problematic with increased clock frequency. This problem is growing ever more severe as new integrated circuits send and receive data at ever-greater speeds to compete in markets where speed performance is paramount.
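The effect contrasted in FIGS. 4A and 4B can be approximated with a simple numeric sketch in Python. The model assumes, purely for illustration, that each data line changes once per clock period at a fixed offset from the period boundary and that the sampling clock edge nominally falls at mid-period. The offsets, set-up time, hold time, and clock periods are hypothetical and are not read from the figures.

```python
def bit_captured(offset, period, setup, hold):
    """Check one data line against a clock edge at mid-period.

    The data is assumed to change at `offset` and remain stable until
    `offset + period` (transition times are ignored in this sketch).
    """
    edge = period / 2.0
    meets_setup = offset <= edge - setup
    meets_hold = offset + period >= edge + hold
    return meets_setup and meets_hold


# Hypothetical per-bit offsets relative to the clock (same in both cases).
OFFSETS = {"D0": 1.5, "D1": 0.0, "D2": -1.5}
SETUP, HOLD = 0.5, 0.5  # hypothetical requirements


def bus_status(period):
    """Return, for each data line, whether it is captured correctly."""
    return {name: bit_captured(off, period, SETUP, HOLD)
            for name, off in OFFSETS.items()}


slow = bus_status(10.0)  # long period: every line meets set-up and hold
fast = bus_status(3.0)   # short period: the same offsets now cause violations

print(slow)
print(fast)
```

With the long period, all three lines pass; with the short period, only the line with zero offset still passes, even though the per-bit timing differences are unchanged.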
To emphasize a problem addressed by the present invention, waveform diagram 450 illustrates an extreme case. Nevertheless, even minor differences in signal-propagation delay between different bits sampled on the same clock edge can introduce undesirable errors. There is therefore a need to more precisely align clocks and data, and in particular a need for improved means for providing per-bit data alignment for high performance integrated circuits.
The present invention addresses the need for precise, per-bit data alignment for high performance integrated circuits. Circuits and methods in accordance with some embodiments separate incoming data into three differently timed data signals: an early signal, an intermediate signal, and a late signal. The timing of the three data signals can be collectively moved with respect to the clock signal. Moreover, the temporal spacing between the three signals can be adjusted so that the early and late signals define a window centered on the intermediate signal.
In a typical example, the three signals are collectively aligned with the clock. Thus aligned, the three signals are stepwise separated in time until the intermediate data signal is centered on an edge of the clock. The early and late data signals can then be periodically compared with the intermediate data signal. Mismatches between the intermediate data signal and either the early or late data signal indicate that the data has drifted in time relative to the clock. Upon detecting such misalignment, embodiments of the invention automatically adjust the timing of the data signals relative to the clock signal to realign the intermediate data and the clock signal.
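The compare-and-adjust behavior described above can be sketched behaviorally in Python. This is a minimal model under stated assumptions, not the circuit of any embodiment: sample positions are integer delay steps, the interval over which the data is valid is given directly as `eye_lo` and `eye_hi`, and a "mismatch" is modeled as the early or late sample point falling outside that interval. All function and parameter names are hypothetical.

```python
def sample_ok(position, eye_lo, eye_hi):
    """A sample is assumed to match the intermediate data only if it
    falls inside the interval where the data is valid."""
    return eye_lo <= position <= eye_hi


def adjust_once(center, half_window, eye_lo, eye_hi, step=1):
    """Perform one comparison of early/late against intermediate and
    return the (possibly shifted) position of the intermediate sample."""
    early_mismatch = not sample_ok(center - half_window, eye_lo, eye_hi)
    late_mismatch = not sample_ok(center + half_window, eye_lo, eye_hi)
    if early_mismatch and not late_mismatch:
        return center + step   # early sample left the valid interval: move later
    if late_mismatch and not early_mismatch:
        return center - step   # late sample left the valid interval: move earlier
    return center              # no mismatch, or ambiguous: leave timing alone


def track(center, half_window, eye_lo, eye_hi, max_steps=100):
    """Repeat the comparison until no further adjustment is made."""
    for _ in range(max_steps):
        new_center = adjust_once(center, half_window, eye_lo, eye_hi)
        if new_center == center:
            break
        center = new_center
    return center


aligned = track(5, 4, 0, 10)              # already inside the interval
drift_later = track(aligned, 4, 3, 13)    # data drifts later relative to the clock
drift_earlier = track(aligned, 4, -3, 7)  # data drifts earlier relative to the clock

print(aligned, drift_later, drift_earlier)
```

In this model the adjustment stops as soon as both the early and late samples again agree with the intermediate sample, so the window formed by the early and late signals follows the data as it drifts.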
Some embodiments of the invention separate incoming data into two differently timed data signals. One such embodiment derives an intermediate signal and a late signal. The timing of the two data signals can be collectively moved with respect to the clock signal, or the two can be separated to center the intermediate data signal on the clock signal. Another such embodiment derives early and intermediate data signals, and can be used with embodiments that derive intermediate and late data signals to produce data windows centered on the intermediate data signals. Yet other embodiments employ two sequential storage elements and some control logic to selectively produce either early and intermediate data signals or intermediate and late data signals. The resulting early and late data signals are then used to synchronize the intermediate data with a clock signal.
In some embodiments, the sequential storage elements used to produce differently timed data are double-data-rate (DDR) flip-flops. One DDR flip-flop in an input block adapted in accordance with the invention includes three sequential storage elements. The first two storage elements capture data on alternate (rising and falling) clock edges; the third storage element enables the DDR flip-flop to produce a pair of DDR output signals both synchronized to the same type of clock edge (e.g., both signals are synchronized to rising clock edges).
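A behavioral Python sketch of a three-element DDR input of this kind follows. It assumes, as one plausible arrangement that the text above does not specify, that the third storage element re-registers the falling-edge capture on the next rising clock edge, so that both outputs change only on rising edges. The class and attribute names are hypothetical.

```python
class DdrInput:
    """Behavioral model of a DDR input built from three storage elements."""

    def __init__(self):
        self.rise = 0          # first element: captures on rising edges
        self.fall = 0          # second element: captures on falling edges
        self.fall_retimed = 0  # third element: assumed to retime `fall`

    def rising_edge(self, d):
        # Both updates occur on the same rising edge; the third element
        # takes the value the second element held before this edge.
        self.fall_retimed = self.fall
        self.rise = d
        return self.rise, self.fall_retimed

    def falling_edge(self, d):
        self.fall = d


def run(bits):
    """Feed bits alternately on rising and falling edges and collect the
    output pair presented after each rising edge."""
    ddr = DdrInput()
    outputs = []
    for i, bit in enumerate(bits):
        if i % 2 == 0:
            outputs.append(ddr.rising_edge(bit))
        else:
            ddr.falling_edge(bit)
    return outputs


pairs = run([1, 1, 0, 1, 1, 0])
print(pairs)
```

Under this assumption, each output pair holds the bit just captured on the rising edge together with the bit captured on the preceding falling edge, and both members of the pair are synchronized to rising edges.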
This summary does not limit the invention, which is instead defined by the claims.