1. Field of the Invention
The invention relates generally to the clock-controlled transmission of data and, in particular, to a circuit arrangement for regulating the data transmission latency. A preferred, but not exclusive, area of application of the invention is the transmission of memory information that has been read, within a memory module, from the output buffer of a memory bank to the data output of the module.
2. Description of the Related Art
In data processing systems, a reference clock is normally used as a time standard to control the operations. Accordingly, the time markers which are set in a control device to coordinate the operating cycle are not defined as units of absolute time (for example micro- or nanoseconds) but rather as units of the reference clock, that is to say, as a number of clock periods. Variations in the clock frequency can thus be allowed without having to change the specifications of the control device. These specifications also include the stipulation of the so-called “latency” for the operation of transmitting data over a data path. This latency is prescribed as a whole number “n” clock periods which are intended to elapse, as of the time of a transmission command, before the data item to be transmitted appears at the end of the data path.
However, it must be taken into account that virtually every data path contains elements which, for physical reasons, give rise to an inevitable and essentially fixed “absolute” time delay (delay time) of the transmitted data. These include, for example, electrical components with inevitable response and transfer times (for instance, inverters and amplifiers) and delaying transmission lines. The sum “τf” of these fixed absolute delay times in the data path determines the lower limit nmin for selecting the latency n (defined above) for a given period duration T of the reference clock, since the product n*T must not be less than τf:
τf ≤ n*T  Eq. (1)
If, in contrast, the latency n has been prescribed, the absolute time value τf determines the lower limit Tmin for the period duration T of the reference clock (and thus the upper limit fcmax=1/Tmin for selecting the reference clock frequency fc).
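As a worked illustration of Eq. (1), the two limits can be written out explicitly. The numerical values in the second line (τf = 5 ns, n = 4) are assumed purely for illustration and are not taken from the arrangement described here.

```latex
\tau_f \le n\,T
\quad\Longrightarrow\quad
T_{\min} = \frac{\tau_f}{n},
\qquad
f_{c,\max} = \frac{1}{T_{\min}} = \frac{n}{\tau_f}

% Assumed example values: \tau_f = 5\,\mathrm{ns},\; n = 4
T_{\min} = \frac{5\,\mathrm{ns}}{4} = 1.25\,\mathrm{ns},
\qquad
f_{c,\max} = \frac{4}{5\,\mathrm{ns}} = 800\,\mathrm{MHz}
```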
To comply with a desired latency n in an accurate manner, a period of time τg that is exactly the same as a whole number n of clock periods T of the clock frequency fc must elapse between a reference time t0, at which the transmission of a data item (or of a burst of successive data) is requested, and the time tn, at which the data item appears at the output end of the data path. The following requirement thus exists:
tn − t0 = τg = n*T  Eq. (2)
Of course, it cannot be assumed that the fixed delay time τf of the data path corresponds exactly to the product n*T. Rather, the delay time τf will be some fraction of n*T, which fraction changes with the value T (that is to say, in a manner dependent on the clock frequency fc). It must therefore be ensured that the data experience an additional delay that is dependent on T to compensate for the difference between n*T and τf. With reference to FIGS. 1 and 2, the text below shall first of all describe how this has hitherto been achieved in accordance with the prior art.
At the top left, FIG. 1 shows the source 10 of the data to be transmitted, which source is, for example, the data buffer at the data connections of a memory bank in a DRAM module that is integrated on a semiconductor chip. As is known, the acronym “DRAM” denotes a dynamic random access memory. In the read mode, the data which have been read from the respectively addressed memory cells are passed to this buffer 10 to be transmitted, on demand, from the output of the buffer to a contact piece 50. This contact piece (usually referred to as a “pad”) is wired to an associated external data connection of the chip (not shown).
The pad 50 forms the end of a data path that begins at the output of the data source 10 and contains transmission elements which delay the data by a respective fixed time. In this example of data transmission from the read data buffer 10 of a memory bank to the pad 50, these transmission elements having a fixed delay time are, for example, various branches of a system of bus lines, elements of a data control logic unit for directing the data via selected branches of the bus system and, as the last element before the pad 50, a transmission driver (off-chip driver OCD) for amplifying the data before they are transmitted to the external data connection of the chip via the pad 50. Data amplifiers which respectively likewise give rise to a fixed delay may also be provided between individual branches or sections of the bus system. In addition to these elements (which are not shown individually in FIG. 1 but rather summarized in the form of a block 40), a multiplexer 30 that likewise causes a fixed delay and plays a special role in latency regulation, as will be described further below, is also provided in accordance with FIG. 1. All of these elements mentioned together give rise to the abovementioned fixed delay time τf.
Depending on the value n selected and depending on the clock frequency fc=1/T used, the fixed delay time τf may be less than T or equal to T or greater than T. To bring the total delay τg from the data source 10 to the pad 50 precisely to the value n*T, the prior art introduces an additional delay that is composed of a first part p*T corresponding to an integer multiple p of the clock period T and of a second part q*T corresponding to a fraction q of the clock period T, in accordance with the following equations:
p = INT(n − τf/T) ≥ 0  Eq. (3)
q = (n − τf/T) − INT(n − τf/T)  Eq. (4)
where INT denotes the integer part of the argument placed in brackets after it.
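The split of the additional delay into an integer part and a fractional part can be illustrated with a minimal Python sketch. The function name `split_additional_delay` and its argument names are hypothetical and chosen only for this illustration; the arithmetic follows Eq. (3) and Eq. (4), with INT taken as the floor of a non-negative argument.

```python
import math

def split_additional_delay(n, tau_f, T):
    """Return (p, q) according to Eq. (3) and Eq. (4).

    n     -- desired latency in clock periods
    tau_f -- fixed delay of the data path (same time unit as T)
    T     -- period of the reference clock
    """
    if n * T < tau_f:
        # Eq. (1) must hold, otherwise p would be negative.
        raise ValueError("Eq. (1) violated: n*T must not be less than tau_f")
    excess = n - tau_f / T      # additional delay in units of T
    p = math.floor(excess)      # integer part, Eq. (3)
    q = excess - p              # fractional part, Eq. (4)
    return p, q

# Example with tau_f = 0.833*T and n = 4 (T normalized to 1):
p, q = split_additional_delay(n=4, tau_f=0.833, T=1.0)
print(p, round(q, 3))  # p = 3, q is approximately 0.167
```

Floating-point arithmetic is used here for brevity, so q is only approximate; an exact representation would be preferable when τf/T lies close to a whole number.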
The part p*T of the additional delay is introduced by inserting a suitable number of shift register stages at the start of the data path, said shift register stages being clock-controlled at the frequency fc of the reference clock. The part q*T of the additional delay is introduced by delaying the phase of the clock control of the shift register stages by an appropriate degree with respect to the reference clock.
In accordance with FIG. 1, the reference clock signal CLK(0) is used to derive a shifted version CLK(0−τf) that has been time-shifted in the negative direction by precisely the fixed amount of time τf with respect to the original signal. This is effected using a DLL (“delay locked loop”) 60 whose feedback path contains a simulation 70 of the chain of all the elements (including the multiplexer 30) which together give rise to the fixed absolute delay τf. The components of this simulation may be real copies of the relevant elements of the data path or may be equivalent circuits having an equivalent fixed delay time (for reasons of space, the latter is practiced, in particular, for the simulation of bus line lengths). Since inaccuracies (for example, variations from chip to chip) can occur when forming these copies or equivalent circuits, the components are, in practice, constructed in such a manner that they form a section 71 whose delay is definitely somewhat less than τf and which has an adjustable delay element 72 connected downstream to adjust the total delay precisely to the value τf.
The integer part p of the difference between n and τf/T is ascertained, in a latency control logic unit 80, in accordance with Eq. (3) above by comparing the two clock signals CLK(0) and CLK(0−τf) and taking into account the desired latency n.
In addition, the shifted reference clock signal CLK(0−τf) is used to clock a shift register 20. This has the effect of the data along the shift register being clocked at the frequency fc=1/T of the reference clock but with a clock phase that effectively appears to be delayed by the fraction q (defined in Eq. (4)) of the clock period T with respect to the phase of the reference clock CLK(0).
The shift register 20 is shown in FIG. 1 in the form of a chain of successive D flip-flops (data flip-flops) FF#1, FF#2, etc. The input of the first stage FF#1 is directly connected to the output of the data source 10, and a tap is located at the output of each stage. The taps lead to the inputs of the multiplexer 30. In a manner dependent on the value p ascertained, the latency control logic unit 80 controls the multiplexer 30 in such a manner that the latter selects the (p+1)-th tap of the shift register 20 to insert the chain of the first to (p+1)-th stage of the register into the data path.
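The tap selection can be pictured with a small behavioral sketch in Python. It is a simplified model under stated assumptions, not a description of the actual circuit: each loop iteration stands for one active clock edge, the flip-flops are ideal, and the name `tap_output_sequence` is hypothetical.

```python
def tap_output_sequence(data_in, num_stages, tap):
    """Model a chain of D flip-flops with a multiplexer on the taps.

    data_in    -- items presented at the input of FF#1, one per clock edge
    num_stages -- number of flip-flops in the chain
    tap        -- 1-based index of the selected tap (p + 1 in the text)
    Returns the value seen at the selected tap after each clock edge.
    """
    stages = [None] * num_stages          # outputs of FF#1 .. FF#num_stages
    seen_at_tap = []
    for item in data_in:
        # One active edge: every stage takes over its predecessor's output.
        stages = [item] + stages[:-1]
        seen_at_tap.append(stages[tap - 1])
    return seen_at_tap

# With p = 3 the multiplexer selects the tap of FF#4.
print(tap_output_sequence(["D0", "D1", "D2", "D3", "D4"], num_stages=4, tap=4))
# [None, None, None, 'D0', 'D1']
```

In this model the first item reaches the output of FF#1 with the first edge and the selected tap p further edges later, which corresponds to the p clock periods contributed by the register.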
The data which have been transmitted thus experience a total delay of:
τg = q*T + p*T + τf  Eq. (5)
If the values for p and q from Eq. (3) and Eq. (4) are inserted into Eq. (5), Eq. (2) above is arrived at exactly. The imposed requirement is thus satisfied.
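The substitution can be written out step by step. By Eq. (3) and Eq. (4), the integer part p and the fractional part q add up to the full argument n − τf/T, which gives:

```latex
p + q = \mathrm{INT}\!\left(n - \frac{\tau_f}{T}\right)
      + \left(n - \frac{\tau_f}{T}\right)
      - \mathrm{INT}\!\left(n - \frac{\tau_f}{T}\right)
      = n - \frac{\tau_f}{T}

\tau_g = q\,T + p\,T + \tau_f
       = \left(n - \frac{\tau_f}{T}\right) T + \tau_f
       = n\,T - \tau_f + \tau_f
       = n\,T
```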
The method of operation of the circuit arrangement shown in FIG. 1 will be explained in more detail below with reference to the timing diagrams (a) to (c) in FIG. 2, to be precise for three different clock frequencies fc. All three diagrams (a) to (c) apply to the example n=4; that is to say, the period from the time t0 of the transmission command to the time tn of the arrival of the first data item in a data burst (to be transmitted) at the pad 50 is intended to be exactly four clock periods. In each diagram, the reference clock signal CLK(0) and the shifted clock signal CLK(0−τf) are shown on the same time axis. The rising edges of the clock signals shall be the “active” edges and respectively mark the beginning of a clock period of duration T=1/fc. In the diagrams, these clock edges are marked on the reference clock signal CLK(0) with a respective small arrow and a serial number next to the arrow.
FIG. 2A illustrates the case in which the clock frequency fc has a value for which the fixed delay time τf of the data path is considerably shorter than one clock period T, namely 0.833 T. Accordingly, the DLL 60 (shown in FIG. 1) causes the clock signal CLK(0−τf) to appear in a manner such that it has been shifted to the left by 0.833 T with respect to the reference clock signal CLK(0). The latency control logic unit 80 ascertains the integer value p=3 in accordance with Eq. (3) for n=4 and τf/T=0.833. The logic unit 80 thus uses the multiplexer 30 to select the tap at the output of the fourth stage FF#4 of the shift register 20. The value 0.167 is obtained for the fraction q in accordance with Eq. (4).
The transmission command RDD is issued at the time t0 in synchronism with an edge of the reference clock CLK(0) and ensures that, as of this time, a connection is set up between the reference clock signal CLK(0) and a clock input of the data source 10, and a connection is set up between the shifted clock signal CLK(0−τf) and the clock connections of the shift register 20. This is symbolized in FIG. 1 by an AND gate 11 in the clock supply line to the data source and an AND gate 21 in the clock supply line to the register 20. At the time t0, the first data item reaches the input of the first register stage FF#1 and is transmitted to the output of this stage with the next active edge of the clock signal CLK(0−τf). This edge appears a period of time q*T=0.167 T later than t0. After a further p=3 clock periods, the data item appears at the “selected” tap at the output of the register stage FF#4 and then appears, after the further delay τf, at the pad 50 at the time tn. The total delay time desired thus results from t0 to tn:
τg = 0.167*T + 3*T + 0.833*T = 4*T
FIG. 2B illustrates the case in which the clock frequency fc is twice as high as in the case of the diagram in FIG. 2A. The fixed delay time τf of the data path is now considerably longer than one clock period T, namely 1.666 T. Accordingly, the DLL 60 (in FIG. 1) causes the clock signal CLK(0−τf) to appear in a manner such that it has been shifted to the left by 1.666 T with respect to the reference clock signal CLK(0). The latency control logic unit 80 ascertains the integer value p=2 in accordance with Eq. (3) for n=4 and τf/T=1.666. The logic unit 80 thus uses the multiplexer 30 to select the tap at the output of the third stage FF#3 of the shift register 20. The value 0.333 is obtained for the fraction q in accordance with Eq. (4). In this case, too, the desired total delay time results (up to the rounding of the decimal values), but in a different distribution:
τg = 0.333*T + 2*T + 1.666*T ≈ 4*T
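The two timing cases can be retraced with a short Python sketch that locates the first active edge of the shifted clock after t0 and then adds the register and path delays. Several points are assumptions made only for this illustration: t0 is placed at time zero, the rounded figures 0.833 and 1.666 are taken to stand for the exact ratios 5/6 and 5/3, and the function name `arrival_time` is hypothetical. Exact fractions are used so that the result is not blurred by rounding.

```python
from fractions import Fraction
import math

def arrival_time(n, tau_f, T):
    """Time from t0 = 0 until the data item appears at the pad.

    Assumes tau_f/T is not a whole number (the boundary case is excluded).
    """
    p = math.floor(n - tau_f / T)        # Eq. (3)
    # Active edges of CLK(0 - tau_f) lie at k*T - tau_f for integer k.
    k = math.floor(tau_f / T) + 1        # smallest k whose edge lies after t0
    first_edge = k * T - tau_f           # equals q*T
    # FF#1 latches at first_edge, p further periods reach the selected tap,
    # and the fixed path delay tau_f follows.
    return first_edge + p * T + tau_f

tau_f = Fraction(5, 6)                               # fixed delay, arbitrary time unit
slow = arrival_time(4, tau_f, Fraction(1))           # tau_f/T = 5/6
fast = arrival_time(4, tau_f, Fraction(1, 2))        # doubled clock frequency
print(slow, fast)  # 4 2
```

In both cases the arrival time equals four clock periods of the respective clock: 4*1 for the slower clock and 4*(1/2) = 2 for the doubled frequency.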
The known circuit arrangement can be used to theoretically achieve a desired total delay time of exactly n*T given any desired values for T, n and τf, provided that T*n is not less than τf. However, one critical point in the known latency regulation method described is the latency control logic unit that ascertains the integer p in accordance with Eq. (3) above. This logic unit has decision-making problems when the quotient τf/T is equal to a whole number or comes very close to a whole number. Such a situation arises whenever the clock frequency fc has a value for which τf is equal to the clock period T=1/fc or to an integer multiple thereof, or comes very close to such values.
FIG. 2C illustrates this problem for the example of a frequency fc at which τf is only very slightly less than 1*T. In this case, the slightest instabilities in the clock frequency and/or in the elements of the latency control logic unit may cause τf/T to sometimes be assessed as being <1 and to sometimes be assessed as being >1. There is then the risk of the value of p that has been ascertained jumping in an undesired manner (between p=3 and p=2 in the case shown) which, for its part, leads to the latency temporarily jumping away from the desired value n. To reduce this risk, conventional latency control logic units contain complicated circuits to shift the supplied clock signals back and forth and thus incorporate hysteresis. However, practice has shown that, despite these measures, it has hitherto not been possible to completely eliminate the risk of latency jumps.
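The effect of such a misjudgment can be sketched numerically. The model below is a simplification under stated assumptions: the clock phase, and therefore q, is taken to follow the actual ratio τf/T, whereas the logic unit derives p from a slightly different assessed ratio. The function name `delivered_latency` and the ratios 0.999 and 1.001 are hypothetical values chosen for illustration.

```python
from fractions import Fraction
import math

def delivered_latency(n, ratio_actual, ratio_assessed):
    """Total delay in clock periods when p is derived from an assessed tau_f/T.

    ratio_actual   -- true value of tau_f/T (fixes the clock phase, i.e. q)
    ratio_assessed -- value of tau_f/T as judged by the latency control logic
    """
    p = math.floor(n - ratio_assessed)                          # Eq. (3), assessed
    q = (n - ratio_actual) - math.floor(n - ratio_actual)       # Eq. (4), actual
    return q + p + ratio_actual                                 # Eq. (5), in units of T

actual = Fraction(999, 1000)                                    # tau_f slightly below 1*T
print(delivered_latency(4, actual, Fraction(999, 1000)))        # 4
print(delivered_latency(4, actual, Fraction(1001, 1000)))       # 3
```

In this model a deviation of 0.2 percent in the assessed ratio is enough to change p from 3 to 2 and thus to shorten the delivered latency by a full clock period.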