1. Field of the Invention
The invention is related to source synchronous techniques, and in particular to the addition of delay circuitry to a chip that uses source synchronous techniques to improve testability of the chip.
2. Description of the Related Art
Strobe signals are clock signals that are transmitted with data signals, either simultaneously or after a predetermined delay. The strobe signal is used to time-synchronize data appearing as input signals at a receiver from a driver (transmitter). The use of the strobe signal to indicate when data should be sampled avoids using a common clock that is sent to both driver and receiver. If a common clock is used, then the skew between the two versions of the clock (transmitter and receiver) must be added to the time that each bit is driven from the driver, slowing the transfer down. Sending the clock along with the data eliminates this skew by using the transmitter's clock both to send the data and to send the strobe. Data transfers are referred to as source synchronous when the clock (or strobe) signal that latches the data is supplied by the same chip (a driver) that is driving the data. With source synchronous data transfers, the same process, temperature, and voltage variations affect both the data and clock timings, and a multi-chip system does not need additional timing margin to account for independent variation in these variables along the clock and data paths.
Source synchronous I/O techniques permit very high bandwidth per chip pin. Usage of these techniques, however, is limited because of the difficulty in testing such circuits. Source synchronous circuits are difficult to test because: (1) they operate very fast, requiring great accuracy in tester edge placement (an edge being a high-to-low or low-to-high transition in a digital signal); and (2) critical output timings are measured from one output pin to another, rather than from a clock input pin to an output pin.
The need for high precision edge placement leads to the use of very expensive testers. Testers are designed to place and measure edges with respect to a clock signal that the tester provides to the chip being tested. This restriction greatly simplifies the design of the tester, but also greatly complicates measuring the timing parameters that are critical to source synchronous outputs. For a centered clocking driver, these parameters include the time after the strobe during which data is valid (T.sub.va) and the time before the strobe during which data is valid (T.sub.vb). T.sub.vb is to be compared to the setup time T.sub.setup of the receiver (the required short time of stability before an active clock edge), and T.sub.va is to be compared to the hold time T.sub.hold of the receiver (the required short time of stability after the active clock edge). Large values are desirable for T.sub.va and T.sub.vb, which are related to the minimum and maximum output delay.
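The comparison of T.sub.vb against T.sub.setup and of T.sub.va against T.sub.hold can be illustrated with a minimal sketch. The function name, the `Margins` type, and the picosecond values are hypothetical and chosen only for illustration; they are not taken from any particular device.

```python
from dataclasses import dataclass


@dataclass
class Margins:
    """Slack left over after the receiver's requirements are met (assumed unit: ps)."""
    setup_margin_ps: float
    hold_margin_ps: float

    @property
    def ok(self) -> bool:
        # The transfer is considered safe only if neither margin is negative.
        return self.setup_margin_ps >= 0 and self.hold_margin_ps >= 0


def timing_margins(t_vb_ps, t_va_ps, t_setup_ps, t_hold_ps) -> Margins:
    """Compare driver valid times (T_vb, T_va) with receiver T_setup and T_hold."""
    return Margins(
        setup_margin_ps=t_vb_ps - t_setup_ps,  # data valid before strobe vs. setup
        hold_margin_ps=t_va_ps - t_hold_ps,    # data valid after strobe vs. hold
    )


# Hypothetical numbers, for illustration only.
m = timing_margins(t_vb_ps=400, t_va_ps=350, t_setup_ps=150, t_hold_ps=100)
print(m, m.ok)
```

Larger T.sub.va and T.sub.vb values translate directly into larger margins in this model, which is why large values are desirable.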
Source synchronous data transfers may be effected with either a coincident clocking signal 10 or a centered clocking signal 12, as shown in FIG. 1. For both, it is desirable for a data signal 16 to be strobed in the centers 20 of the respective valid windows or cells 22. In other words, it is desirable to have the rising and falling edges of a strobe signal be time coincident with the centers 20. Both edges 14a and 14b of a strobe 14 are used to sample the data 16 in adjacent cells 22, as illustrated in FIG. 1. To do this with coincident clocking 10, the receiver delays the incoming strobe 14 by one-quarter of a clock (strobe) cycle to properly latch (sample) the data 16 being received. On the other hand, with centered clocking 12, the driver (not shown) needs to delay an outgoing strobe 18 by a quarter of a clock (strobe) cycle to sample with both edges 18a and 18b. In FIG. 1, the strobe 18 is shown already delayed by a quarter clock cycle. Coincident clocking 10 thus offers better driver power supply noise correlation between the strobe 14 and the data 16, while centered clocking 12 allows for a much simpler receiver.
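The quarter-cycle relationship can be checked with a short sketch. It assumes, for illustration only, that each data cell is half a strobe period wide (because both strobe edges sample data) and that cell boundaries line up with the undelayed strobe edges. The function names and the period value are hypothetical.

```python
def cell_centers(period, n_cells):
    """Centers of consecutive data cells, each assumed to be half a period wide."""
    cell = period / 2
    return [i * cell + cell / 2 for i in range(n_cells)]


def strobe_edges(period, n_edges, delay=0.0):
    """Times of successive strobe edges (rising and falling), shifted by `delay`."""
    return [i * period / 2 + delay for i in range(n_edges)]


period = 4.0  # hypothetical strobe period, arbitrary time units

centers = cell_centers(period, 4)
coincident = strobe_edges(period, 4)                  # edges on cell boundaries
delayed = strobe_edges(period, 4, delay=period / 4)   # quarter-cycle delay applied

print("cell centers:     ", centers)
print("coincident edges: ", coincident)
print("delayed edges:    ", delayed)
```

Under these assumptions the delayed edges land on the cell centers, regardless of whether the quarter-cycle delay is applied in the receiver (coincident clocking) or in the driver (centered clocking).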
Referring to FIG. 2a, a coincident clocking transmitter and receiver system 8, which may be located on a semiconductor device, is shown. A transmitter 33 is a simple circuit, and the inclusion of a delay-locked loop (DLL) 32 in a receiver 30 compensates for a distribution delay (for instance, with an RC circuit) to a flip-flop 34 or a plurality of such flip-flops (i.e., because an RC distribution network 38 is included in the DLL 32 in the receiver 30) at the same time that it generates a 90.degree. phase shift of the incoming strobe signal 14, as will be discussed below. On the other hand, FIG. 2b shows a centered clocking transmitter and receiver system 9, which may also be located on a semiconductor device. In the system 9, a receiver 31 is a simple circuit, but, in contrast to the system 8, the RC distribution delay for propagating a strobe signal 40' (.apprxeq.14') within the receiving chip 31 is not compensated because an RC distributed network 38' (like the one in the receiver 31) is not included in the DLL 32' in the transmitter 35.
The DLL 32 having the RC distribution network 38 in the receiver 30 and the DLL 32' not having the RC distribution network 38' in the transmitter 35 contribute, among other factors, to differences between analogous signals 50a and 50a', 50b and 50b', 14 and 14', and 40 and 40' within the respective DLL's 32 and 32', although both systems 8 and 9 will produce ideally substantially the same signal 40.
Referring again to FIG. 2a, the DLL 32 is coupled to a clock input port 37 of a latch 34, for example, a flip-flop (FF), and controls latching of the data signal 16, which is input to the latch 34 through a data input port 39 of the latch 34. The solid dots above and below the latch 34 indicate that there may be more than one latch 34 coupled to the DLL 32 (and 32') to receive delayed strobe signals, as will be discussed below. Similarly, solid dots above and below a latch 41 in the transmitter 33 (and 35) indicate that there may be more than one FF 41 coupled to the receiver 30 (and 31).
Although most of the following discussion is framed in terms of the coincident clocking signal 10, it should be understood that the concepts involved apply equally well to driver circuits (e.g., the transmitter 35 in FIG. 2b) that are used to generate the centered clocking signal 12 for source synchronous data transfer. The DLL 32 receives as an input signal the strobe signal 14 and includes a delay line 36, the distributed RC network 38, another distributed RC network 44, a delay line 42, a phase detector (PD) 48, and a filter (e.g., an RC low pass filter) 52, as shown in FIG. 2a. The distributed RC network 44 is built to approximately match the distributed RC network 38. Likewise, the delay line 42 is built to approximately match the delay line 36. This is done to have the delay from signals 14 to 40 be approximately the same as the delay from signals 40 to 46. A disadvantage to the centered clocking approach is that the delay across the network 38' cannot be fully compensated by just including a network like the network 44 in the driver 35, because the RC distribution network 38' and the network like the network 44 would no longer be in the same chip, subject to the same process, voltage, and temperature variations. The DLL 32 is used to delay the strobe (or clock) 14 (i.e., the edges 14a and 14b) to provide a centered clock similar to the centered strobe signal 12 (see FIG. 1). The delay will enable the data 16 to be sampled (latched) by the latch 34 in the centers 20 of their respective valid windows 22. This ensures optimum (i.e., short) setup and hold times for the receiving latch 34.
In an alternative implementation (not shown) having no DLL, it is possible to use a falling clock edge to drive the strobe signal from a transmitter. This alternative centered clocking driver implementation is compatible with the receiver in the system 9. Elimination of the DLL entirely, however, reduces the controllability of timing for source synchronous I/O techniques, and adds a dependency on clock duty cycle.
Referring to the operation of the DLL 32 (FIG. 2a) in more detail, the DLL 32 receives the strobe signal 14, delays it through the delay line 36, and then delivers (distributes) it to the data latch 34 through the distributed RC network 38. Distribution by the network 38 results in a delayed signal DlyStb 40, which is used to clock the data latch 34. The DlyStb signal 40 is fed back through the delay line 42 and the distributed RC network 44, which is similar to the distributed RC network 38 used in distributing the DlyStb signal 40. Although the networks 38 and 44 are discussed herein as distributed RC networks, it is understood that they could be any network that propagates a signal from an input to a plurality of outputs like latches 34 with a predictable delay, and may include active elements. The output signal of the delay line 42 is delivered (distributed) by the network 44 and results in a feedback strobe (FbStb) signal 46, which is input to the PD 48 for comparison to the original strobe 14, which is also input to the PD 48. If FbStb 46 arrives at the PD 48 before the next edge (e.g., the next edge 14a in FIG. 1) of the strobe signal 14, then the PD 48 outputs one value (e.g., low) of a voltage 50a, which is then filtered by the RC low pass filter 52 to a control voltage V.sub.cntl 50b. The V.sub.cntl 50b is then used to slow down (increase the delay in) the delay lines 36 and 42. On the other hand, if FbStb 46 arrives at the PD 48 after the next edge (e.g., the next edge 14a) of the strobe 14, then the PD 48 outputs another value (e.g., high) of the voltage 50a, which is filtered to a new value of the control voltage V.sub.cntl 50b. The new value of the V.sub.cntl 50b then reduces the delay of the delay lines 36 and 42.
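The feedback behavior described above can be approximated by a simple behavioral model. This is a sketch under stated assumptions, not a circuit-accurate description: the matched delay lines 36 and 42 are modeled as one shared variable, the matched networks 38 and 44 as one fixed delay, the filtered control voltage as a fixed-size step per comparison, and the "next edge" of the strobe is assumed to occur half a period after the edge being tracked. All names and numeric values are hypothetical.

```python
def simulate_dll(period, rc_delay, step, iterations=500):
    """Bang-bang model of the loop; returns the strobe-to-DlyStb delay at the end."""
    line_delay = 0.0      # delay of each matched delay line (36 and 42)
    target = period / 2   # assumed arrival time of the next strobe edge
    for _ in range(iterations):
        # FbStb has passed through both delay lines and both RC networks.
        fb_delay = 2 * (line_delay + rc_delay)
        if fb_delay < target:
            # FbStb arrived early: the control voltage slows both delay lines.
            line_delay += step
        else:
            # FbStb arrived late: the control voltage speeds both delay lines up.
            line_delay = max(0.0, line_delay - step)
    return line_delay + rc_delay


forward = simulate_dll(period=4.0, rc_delay=0.3, step=0.01)
print(f"strobe-to-DlyStb delay: {forward:.3f} (quarter period is {4.0 / 4:.3f})")
```

In this model the loop dithers around the point at which the total feedback delay equals half a period, so the delay from the strobe to DlyStb settles near a quarter period with the fixed RC delay absorbed into the total.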
By driving the delay lines 36 and 42 with V.sub.cntl 50b in such a way that the FbStb 46 and strobe 14 signals are aligned in time, the DlyStb signal 40 (i.e., its rising edge) will be positioned in time halfway between the edges (e.g., the edges 14a and 14b in FIG. 1) of the strobe signal 14 due to the matching circuitry (i.e., delay lines and RC distributed networks 36, 38 and 42, 44) used to generate the DlyStb 40 and the FbStb 46 signals. This ensures that the data latch 34 samples the data 16 in the centers 20 of the data valid windows 22 (see FIG. 1), independent of process, temperature, and low frequency voltage changes. Under some circumstances, however, it may be desirable to start with the edges of the DlyStb 40 centered (nominally) and walk to an earlier time point relative to the centers 20 of the valid windows 22 by appropriate adjustment in the value of the V.sub.cntl 50b. Such circumstances, which move the DlyStb 40 signal around, may include: (1) for testing purposes, determining how far the DlyStb 40 edges can be moved, toward either the leading or the trailing edge of the window 22, before system breakdown or failure occurs; and (2) for debugging purposes, if setup time is longer than hold time, an adjustment could conceivably be made in which the DlyStb 40 edge is positioned toward the back edge of the window 22.
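Walking the DlyStb edge away from the center to find the failure points can be sketched as a simple sweep. The model assumes an ideal valid window of fixed width centered at offset zero and a latch that fails as soon as its setup or hold requirement is violated. The function name and all picosecond values are hypothetical.

```python
def passing_offsets(window_ps, t_setup_ps, t_hold_ps, step_ps=10):
    """Sweep the sampling edge across the valid window.

    Offset 0 is the window center; negative offsets are toward the leading edge.
    Returns (earliest, latest) passing offsets, or None if no offset passes.
    """
    half = window_ps // 2
    passing = [
        x
        for x in range(-half, half + 1, step_ps)
        # Time from window start to sample must cover setup; sample to window end must cover hold.
        if (x + half) >= t_setup_ps and (half - x) >= t_hold_ps
    ]
    if not passing:
        return None
    return passing[0], passing[-1]


# Hypothetical receiver whose setup time is longer than its hold time.
earliest, latest = passing_offsets(window_ps=500, t_setup_ps=150, t_hold_ps=50)
best = (earliest + latest) // 2
print(f"passes from {earliest} ps to {latest} ps; midpoint of passing range: {best} ps")
```

With these illustrative numbers the passing range is not symmetric about the window center, and its midpoint lies toward the back edge of the window, which corresponds to the debugging adjustment described in item (2).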
Although edge placement may be possible using the conventional techniques set forth above, these techniques may suffer from not offering enough controllability of the delay lines and complications associated with measuring timing parameters that are critical to source synchronous inputs. Therefore, a technique that provides additional delay line controllability and reduces difficulties associated with testing chips used in source synchronous data transfer would be useful.