The present invention relates generally to accessing data from a synchronous data source and more particularly to compensating for skew in signal paths to and from a synchronous data source.
A major concern in the design and manufacture of electronic devices is the latency, or skew, between the transmission and reception of a signal resulting from the physical characteristics of the device and/or the transmission medium. Skew is of particular concern in the development of synchronized systems utilizing a common clock signal. Skew in the clock signal paths, the data signal paths, and/or the control signal paths in synchronized systems, if significant and/or variable, can result in an erroneous operation of the synchronous system. To illustrate, known synchronous memory access systems typically include a memory controller utilized to read data from a synchronous memory device, such as a synchronized dynamic random access memory (SDRAM). To access data from the memory matrix of the synchronous memory device, the memory controller typically provides address and control information generated by a command generation module to the memory matrix via a control path that generally includes an off-chip driver (OCD), an address/control interconnect, and an input receiver (IR). Likewise, a clock signal generated by a clock generator (denoted as an original clock signal) is provided via a clock signal path that typically includes an OCD, a clock interconnect, and an IR. Based on the address/control information, the memory device retrieves the requested stored data from the memory matrix and provides a data signal representative of the stored data to an input sampling module of the memory controller via a data path that generally includes an OCD, a data interconnect, and an IR. The input sampling module samples the data signal using the original clock signal to latch the stored data, processes the data as appropriate, and then provides the data to its appropriate destination.
Synchronous memory devices, such as the above memory device, often are implemented to minimize or avoid the overhead (e.g., handshaking and wait-states) common to asynchronous devices. By using a common clock signal and a predetermined memory access latency, a memory controller generally can send a request for data to a memory device and then sample the output from the memory device after a predetermined number of clock cycles to obtain the stored data from the memory device. For example, assume that the time between the receipt of a read request by the memory device and the output of the requested data by the memory device is one clock cycle (this time herein referred to as the memory access latency). The input sampling module, knowing that the requested data will be on the data interconnect no later than one cycle after a read request is transmitted, can sample the data signal on the data interconnect at the next rising edge (i.e., one cycle later) to obtain the requested data from the memory device.
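The fixed-latency read described above can be sketched as a toy model. This is only an illustration under ideal, skew-free assumptions; the class name SyncMemory, the function controller_read, and the address and data values are hypothetical and do not appear in the original description.

```python
class SyncMemory:
    """Toy synchronous memory: data appears `latency` cycles after a read request."""

    def __init__(self, contents, latency=1):
        self.contents = contents   # address -> stored data
        self.latency = latency     # memory access latency, in clock cycles
        self.pending = {}          # cycle at which data is on the bus -> data

    def read(self, cycle, address):
        # The request is received at `cycle`; the data is driven `latency` cycles later.
        self.pending[cycle + self.latency] = self.contents[address]

    def data_bus(self, cycle):
        # Value present on the data interconnect at the rising edge of `cycle`.
        return self.pending.get(cycle)


def controller_read(memory, address, issue_cycle=0):
    """Issue a read, then sample the bus a predetermined number of cycles later."""
    memory.read(issue_cycle, address)
    sample_cycle = issue_cycle + memory.latency   # no handshaking or wait-states
    return memory.data_bus(sample_cycle)


mem = SyncMemory({0x10: 0xAB}, latency=1)
print(hex(controller_read(mem, 0x10)))
```

The controller never polls or waits for an acknowledgement; it relies entirely on the shared clock and the known latency, which is why unaccounted skew in the signal paths undermines the scheme.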
However, it will be appreciated that the skew in the signal paths of many such systems can be relatively large and/or can vary greatly compared to the clock signal period. In addition to on-chip delays, skew can result from, for example, the operation of off-chip drivers (OCDs) that are used to drive their respective signals on interconnects to input receivers (IRs). Further, skew may result from the transmission of electrical signals over the physical mediums (e.g., printed circuit board (PCB) traces) of the interconnects. Delays introduced by an OCD, an IR, and an interconnect are herein referred to as tOCD, tIR and tPCB, respectively.
Although the delays resulting from each of the OCDs, IRs, and interconnects sometimes are not large enough to individually affect the operation of the input sampling module of the memory controller, the sum of their delays often can result in timing problems, thereby resulting in sampling errors, even at relatively slow clocking frequencies. Likewise, although the variance in the delay of each of the OCDs, IRs, and interconnects may be relatively insignificant, the resulting combination of variances of the elements of a signal path also can result in sampling errors.
To illustrate, assume that the original clock signal is transmitted over the clock signal path (i.e., an OCD, a clock interconnect, and an IR) in the known system described above. Accordingly, the skew (tskew) between the original clock signal generated by the clock generator and the clock signal received by the memory matrix is substantially equivalent to the sum of the individual delays of the OCD, the IR, and the clock interconnect, or tskew = tOCD + tIR + tPCB. If this sum (tskew) has a possible range that exceeds the clock cycle time or varies considerably during operation, the input sampling module of the memory controller typically cannot reliably know at which clock edge to sample the data signal from the memory device.
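The ambiguity described above can be illustrated with a short sketch. It assumes a simplified model in which data delayed by a given skew first becomes sampleable at the rising edge at or after that delay; the function names and the numeric skew values are hypothetical and chosen only for illustration.

```python
import math


def candidate_sampling_edges(skew_min_ns, skew_max_ns, period_ns):
    """Return the clock-edge indices at which delayed data could first be sampled.

    Simplified model (an assumption): a signal delayed by `skew` is first
    sampleable at edge number ceil(skew / period).
    """
    first = math.ceil(skew_min_ns / period_ns)
    last = math.ceil(skew_max_ns / period_ns)
    return list(range(first, last + 1))


def sampling_is_unambiguous(skew_min_ns, skew_max_ns, period_ns):
    """True when every possible skew value maps to the same clock edge."""
    return len(candidate_sampling_edges(skew_min_ns, skew_max_ns, period_ns)) == 1


# Hypothetical skew range of 1.0 ns to 4.0 ns.
print(candidate_sampling_edges(1.0, 4.0, 5.0))  # slow clock: a single candidate edge
print(candidate_sampling_edges(1.0, 4.0, 3.0))  # faster clock: more than one candidate edge
```

With the 5 nS period the whole skew range falls before the same edge, so the sampling edge is known in advance. With the 3 nS period the range straddles an edge, and the controller cannot tell which edge carries valid data.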
Accordingly, the cycle time of the original clock signal often is enlarged to minimize or prevent unreliable signal sampling by the input sampling module. Different memory systems have different associated delays. For example, a DDR SDRAM memory has an internal delay lock loop (DLL) compensating for its IR and OCD delay. When using such memory, the uncompensated delays result from the OCD of the memory controller, the clock signal transmission time over the clock interconnect, data signal transmission time over the data interconnect, and the memory controller's IR delay. For such systems, when using a single synchronous clock, the clock cycle time (T) must be greater than the sum of the above elements (i.e., T>tOCD+tIR+2tPCB). Of course, in order to guarantee reliable operation other factors should also be taken into account when doing the time budgeting, such as clock period uncertainty, PCB related jitter, setup requirements, etc.
Another important factor to be considered is the difference between clock and data loading. A clock signal typically goes to all memory devices. The data lines are typically less loaded than the clock line, because several thin memory components can be used to assemble a wide memory system. In high-speed memories, the data lines are usually point-to-point, whereas the clock net is loaded by several memory devices. Therefore, the uncertainty in the OCD delay and interconnect transmission delay for the data lines typically is smaller than the uncertainty for the clock line (and for the control lines). For stub series terminated logic (SSTL) class II drivers (one example of an OCD), the tOCD typically can vary between 0.5 nanosecond (nS) and 3 nS, whereas the tIR value for SSTL class II inputs (one example of an IR) can range from 0.1 nS to 0.9 nS. Likewise, PCB traces (one example of an interconnect) typically have a tPCB value that ranges between 75 picoseconds (pS) and 500 pS for lengths from one-half inch to three inches. Using these exemplary values, the value of (tOCD+tIR+2tPCB) can vary from 0.75 nS to 4.9 nS. Adding other factors that contribute to timing uncertainty, such as setup requirements, clock jitter, etc., totaling 1 nS in this example, the minimum cycle time for the original clock signal is at least 5.9 nS, corresponding to a maximum clock frequency of about 170 megahertz (MHz). Since the system clock frequency of many types of computing systems, such as communication and graphics systems, exceeds this 170 MHz limit, a different approach to using synchronous memories is desirable.
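The timing budget above can be reproduced with a few lines of arithmetic. The following sketch uses only the exemplary delay ranges quoted in the text; the constant and function names are hypothetical.

```python
# Exemplary delay ranges from the text, as (minimum, maximum) in nanoseconds.
T_OCD_NS = (0.5, 3.0)     # SSTL class II off-chip driver
T_IR_NS = (0.1, 0.9)      # SSTL class II input receiver
T_PCB_NS = (0.075, 0.5)   # PCB trace, one-half inch to three inches
T_MARGIN_NS = 1.0         # setup requirements, clock jitter, etc.


def path_delay_range(t_ocd, t_ir, t_pcb):
    """Range of tOCD + tIR + 2*tPCB (clock trace out plus data trace back)."""
    low = t_ocd[0] + t_ir[0] + 2 * t_pcb[0]
    high = t_ocd[1] + t_ir[1] + 2 * t_pcb[1]
    return low, high


def max_clock_mhz(worst_delay_ns, margin_ns):
    """Highest clock frequency whose period still covers the worst-case delay plus margin."""
    return 1000.0 / (worst_delay_ns + margin_ns)


low, high = path_delay_range(T_OCD_NS, T_IR_NS, T_PCB_NS)
print(f"delay range: {low:.2f} nS to {high:.2f} nS")
print(f"minimum cycle time: {high + T_MARGIN_NS:.2f} nS")
print(f"maximum clock frequency: {max_clock_mhz(high, T_MARGIN_NS):.1f} MHz")
```

The worst-case sum, not the typical one, sets the cycle time, so the wide spread in tOCD dominates the budget in this example.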
Accordingly, mechanisms attempting to compensate for the effects of skew while preserving high clock frequencies have been developed, a number of which are discussed below. One known implementation for minimizing the effects of skew utilizes a clock generator that is external to both the memory device and the memory controller. In this case, the clock generator provides an original clock signal to both the memory device and the memory controller. In the event that the delay in the signal path between the clock generator and the memory device is substantially equivalent to the delay in the signal path between the clock generator and the memory controller, the memory device and the memory controller typically would be synchronized to the same clock signal, effectively negating the OCD, IR, and interconnect delays of the clock signal path. However, these same delays exist in the address/control path and the read data path, thereby causing timing errors at the memory device due to the relative skew between the address/control/read signals and the common clock signal. As a result, varying or unknown skew in the control/address signals introduces uncertainty in the sampling of those signals by the memory device. Likewise, variable and/or unknown skew in the read data path between the memory device and the memory controller also introduces uncertainty in the sampling of read data from the memory device by the memory controller.
Another known system for minimizing the effects of skew implements a phase lock loop (PLL) and an additional IR. In this known implementation, the original clock signal from the clock generator of the memory controller is provided to the PLL as a reference clock signal and the output clock signal to the memory is supplied as feedback to the PLL via a signal path that simulates the clock signal path between the memory controller and the memory device.
Using this feedback clock signal, the PLL typically attempts to ensure that the phase of the skewed clock signal used by the memory device and the original clock signal used by the input sampling module are the same by providing a corrected clock signal, effectively negating the OCD, IR, and interconnect delays for the clock signal path. However, as with the above known implementation, delays still exist in the address/control path and the read data path, thereby causing timing errors at the memory device due to the relative skew between the address/control/read signals and the corrected clock signal. Advancing the clock signal by the OCD, IR, and interconnect delays to compensate for the relative skew typically fails since the sampling timing at the input of the memory device then is misadjusted relative to the address and control signals, resulting in uncertainty in sampling of the address and/or control signals.
Yet another known implementation utilizes resampling in an attempt to overcome problems resulting from clock skew. In this case, the memory controller of the known system typically includes a resampling module to resample the output from the input sampling module of the memory controller using the original clock signal. As with the above known implementation, an additional IR generally is utilized to provide a skewed clock signal to the input sampling module that is substantially equivalent to the skewed clock signal received by the memory device. Accordingly, the input sampling module can more reliably sample the data signal provided by the memory device using the skewed clock signal (disregarding the delays introduced over the data signal path).
While the data can be more reliably sampled by the input sampling module using the skewed clock signal, it will be appreciated that the data is sampled in the skewed clock domain rather than in the original clock domain used by the remainder of the memory controller. Accordingly, known solutions implement the resampling module to resample the once-sampled data to convert the data from the skewed clock domain to the original clock domain using the original clock signal as the timing reference. While this generally is successful when the signal delays and/or variance are relatively minor, the resampling module experiences the same unreliable sampling issues when the control signal delays are as great as or greater than the time period of the clock signal due to the inherent delays in the control signal path, since it may be unclear as to which read command a set of sampled data is associated with. When this delay is relatively significant, it can affect the timing of the resampling module and, therefore, cause any resampled data to be unreliable.
Accordingly, a system and a method for improved synchronization during a data access from a synchronous data source would be advantageous.