As computers and computer processors increase in performance, memory access performance becomes a significant factor affecting overall system performance. If an interface that communicates data between a memory device and a memory controller or other application device operates more slowly than a processor can use data, the interface can reduce the data processing capacity of the entire computer. For dynamic random access memory (DRAM) devices, which are commonly used as the main working memory for a computer, various interconnect technologies have been developed over the years. One such interconnect technology is used for synchronous DRAMs, or SDRAMs, which utilize a source synchronous interface, in which the source of data during a data transfer is relied upon to provide a data strobe signal (DQS) that the target of the data transfer uses to capture the data as it is transferred over a data line to the target. In particular, data on a data line is typically latched on the rising or falling edge of the DQS signal, so that the value transmitted on the data line when the data strobe signal transitions from low to high, or vice versa, is latched into a data latch in the target.
DRAM memory elements, such as double data rate (DDR) memory elements, contain multiple buses. A command and address bus is formed by a number of signals, for example, column-address strobe (CAS), row-address strobe (RAS), write enable (WE), clock enable (CKE), chip-select (CS), address (ADDR), and bank address (BA) signals, along with differential clock signals (CK and CKN). DDR3 memory elements operate with differential data strobe signals DQS and DQSN, which enable source synchronous data capture at twice the clock frequency. The data bus contains the data signals (DQ) and the source synchronous data strobe signals. Data is registered with the rising edges of both DQS and DQSN.
DDR3 data is transferred in bursts for both read and write operations, sending or receiving a series of four (referred to as burst chop 4 or BC4) or eight (referred to as burst length 8 or BL8) data words with each memory access. For read operations, data bursts of various lengths are transmitted by the DRAM device edge-aligned with a data strobe signal. For write operations, data bursts of various lengths are received by the DRAM element with a 90-degree phase-delayed data strobe signal. The data strobe signal is a bidirectional signal used to capture data. After the data is captured in the source-synchronous data strobe domain, the data must be transferred into a local clock domain.
For dual in-line memory modules or DIMMs, the DDR3 memory specification includes what is commonly known as a “flyby” topology for clock, address and control connections that are shared between all of the DRAM devices on the DIMM. As opposed to the balanced tree arrangement used in DDR2 memory, which provides clock, address and control signal lines of approximately the same length to each memory element in a memory module at the expense of signal integrity, the flyby topology is arranged to promote signal integrity and results in clock, address and control signal lines of different lengths for each device within the module. Consequently, the timing relationships between the data, data strobe and clock signals can vary undesirably from one memory element in the DIMM to another. In some instances, the difference in propagation delay between two DQS lines, measured relative to a common clock signal line, may be more than a single clock period. Since the DDR3 SDRAM devices require a specific timing relationship between the data, data strobe and clock at the respective DRAM pins, the DDR3 specification supports an independent timing calibration procedure known as “write leveling” or “write level training” for each source synchronous group.
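The effect of the flyby routing on timing can be illustrated with a small calculation. The following is a minimal sketch only: the trace lengths, the propagation delay per millimeter and the clock period are hypothetical values chosen for illustration, and the function name flyby_skew_ps is likewise an assumption rather than anything defined by the DDR3 specification.

```python
# All numeric values below are assumed for illustration; none come from a datasheet.
PS_PER_MM = 7     # assumed trace propagation delay, picoseconds per millimeter
TCK_PS = 1250     # assumed clock period in picoseconds (800 MHz clock)


def flyby_skew_ps(ck_lengths_mm, dqs_lengths_mm):
    """Return CK arrival time minus DQS arrival time at each device, in ps."""
    return [(ck - dqs) * PS_PER_MM
            for ck, dqs in zip(ck_lengths_mm, dqs_lengths_mm)]


# CK is daisy-chained from device to device, so its routed length grows
# along the module; each DQS line is routed point-to-point.
ck_mm = [60, 100, 140, 250]
dqs_mm = [50, 50, 50, 50]

skews = flyby_skew_ps(ck_mm, dqs_mm)
spread = max(skews) - min(skews)
print(skews)                 # per-device CK-to-DQS skew in ps
print(spread > TCK_PS)       # whether the spread exceeds one clock period
```

With these assumed lengths, the skew spread between the first and last device exceeds one clock period, which is the situation in which a per-group calibration becomes necessary.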
Write leveling allows the host logic circuitry that initiates the data transfers (e.g., a core logic portion of an ASIC) to configure the interface that communicates data between the memory controller and the target DRAM to delay the data strobe signal and data signals by a configurable or controllable amount of time. A free-running clock signal (CK) is propagated from the host logic circuitry to an input of a dedicated internal calibration register in a target DRAM. In the write leveling procedure, a single data strobe signal (DQS) pulse is propagated from the host logic circuitry to the dedicated internal calibration register in the DRAM. In response to this single DQS signal pulse, the dedicated internal calibration register outputs a signal that indicates the phase alignment between the clock signal and the rising edge of the data strobe signal at the DRAM pins. This output is propagated back to the memory controller on one or more of the data signal (DQ) lines. The calibration is repeated with different delay values until an optimum write leveling delay is determined. The host logic circuitry can then configure the interface to use the optimized write leveling delay for subsequent write operations from the memory controller to the target DRAM.
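The repeated calibration described above can be sketched as a delay sweep. This is a simplified behavioral model under stated assumptions, not the procedure of any particular controller: the flight times, the step size and the function names ck_level_at_dqs_edge and write_level are hypothetical, the clock is assumed to have a 50% duty cycle, and the optimum delay is taken to be the first delay at which the fed-back sample changes from 0 to 1.

```python
def ck_level_at_dqs_edge(dqs_arrival_ps, ck_arrival_ps, tck_ps):
    """Model the calibration register: CK level (1 high, 0 low) seen at the
    DQS rising edge. ck_arrival_ps is when a CK rising edge reaches the DRAM."""
    phase = (dqs_arrival_ps - ck_arrival_ps) % tck_ps
    return 1 if phase < tck_ps // 2 else 0


def write_level(ck_flight_ps, dqs_flight_ps, tck_ps, step_ps, max_delay_ps):
    """Sweep the DQS delay and return the first delay at which the sample
    fed back on DQ changes from 0 to 1, or None if no transition is found."""
    previous = None
    for delay in range(0, max_delay_ps + 1, step_ps):
        level = ck_level_at_dqs_edge(dqs_flight_ps + delay, ck_flight_ps, tck_ps)
        if previous == 0 and level == 1:
            return delay          # DQS edge now coincides with a CK rising edge
        previous = level
    return None


# Hypothetical group: CK takes 900 ps to reach the DRAM, DQS takes 300 ps.
print(write_level(ck_flight_ps=900, dqs_flight_ps=300,
                  tck_ps=1250, step_ps=10, max_delay_ps=2500))
```

All times are integers in picoseconds so that the modulo arithmetic is exact. In hardware the host never sees the flight times directly; it only observes the 0 or 1 returned on the DQ lines for each trial delay.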
The value of the optimum write leveling delay is determined in the manner described above for each source synchronous group. However, in an instance in which the write-leveling logic is capable of delaying the data strobe signal by as much as two full clock cycles, it cannot be determined by examining the optimum write leveling delay value alone whether, during a write operation initiated by the host logic circuitry, the data strobe signal of each source synchronous group will be aligned with the same rising edge of the clock signal as the data strobe signals of other source synchronous groups. That is, it cannot be determined from examining the write leveling delay value alone whether applying the write leveling delay to the data strobe signal would align the data strobe signal with the first rising edge of the clock signal following that data strobe signal or the second rising edge of the clock signal following that data strobe signal. For correct operation, the data strobe signals of all source synchronous groups must be aligned to the same edge of the clock signal relative to the initiation of those data strobe signals.
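The ambiguity can be made concrete with a simplified model. In the sketch below, the flight times, the clock period and the function names feedback and aligned_edge are hypothetical, and the edge index is computed from flight times that the host logic circuitry does not actually know, which is precisely why the delay value alone is not sufficient.

```python
def feedback(delay_ps, dqs_flight_ps, ck_flight_ps, tck_ps):
    """CK level (1 high, 0 low) sampled by the delayed DQS rising edge."""
    phase = (dqs_flight_ps + delay_ps - ck_flight_ps) % tck_ps
    return 1 if phase < tck_ps // 2 else 0


def aligned_edge(delay_ps, dqs_flight_ps, ck_flight_ps, tck_ps):
    """Index of the CK rising edge at or just before the delayed DQS edge
    (0 = first edge, 1 = second edge), using flight times hidden from the host."""
    return (dqs_flight_ps + delay_ps - ck_flight_ps) // tck_ps


TCK = 1250  # assumed clock period in picoseconds

# Within one group, two delays one clock period apart give identical feedback
# but land on different clock edges.
for delay in (600, 1850):
    print(delay, feedback(delay, 300, 900, TCK), aligned_edge(delay, 300, 900, TCK))

# Across two groups, the same delay value can correspond to different edges
# when the DQS flight times differ by a full clock period.
print(aligned_edge(600, 300, 900, TCK), aligned_edge(600, 1550, 900, TCK))
```

Because the sampled clock level is periodic in the delay, the feedback signal is the same in each of these cases even though the data strobe signal lands on a different clock edge.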
In an instance in which the write-leveling logic is capable of delaying the data strobe signal by as much as two full clock cycles, it is desirable to determine to which of the two rising clock edges each data strobe signal would be aligned as a result of applying the write leveling delay determined by the write-leveling procedure. If the results of the write leveling procedure indicate that the write leveling delay to be applied to the data strobe signal of one source synchronous group would align that data strobe signal with the first rising edge of the clock signal but indicate that the write leveling delay to be applied to the data strobe signal of another source synchronous group would align that data strobe signal with the second rising edge of the clock signal, then the write-leveling logic can adjust (i.e., advance or retard) one of the two write leveling delays by one clock cycle, so that both data strobe signals are aligned with the same edge of the clock signal.
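The one-clock-cycle adjustment can be sketched as follows. This is an illustrative sketch only: the function name align_to_common_edge and the example values are hypothetical, and the edge index of each group (0 for the first rising edge, 1 for the second) is assumed to be already known, since how it is determined is not described here.

```python
def align_to_common_edge(delays_ps, edges, tck_ps, max_delay_ps):
    """Shift per-group delays by whole clock cycles so all groups share an edge.

    delays_ps: write leveling delay of each source synchronous group.
    edges:     edge each delay currently aligns to (0 = first, 1 = second).
    Returns (target_edge, adjusted_delays); raises ValueError if neither
    edge is reachable for every group within the available delay range.
    """
    for target in (0, 1):
        adjusted = [d + (target - e) * tck_ps for d, e in zip(delays_ps, edges)]
        if all(0 <= d <= max_delay_ps for d in adjusted):
            return target, adjusted
    raise ValueError("no common clock edge reachable within the delay range")


# Assumed example: 1250 ps clock period, delay range of two clock cycles.
print(align_to_common_edge([600, 1900, 700], [0, 1, 0], 1250, 2500))
print(align_to_common_edge([600, 200], [0, 1], 1250, 2500))
```

In the first call the second group is advanced by one clock cycle to join the others on the first edge. In the second call that would require a negative delay, so the first group is retarded by one clock cycle instead and both groups settle on the second edge.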
A prior method for determining to which of the two rising clock edges each data strobe signal is aligned following write leveling involved a system designer using a priori knowledge of the relative routing of the data strobe signals and clock signals to the various source synchronous groups and guessing to which of the two successive clock edges the data strobe signal is more likely aligned. Write-read-compare operations were then performed to verify that the correct clock signal edge had been identified, and adjustments were made in a trial-and-error manner until error-free results in the write-read-compare operations indicated that data strobe signals of all source synchronous groups had been aligned with the same clock edge. It would be desirable to provide a more efficient and deterministic method and system for identifying to which of two rising clock edges each data strobe signal is aligned following write leveling.