The present invention relates generally to high speed synchronous memory systems, and more particularly to controlling the read latency of memory devices so that read data from any memory device arrives at the memory controller at the same time.
An exemplary computer system is illustrated in FIG. 1. The computer system includes a processor 500, a memory subsystem 100p, and an expansion bus controller 510. The memory subsystem 100p and the expansion bus controller 510 are coupled to the processor 500 via a local bus 520. The expansion bus controller 510 is also coupled to at least one expansion bus 530, to which various peripheral devices 540-542 such as mass storage devices, keyboard, mouse, graphic adapters, and multimedia adapters may be attached.
The memory subsystem 100p includes a memory controller 400p and a plurality of memory modules 301p-302p which each include a plurality of memory devices, for example, DRAM-1 101p and DRAM-2 102p for memory module 301p and DRAM-3 103p and DRAM-4 104p for memory module 302p. Each memory device 101p-104p is a high speed synchronous memory device. Although only two memory modules 301p, 302p and associated signal lines 401ap, 401bp, 402ap, 402bp, 403p, 406p, 407p are shown in FIG. 1, it should be noted that any number of memory modules can be used. Similarly, although each memory module is illustrated as having only two memory devices 101p-102p, 103p-104p, the memory modules 301p-302p may have more or fewer memory devices 101p-104p, though a typical configuration may have eight or nine memory devices on each memory module. Signal lines 401ap, 401bp, 402ap, 402bp, and 403p are known as the data bus 150p, while signal lines 406p and 407p are known as the command/address bus 151p.
The data bus 150p includes a plurality of data signal lines 401ap, 401bp which are used to exchange data DATA between the memory controller 400p and the memory devices 101p-104p. Read data is output from the memory modules 301p, 302p and serially synchronized to a free running read clock signal RCLK on the read clock signal lines 402ap, 402bp. The read clock signal RCLK is generated by the memory controller 400p and first driven to the memory module 302p farthest from the memory controller 400p before being driven through the remaining memory module(s) 301p to return to the memory controller 400p. Write data is output from the memory controller 400p and serially synchronized to a free running write clock signal WCLK on the write clock signal line 403p. The write clock signal WCLK is generated by the memory controller 400p and driven first to the closest memory module 301p before being driven through the remaining memory module(s) 302p. A plurality of command signal lines 406p is used by the memory controller 400p to send commands CMD to the memory modules 301p, 302p. Similarly, a plurality of address signal lines 407p is used by the memory controller 400p to send addresses ADDR to the memory modules 301p, 302p. The data bus 150p or the command/address bus 151p may have additional signal lines which are well known in the art, for example chip select lines, which are not illustrated for simplicity. The commands CMD and addresses ADDR may also be buffered by a register (not shown) on the memory modules 301p, 302p before being distributed to the memory devices 101p-104p of a respective module. Each of the write clock signal line 403p, the plurality of data signal lines 401ap, 401bp, the plurality of command signal lines 406p, and the plurality of address signal lines 407p is terminated by a terminator 450, which may be a resistor.
When a memory device 101p-104p accepts a read command, data associated with that read command is not output on the data bus 150p until a certain amount of time has elapsed. This time is known as device read latency. Each memory device 101p-104p has an associated minimum device read latency but can also be operated at a plurality of greater read latencies. The amount of time which elapses between the time the memory controller 400p issues a read command and the time read data arrives at the memory controller 400p is known as system read latency. System read latency is equal to the sum of the device read latency of a memory device 101p-104p and the signal propagation time between the memory device 101p-104p and the memory controller 400p. Since memory module 301p is closer to the memory controller 400p than memory module 302p, the memory devices 101p, 102p located on memory module 301p have shorter signal propagation times than the memory devices 103p, 104p located on memory module 302p. At high clock frequencies (e.g., 300 MHz to at least 533 MHz), this difference in signal propagation time may become significant.
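By way of illustration only, the relationship between device read latency, signal propagation time, and system read latency may be sketched as follows. The latency and propagation figures, the helper name `system_read_latency`, and the choice of picoseconds as a unit are assumptions made for this example and do not appear in the figures; only the 533 MHz clock frequency is taken from the description above.

```python
# Illustrative sketch only. All latency and propagation values are
# hypothetical; they are not taken from the description.

def system_read_latency(device_read_latency_ps, propagation_ps):
    """System read latency = device read latency + signal propagation time."""
    return device_read_latency_ps + propagation_ps

# Hypothetical devices: (device read latency, propagation time), in picoseconds.
devices = {
    "DRAM-1": (15000, 500),   # on the module closer to the controller
    "DRAM-3": (15000, 1200),  # on the module farther from the controller
}

latencies = {
    name: system_read_latency(latency, propagation)
    for name, (latency, propagation) in devices.items()
}

# Difference in system read latency seen by the controller.
skew_ps = max(latencies.values()) - min(latencies.values())

# Clock period at 533 MHz, in whole picoseconds (integer division).
clock_period_ps = 10**6 // 533

print(latencies, skew_ps, clock_period_ps)
```

Under these assumed figures, two devices with identical device read latency still differ in system read latency by a sizeable fraction of one clock period, which is the effect the paragraph above describes.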
Due to differences in the minimum device read latency of each memory device 101p-104p as well as the differences in signal propagation time of the read clock RCLK along the read clock signal lines 402ap, 402bp (e.g., data output from DRAM-3 103p takes longer to reach the memory controller 400p than data output from DRAM-1 101p because DRAM-3 103p is located farther away from the memory controller 400p than DRAM-1 101p), the memory devices coupled to the same read clock signal line (e.g., DRAM-1 101p and DRAM-3 103p) may have differing system read latencies. Forcing the memory controller 400p to process read transactions with a different system read latency for each memory device 101p-104p would make the memory controller 400p needlessly complex. Accordingly, there is a need for an apparatus and method to equalize the system read latency of the memory devices in order to reduce the complexity of the memory controller.
The present invention is directed at a method and apparatus for equalizing the system read latency of memory devices in a high speed memory subsystem. The present invention is directed at the use of a plurality of flag signals which control the device read latency of each memory device. The flag signals are routed so that their signal propagation times are equivalent to those of the read clock signal. A memory device according to the present invention will begin to output data associated with a previously accepted read command a predetermined number of read clock cycles after it receives the flag signal. Thus, the timing of the flag signal determines the device read latency of the memory device. A memory controller according to the present invention will perform a calibration routine during initialization. The calibration routine is used to determine the minimum timing offset required between the read command and the flag signal which will permit each memory device coupled to the same read clock signal line to reliably output read data, i.e., to meet the minimum device read latency of each device. Alternatively, the minimum timing offset may be predetermined and stored in a memory (e.g., a serial presence detect or SPD EEPROM), thereby permitting the controller to set a timing offset without having to perform a calibration. The timing offset is used during normal operation to control when each memory device outputs read data. Since the flag signal has a signal propagation time equivalent to that of the read clock signal, owing to a similar path length and similar signal propagation characteristics, the signal propagation time of the flag signal automatically compensates for the difference in signal propagation times between the memory devices, thereby ensuring that the memory controller sees the same system read latency for each memory device coupled to the flag signal.
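The calibration and the resulting equalization may be illustrated by the following simplified sketch. It is a model under stated assumptions and is not the calibration routine of the invention: the clock period, the fixed flag-to-data delay, the per-device delays, the search bound, and the helper names (`output_time_ps`, `reads_reliably`, `calibrate`, `arrival_time_ps`) are all hypothetical. The model further assumes that the flag signal reaches the farthest module first, as the read clock does, so that the flag delay to a device plus the data return delay from that device is the same for every device.

```python
# Simplified model only. Every constant, delay, and helper name here is
# hypothetical and chosen for illustration.

CYCLE_PS = 2000    # assumed read clock period, in picoseconds
FIXED_CYCLES = 2   # assumed cycles between flag receipt and data output
MAX_OFFSET = 64    # assumed upper bound for the calibration search

# Per-device delays in picoseconds. flag_delay + return_delay is equal for
# both devices because the flag is assumed to follow the read clock path.
DEVICES = {
    "DRAM-1": {"cmd_delay": 500, "flag_delay": 1200,
               "return_delay": 500, "min_latency": 10000},
    "DRAM-3": {"cmd_delay": 1200, "flag_delay": 500,
               "return_delay": 1200, "min_latency": 12000},
}

def output_time_ps(dev, offset):
    """Time after the read command is issued at which dev starts to output data."""
    flag_arrival = offset * CYCLE_PS + dev["flag_delay"]
    return flag_arrival + FIXED_CYCLES * CYCLE_PS

def reads_reliably(dev, offset):
    """Stand-in for a real test read: checks the minimum device read latency."""
    device_latency = output_time_ps(dev, offset) - dev["cmd_delay"]
    return device_latency >= dev["min_latency"]

def calibrate(devices):
    """Return the smallest offset (in cycles) at which every device reads reliably."""
    for offset in range(MAX_OFFSET + 1):
        if all(reads_reliably(dev, offset) for dev in devices.values()):
            return offset
    raise RuntimeError("no workable timing offset found")

def arrival_time_ps(dev, offset):
    """Time at which read data from dev arrives back at the controller."""
    return output_time_ps(dev, offset) + dev["return_delay"]

offset = calibrate(DEVICES)
arrivals = {name: arrival_time_ps(dev, offset) for name, dev in DEVICES.items()}
print(offset, arrivals)
```

In this model the device with the most demanding minimum latency sets the offset, and the two devices then return data to the controller at the same time even though their individual propagation delays differ.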
In an alternate embodiment, the flag signals are local to each memory module and are generated by flag generation logic also located on the memory module. In this embodiment, the flag signals are associated with the memory module and serve to equalize the latency of the memory devices of each memory module.