Processor-based systems, such as computer systems, use memory devices, such as dynamic random access memory (“DRAM”) devices, to store instructions and data that are accessed by a processor. These memory devices are typically used as system memory in a computer system. In a typical computer system, the processor communicates with the system memory through a processor bus and a memory controller. The processor issues a memory request, which includes a memory command, such as a read command, and an address designating the location from which data or instructions are to be read. The memory controller uses the command and address to generate appropriate command signals as well as row and column addresses, which are applied to the system memory. In response to the commands and addresses, data is transferred between the system memory and the processor. The memory controller is often part of a system controller, which also includes bus bridge circuitry for coupling the processor bus to an expansion bus, such as a PCI bus.
Although the operating speed of memory devices has continuously increased, this increase in operating speed has not kept pace with increases in the operating speed of processors. Even slower has been the increase in operating speed of memory controllers coupling processors to memory devices. The relatively slow speed of memory controllers and memory devices limits the data bandwidth between the processor and the memory devices.
In addition to the limited bandwidth between processors and memory devices, the performance of computer systems is also limited by latency problems that increase the time required to read data from system memory devices. More specifically, when a memory device read command is coupled to a system memory device, such as a synchronous DRAM (“SDRAM”) device, the read data are output from the SDRAM device only after a delay of several clock periods. Therefore, although SDRAM devices can synchronously output burst data at a high data rate, the delay in initially providing the data can significantly slow the operating speed of a computer system using such SDRAM devices.
One approach to alleviating the memory latency problem is to use multiple memory devices coupled to the processor through a memory hub. In a memory hub architecture, a system controller or memory hub controller is coupled to several memory modules, each of which includes a memory hub coupled to several memory devices. The memory hub efficiently routes memory requests and responses between the controller and the memory devices. Computer systems employing this architecture can have a higher bandwidth because a processor can access one memory module while another memory module is responding to a prior memory access. For example, the processor can output write data to one of the memory modules in the system while another memory module in the system is preparing to provide read data to the processor. The operating efficiency of computer systems using a memory hub architecture can make it more practical to vastly increase data bandwidth of a memory system. A memory hub architecture can also provide greatly increased memory capacity in computer systems.
Although there are advantages to utilizing a memory hub for accessing memory devices, the design of the hub memory system, and more generally, computer systems including such a memory hub architecture, becomes increasingly difficult. For example, in many hub based memory systems, the processor is coupled through a memory hub controller to each of several memory hubs via a high speed bus or link over which signals, such as command, address, or data signals, are transferred at a very high rate. The memory hubs are, in turn, coupled to several memory devices via buses that must also operate at a very high speed. However, as transfer rates increase, the time for which a signal represents valid information is decreasing. As commonly referenced by those ordinarily skilled in the art, the window or “eye” for when the signals are valid decreases at higher transfer rates. With specific reference to data signals, the “data eye” decreases. As understood by one skilled in the art, the data eye for each of the data signals defines the actual duration that each signal is valid after various factors affecting the signal are considered, such as timing skew, voltage and current drive capability, and the like. In the case of timing skew of signals, it often arises from a variety of timing errors such as loading on the lines of the bus and the physical lengths of such lines.
As data eyes of signals decrease at higher transfer rates, it is possible that one or more of a groups of signals provided by a memory device in parallel will have different arrival times at a memory hub to which the memory devices are coupled. As a result, not all of the signals will be simultaneously valid at the memory hub, thus preventing the memory hub from successfully capturing the signals. For example, where a plurality of signals are provided in parallel over a bus, the data eye of one or more of the particular signals do not overlap with the data eyes of the other signals. In this situation, the signals having non-overlapping data eyes are not valid at the same time as the rest of the signals, and consequently, cannot be successfully captured by the memory hub. Clearly, as those ordinarily skilled in the art will recognize, the previously described situation is unacceptable.
One approach to alleviating timing problems in memory devices is to use a delay-locked loop (DLL) or delay line (DL) to lock or align the receipt of read data from a memory device to a capture strobe signal used to latch the read data in a memory hub. More specifically, a read strobe signal is output by the memory devices along with read data signals. At higher transfer rates, the timing of the read strobe signal can vary so that it cannot be reliably used to capture the read data signals in the memory hub. Further, even if the read data strobe could reliably capture the read data signals in the memory hub, the time at which the read data signals were captured could vary in relation to a core clock domain used to control the operation of the memory hub that is coupled to the memory device. In such case, the read data may not be present in the memory hub at the proper time. To alleviate this problem, the timing of the read data strobe signals is adjusted using the DLL or DL to generate a capture clock signal that can reliably capture the read data signals. The DLL or DL is thus effective in preventing substantial drifting of a read data eye in relation to the core clock domain. As transfer rates increase, however, the timing specifications for the DLL or DL become more stringent and therefore increasingly difficult to meet. Furthermore, the amount of circuitry required to implement a suitable DLL or DL can materially reduce the amount of space that could otherwise be used for memory device circuitry, thereby either increasing the cost or reducing the storage capacity of such memory devices.
There is accordingly a need for a system and method that avoids the need to precisely control the timing relationships between a memory hub clock domain and the receipt of read data signals at the memory hub in a manner that avoids the need for extensive DLL or DL circuitry.