1. Field of the Invention
Embodiments of the present invention relate generally to memory systems and memory modules. More particularly, embodiments of the invention relate to registered memory modules adapted to operate at multiple operating frequencies, and yet maintain appropriate setup and hold times for control/address signals regardless of operating frequency.
2. Description of the Related Art
Conventional host systems, such as servers, personal computers (PCs), notebook PCs, and Personal Digital Assistants (PDAs) often include a plurality of memory devices operationally grouped together in one or more memory modules. Memory modules are typically implemented on a small printed circuit board (PCB) (e.g., a daughterboard) adapted for mechanical and electrical connection with a larger PCB (e.g., a motherboard) via a corresponding plurality of slot connectors.
Figure (FIG.) 1 shows a general arrangement of memory modules (MM0 through MMn) on a motherboard with a corresponding chipset. The term “chipset” is used throughout this description to denote a collection of conventionally understood clock, control, and/or drive circuitry. This circuitry may be implemented in a single integrated circuit package (i.e., a “chip”), or in a family of related chips, and it may take many different forms. A chipset may include, for example, a memory controller, a bus re-drive circuit, a phase-lock or delay-lock loop circuit, power circuitry, and/or a clock generator or clock buffer circuit. At a minimum, however, the term chipset subsumes the circuitry providing an external clock signal (ECLK) to the memory modules and the various control/address (C/A) signals conventionally applicable to memory devices mounted on the memory modules.
In practical implementation, the external clock signal and potentially some or all of the control/address signals applied to the memory modules may be respective differential signals. The use of differential signaling within memory systems incorporating memory modules is well understood, and as such this particular design option will not be discussed in detail, nor will the drawings be cluttered with multiple differential signal illustrations. Rather, it is understood that the general descriptions of exemplary clock and control/address/data signals that follow may suggest the use of one or more differential signals, as the memory system designer deems appropriate.
In a similar vein, the generic use of control/address signals as well as data input/output (I/O) signals in conjunction with memory systems incorporating memory modules is also well understood. Those of ordinary skill in the art understand that the designation of address signals, address signal lines, control signals, control signal lines, data signals and/or data signal lines are separate matters of design choice. Address, control and/or data signals may be multiplexed on common signal lines and/or uniquely ascribed to one or more sets of dedicated signal lines. Data words, as defined by address signals and as provided by data signals may be defined by the system designer in any reasonable fashion.
Individual chip select (CS) signals are normally provided by the chipset to initiate or terminate one or more operations applied to a particular memory module. Each memory module typically comprises a plurality of memory devices (e.g., DRAM, SRAM, SDRAM, etc.). In early conventional implementations, DRAM-based memory modules used a stub-bus topology requiring that data signals from an associated memory controller be electrically connected to the data signal lines of every DRAM mounted on the memory module. This topology results in the use of very wide data buses.
However, as evolving server and PC designs have come to require greater and greater numbers of signal line connections and as the operating speed of memory systems incorporating memory modules has continued to increase, conventional connection architectures have proved unwieldy and the quality of data signals has become degraded due to problems with signal line impedance mismatching, noise, and reduced signal thresholds and swing voltages, etc. As a result, memory system designers have previously faced a choice between limiting memory density to reduce data errors at high operating speeds or accepting slower operating speeds to achieve high data density.
To avoid this difficult choice, memory system designers have developed alternative memory architectures to replace the stub-bus topology. A common approach used by many of these alternative memory architectures is to provide a register (also called a buffer) between each memory module and the chipset (incorporating, e.g., a memory controller). The register reduces electrical loads on the chipset to improve the integrity of associated data signals, and it also allows each memory module to send and receive data in a point-to-point fashion. In other words, the constituent register allows serial communication between the memory module and the chipset.
Serial communication may be realized, for example, through packet based transmissions between the chipset and the respective registers incorporated in memory modules within the memory system. Various conventionally understood routing or switching mechanisms and methods may be used to create, send, and/or receive the packets.
In addition to reducing distortion and noise on the data signals sent and received by the chipset, the registers also reduce the pin count (e.g., signal line connections) requirement for the memory system and allow higher transmission rates between the chipset and the respective memory modules.
FIG. 2 shows a memory architecture wherein each one of a plurality of memory modules MM0 through MMn includes a register adapted to communicate at least a plurality of control/address (C/A) signals with a chipset. An external clock signal (ECLK) is provided by the chipset to each memory module, and data signals are transferred between the chipset and the plurality of memory modules via one or more data bus(es). Generally speaking, each memory module uses the control/address (C/A) signals to access its constituent memory devices in order to store (i.e., write) or retrieve (i.e., read) data.
A number of existing memory architectures incorporate registers into memory modules. These include Fully Buffered Dual In-line Memory Modules (FBDIMM), Registered Dual In-line Memory Modules (RDIMM), and Rambus In-line Memory Modules (RIMM), to name but a few. Collectively, this class of memory architectures wherein at least one memory module comprises a register adapted to communicate with a chipset is generically referred to hereafter as “registered memories”. Similarly, a memory module comprising such a register will hereafter be generically referred to as a “registered memory module”.
FIG. 3 is a diagram showing a memory module 302 in a conventional registered memory system 300. Referring to FIG. 3, memory module 302 comprises a phase-lock loop circuit 303 receiving an external clock signal (ECLK) from a chipset 301, and generating first and second internal clock signals (ICLK1 and ICLK2) in synchronization with the external clock signal. Of note, in certain embodiments, ICLK1 may comprise a plurality of internal clocks, each configured for respective application to one of a plurality of memory devices 305. Memory module 302 further comprises a register 304 receiving external control/address signals (EC/A) from chipset 301 and the second internal clock signal, and producing internal control/address signals (IC/A) in response to the second internal clock signal. Memory module 302 still further comprises the plurality of memory devices 305 receiving the internal control/address signals and the first internal clock signal, and transferring data to/from chipset 301 via a data bus in accordance with the internal control/address signals and the first internal clock signal.
For memory module 302 to function properly, the external clock signal, first internal clock signal, and second internal clock signal should have the same phase when respectively sampled at phase-lock loop circuit 303, one or more of memory devices 305, and register 304 (e.g., at points P1, P2, and P3 shown in FIG. 3). Accordingly, a signal line L1 carrying the first internal clock signal and a signal line L2 carrying the second internal clock signal should have substantially the same delay characteristics (i.e., should provide the same signal flight time).
The internal control/address signals should arrive at each one of the memory devices 305 at substantially the same time. In other words, any one of memory devices 305 should not receive the internal control/address signal before another one of the memory devices 305. One way to ensure that this happens is to arrange a signal line L3 connecting register 304 to memory devices 305 in an H-tree type of signal line topology. An exemplary H-tree topology is shown in FIG. 4. An H-tree topology comprises a hierarchical arrangement of signal lines allowing a signal to be distributed between a number of different destinations at roughly the same time. The H-tree topology also attempts to minimize transmission related distortion of the signal by balancing respective signal line impedances.
The internal control/address signals are periodically sampled by memory devices 305 at intervals determined by the first internal clock signal. In order to reliably sample the internal control/address signals, the setup and hold times of the internal control/address signals relative to the first internal clock signal should be large enough to ensure that internal control/address signals are stable when the sampling occurs. Here, the setup time is defined as an interval between a previous transition of the internal control/address signals and a sampling event, and the hold time is defined as an interval between the sampling event and a next transition of the internal control/address signals.
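The setup and hold time definitions given above may be illustrated by the following minimal sketch. The function name `setup_and_hold`, the use of integer picoseconds, and the example time values are assumptions made only for illustration and do not appear in the referenced figures.

```python
def setup_and_hold(prev_transition_ps, sample_ps, next_transition_ps):
    """Return (setup, hold) in picoseconds for a single sampling event.

    setup: interval between the previous signal transition and the sampling event.
    hold:  interval between the sampling event and the next signal transition.
    """
    if not prev_transition_ps <= sample_ps <= next_transition_ps:
        raise ValueError("sampling event must lie between the two transitions")
    setup = sample_ps - prev_transition_ps
    hold = next_transition_ps - sample_ps
    return setup, hold


# Hypothetical values: the signal changes at 0 ps and again at 2500 ps,
# and is sampled midway between the two transitions.
ts, th = setup_and_hold(0, 1250, 2500)
print(f"setup = {ts} ps, hold = {th} ps")
```

Sampling midway between the two transitions yields equal setup and hold times; moving the sampling event toward either transition shortens one interval at the expense of the other.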
With the foregoing conventional examples in mind, the flow of control/address signals from chipset through register and on to individual memory devices will be considered in some additional detail.
Ideally, the setup and hold time periods for the internal control/address signals will be equal. That is, the setup time associated with the storing of the external control/address signals in the register and the hold time during which the internal control/address signals are presented on a signal line bus connecting the register with the memory devices will be balanced and thus respectively equal to half the period of the external clock signal. This balanced relationship between setup and hold times allows the internal control/address signals to stabilize during half of the external clock cycle before being sampled by one or more of the memory devices. Where either the setup time or the hold time becomes too short relative to the period of the external clock signal, it becomes increasingly likely that the internal control/address signals will communicate erroneous (or unstable) control and/or address information to memory devices 305.
The stable maintenance of setup and hold time periods is further complicated by collateral performance expectations associated with contemporary memory systems. One such expectation is the provision of different operating frequencies within a memory system. In general, it is desirable that a memory system be able to operate at different frequencies so it may be used with processors or host devices operating at different frequencies.
Consider, as an example, the effect of changing the operating frequency on the control/address signaling dynamics of a typical conventional memory system. FIGS. 5 and 6 are waveform timing diagrams illustrating the operation of a conventional registered memory 300 at respective operating frequencies of 400 and 200 MHz. By comparing the waveforms shown in FIGS. 5 and 6, one may appreciate the effect that changing the operating frequency of a memory system has on the signaling dynamics of the constituent registered memory modules.
FIG. 5 illustrates operation of the exemplary memory system at 400 MHz, resulting in an external clock signal having a period of 2.5 ns. As the exemplary memory system is assumed to be optimized for 400 MHz operation, the resulting setup and hold times are each 1.25 ns.
Within FIG. 5, signal C/A_Rin shows the timing associated with the input of the external control/address (EC/A) signals to register 304. The external control/address signals are stored as signal C/A_Rin in register 304 on the rising edge of the second internal clock signal (ICLK2). Signal IC/A_Min shows the timing associated with the output of signal C/A_Rin from register 304 as internal control/address signals (IC/A) and the input of internal control/address signal to memory devices 305. Memory devices 305 sample signal IC/A_Min on the rising edge of the first internal clock signal (ICLK1).
Thus, a delay period (tpdf) is required for signal C/A_Rin to be output as the internal control/address signals and applied to memory devices 305. This delay period comprises one delay associated with the internal operation of register 304 (e.g., d_REG, or the time required to output the internal control/address signals after receiving signal C/A_Rin), and another delay associated with the signal flight time delay of signal line L3 (e.g., d_L3, or the signal flight time of the internal control/address signals from register 304 to memory devices 305).
As the exemplary memory system operation shown in FIG. 5 has been optimized to run at 400 MHz, the delay period is designed to be 1.25 ns, or half the period of the external clock signal. As a result, setup time “ts” and hold time “th” are also both equal to half the period of the external clock signal. In the illustrated example, the delay period is set to 1.25 ns by setting the register delay (d_REG) of register 304 to 0.5 ns and setting the flight time line delay of signal line L3 (d_L3) to 0.75 ns. Alternatively, the register delay (d_REG) could be set to 0.75 ns and the flight time line delay of signal line L3 (d_L3) could be set to 0.5 ns. Either way, delay period “tpdf” may be accurately established at the desired time of one half the period of the external clock signal (e.g., 1.25 ns).
However, the operation of the exemplary system illustrated in FIG. 5 is next assumed to undergo a change in operating frequency from 400 MHz to 200 MHz, which lengthens the period of the external clock signal from 2.5 ns to 5 ns. The results of this change are illustrated in FIG. 6. As can be seen in FIG. 6, the established delay period (tpdf) remains set at 1.25 ns. As a result, the setup time (ts) becomes 3.75 ns while the hold time (th) remains 1.25 ns. Since the hold time is now very short relative to the period of the external clock signal, it is possible that one or more of the memory devices 305 will fail to properly receive the internal control/address signals output from register 304.
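The figures quoted for FIGS. 5 and 6 may be reproduced with the following minimal sketch. It assumes a simplified timing model, inferred from the stated values rather than taken from the drawings, in which the hold time equals the delay period (tpdf) and the setup time equals the clock period minus the delay period. The function names and the use of integer picoseconds are likewise illustrative assumptions.

```python
def period_ps(freq_mhz):
    """Clock period in picoseconds for a frequency given in MHz."""
    return 1_000_000 // freq_mhz


def setup_hold_ps(freq_mhz, tpdf_ps):
    """Return (setup, hold) in picoseconds under the assumed model:
    hold = tpdf, setup = period - tpdf."""
    period = period_ps(freq_mhz)
    if tpdf_ps > period:
        raise ValueError("delay period exceeds one clock period")
    return period - tpdf_ps, tpdf_ps


TPDF_PS = 1250  # fixed delay period of 1.25 ns, as in the example above

for freq_mhz in (400, 200):
    ts, th = setup_hold_ps(freq_mhz, TPDF_PS)
    print(f"{freq_mhz} MHz: period = {period_ps(freq_mhz)} ps, "
          f"ts = {ts} ps, th = {th} ps")
```

Under this model the setup and hold times are balanced at 400 MHz, whereas at 200 MHz the hold time occupies only one quarter of the clock period.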
One potential solution to this problem is to adjust the delay period such that the setup time and hold time become balanced at half the period of the external clock signal. Because the delay period is defined by the sum of the register delay and the flight time line delay of signal line L3, it may be changed by modifying either of these two values.
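As a minimal sketch of this adjustment, the flight time line delay needed to balance setup and hold times may be computed from the relationship tpdf = d_REG + d_L3 with tpdf set to half the clock period. The function name `balanced_flight_delay_ps` and the use of integer picoseconds are assumptions made for illustration; the 200 MHz result is derived arithmetic, not a value taken from the figures.

```python
def balanced_flight_delay_ps(freq_mhz, d_reg_ps):
    """Return (target_tpdf, d_l3) in picoseconds such that
    d_reg + d_l3 equals half the clock period."""
    period = 1_000_000 // freq_mhz      # clock period in ps
    target_tpdf = period // 2           # balanced delay period
    d_l3 = target_tpdf - d_reg_ps       # remaining delay assigned to line L3
    if d_l3 < 0:
        raise ValueError("register delay alone exceeds half the clock period")
    return target_tpdf, d_l3


# 400 MHz with a 0.5 ns register delay, as in the example above.
print(balanced_flight_delay_ps(400, 500))
# 200 MHz with the same register delay (derived, not from the figures).
print(balanced_flight_delay_ps(200, 500))
```

With the register delay held at 0.5 ns, moving from 400 MHz to 200 MHz would require the flight time line delay to grow from 0.75 ns to 2.0 ns to keep setup and hold times balanced.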
However, the register delay is an intrinsic property of register 304 that can only be changed by physically modifying or reconfiguring register 304. Hence, it is not possible to dynamically change register delay to enable registered memory system 300 to operate at a different frequency. A change in the register delay would necessarily involve the manufacturer of register 304. This proves in practical circumstances to be very difficult since a memory module manufacturer may purchase multiple different registers from different register manufacturers.
The flight time line delay, on the other hand, may be dynamically changed to adjust the delay period. Two conventional approaches to the adjustment of the delay period by changing the flight time line delay are disclosed in published U.S. Patent Application No. 2003/0221044 (the '044 application) and U.S. Pat. No. 6,754,112 (the '112 patent). In both the '044 application and the '112 patent, register 304 includes a delay circuit (e.g., a delay-lock loop) that may be reprogrammed to adjust the flight time line delay when the operating frequency of registered memory 300 is changed.
Unfortunately, the inclusion of a reprogrammable delay circuit within register 304 creates at least two problems. First, it increases the manufacturing cost and potentially the size of register 304. Second, it adds another step in the programming process required to change the operating frequency of the constituent registered memory system. This additional step is undesirable because operating frequency selection for a memory system should be as seamless and transparent to the host system as possible.