This invention generally relates to high speed memory subsystems, and more specifically, to high speed memory subsystems that operate with a wide range of memory loads.
Current memory subsystems operate address and command busses in a manner that is synchronous to a fixed system clock with all memory locations receiving the same clock (possibly with a fixed offset) independent of the memory installed at that location. In pluggable memory systems, such as those using memory modules such as 'DIMMs' (Dual Inline Memory Modules), the actual number of memory devices in a particular pluggable position might range from as few as two to as many as 18 or more devices (if using Unbuffered DIMMs with ×32, ×16 or ×8 devices). Since the load is varying, but the clock arrival is fixed, it becomes increasingly difficult to ensure an adequate window of time, at high speeds, for when the addresses must be valid if they are to be captured by the clock.
Various methods are currently used to address this problem.
One method (1) is to operate the address bus at 2× the clock period (ensure a new valid address every 2nd clock, rather than every clock cycle). This is the most widely adopted solution today, and decreases system performance while minimizing the benefit of emerging high speed memory devices.
Another solution (2) is the use of 'Registered' DIMMs: On these DIMMs, the addresses are re-driven a cycle later, from a driver on the memory module. This also reduces overall system performance (adds one clock of latency), but does permit a new address every clock (since the overall loading on the system bus is reduced). This solution is currently used in high-end systems, and is being planned for most low-end PCs, since their designers know of no other viable option.
Another method (3) for addressing the above-discussed problem is the use of registers on the system board: This is the same solution as described in (2), but with the space and cost for the registers incurred by the system rather than by the memory module supplier (and end-customer). As such it is rarely desirable.
Another method (4) for addressing the problem is the use of delayed clocks. Currently, systems may set the memory device clocks with a fixed delay to either slightly lead or lag the system clock to improve general timings. It would be theoretically possible to add logic complexity to the system to further gate these clocks based on the memory loading, but this would be very difficult to accomplish without the addition of significant skew or jitter error, and it would not be possible to independently adjust the clock arrival time in reference to specific signals on the DRAM (only one clock input per DRAM). Hence this solution is not viable, and has not been applied.
Still another method (5) for addressing the above-described problem is to use programmable drive strength drivers: This feature is becoming widely used as a method of matching the drive strength to the load. The predominant purpose for this feature is to ensure the fastest possible transition times and propagation delays in a given application, without violating overshoot, undershoot, or setup and hold transition times of the receiver. This method can also be used, to a limited extent, to help close timings on signals from the memory interface to the memory devices. Unfortunately, the driver sizing is limited in regard to the current available to the driver, the simultaneous switching effects in the driver package, the overshoot and undershoot specs on the receiver, etc. In addition, maximizing drive strength has only a limited benefit in reducing a signal delay. At some point, the driver must be switched earlier in order to gain further performance.
An object of this invention is to provide a high speed memory subsystem that operates with a wide range of memory loads.
Another object of the present invention is to provide a memory interface device (memory controller/chipset or signal re-drive device) that is programmable to have different clock-to-output delays, on signals from the memory controller end, based on the memory installed in the system at time of power-up.
These and other objectives are attained with a data processing system, and a method of operating a data processing system, comprising a clock generator for generating a system clock signal, and a memory unit having a plurality of memory modules for storing data. The data processing system further comprises a memory controller coupled to the clock generator for receiving the system clock signal therefrom, and coupled to the memory modules for outputting memory clock signals to said modules. The memory controller is programmable to have different clock-to-output delays, on signals from the memory controller end, based on the memory installed in the system.
Preferably, the memory controller includes means for generating a series of memory clock signals in response to receiving the system clock signal, and for outputting the memory clock signals to the memory modules. This preferred memory controller also includes programmable means for determining time delays between the time the memory controller receives the system clock signal and the time the memory controller outputs the memory clock signals. With this preferred embodiment, the memory controller is programmed by computing the load of each of the memory modules. More specifically, the memory subsystem has a plurality of slots for receiving the memory modules, and the load of the memory modules is computed by accessing each memory slot to determine if memory is plugged in, the type of memory installed, and to determine the number of modules the controller needs to drive. Also, for example, the memory modules may be pluggable dual inline memory modules.
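The slot-probing and delay-programming steps described above might be sketched as follows. This is only an illustrative sketch, not the claimed implementation: the `Module` record, the load model (each device on an unbuffered module counted as one load, a registered module counted as a single load), and every entry in `DELAY_TABLE` are assumptions introduced here and do not come from the specification. The table assumes that a heavier load calls for a shorter clock-to-output delay, so that the driver switches earlier.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Module:
    """Hypothetical description of one installed memory module."""
    kind: str          # "unbuffered" or "registered" (labels assumed here)
    device_count: int  # number of memory devices on the module


def address_bus_load(slots: Sequence[Optional[Module]]) -> int:
    """Walk every slot and total the load the controller must drive.

    An empty slot is represented by None. The load model is an assumption:
    a registered module presents one load (its on-module driver), while an
    unbuffered module presents one load per memory device.
    """
    total = 0
    for module in slots:
        if module is None:
            continue  # nothing plugged into this slot
        if module.kind == "registered":
            total += 1
        else:
            total += module.device_count
    return total


# Hypothetical lookup: (maximum load, clock-to-output delay in picoseconds).
# Values are placeholders; heavier loads get a shorter delay (earlier launch).
DELAY_TABLE = [(4, 1500), (9, 1200), (18, 900), (36, 600)]


def select_delay_ps(load: int) -> int:
    """Return the delay setting for the first table row that covers the load."""
    for max_load, delay_ps in DELAY_TABLE:
        if load <= max_load:
            return delay_ps
    raise ValueError(f"load {load} exceeds the largest supported load")


if __name__ == "__main__":
    # Example power-up scan: one unbuffered module, one empty slot,
    # one registered module.
    installed = [Module("unbuffered", 18), None, Module("registered", 18)]
    load = address_bus_load(installed)
    print(load, select_delay_ps(load))
```

In an actual system the per-slot information would be read from the modules at power-up and the selected value written to a controller register; both steps are omitted here because the specification does not describe their interfaces at this point.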