Not applicable.
Not applicable.
1. Field of the Invention
The present invention generally relates to memory systems that include high speed memory devices. More particularly, the present invention relates to memory systems, such as Direct Rambus Dynamic Random Access Memory (RDRAM), that require calibration cycles to ensure proper operation. Still more particularly, the present invention relates to a memory system that includes error checking and correction logic, and which adjusts the frequency of calibration cycles based on the number of memory errors detected by the error checking and correction logic.
2. Background of the Invention
Almost all computer systems include a processor and a system memory. The system memory functions as the working memory of the computer system, where data is stored that has been or will be used by the processor and other system components. The system memory typically includes banks of dynamic random access memory (DRAM) circuits. According to normal convention, a memory controller interfaces the processor to a memory bus that connects electrically to the DRAM circuits. While DRAM circuits have become increasingly faster, the speed of memory systems typically lags behind the speed of the processor. Because of the large quantity of data that is stored in the system memory, it may at times be a bottleneck that slows down the performance of the computer system. Because of this disparity in speed, in most computer systems the processor must wait for data to be stored (“written”) and retrieved (“read”) from DRAM memory. The more wait states that a processor encounters, the slower the performance of the computer system.
The main memory provides storage for a large number of instructions and/or a large amount of data for use by the processor, providing faster access to the instructions and/or data than would otherwise be achieved if the processor were forced to retrieve data from a disk or drive. However, the access times of conventional RAMs are significantly longer than the clock cycle period of modern processors. To minimize the latency of the system, various high-speed memory devices have been introduced to the market. An example of such a high-speed memory device is the Direct RDRAM device developed by Rambus. See “RAMBUS Preliminary Information Direct RDRAM(trademark)”, Document DL0060 Version 1.01; “Direct Rambus(trademark) RIMM(trademark) Module Specification Version 1.0”, Document SL-0006-100; “Rambus(copyright) RIMM(trademark) Module (with 128/144 Mb RDRAMs)” Document DL00084, Version 1.1, which are incorporated by reference herein. As indicated in the Rambus specifications, the Direct RDRAM memory is capable of transferring 1.6 GB per second per DRAM device.
Each Direct RDRAM device typically includes 32 banks, with 512 rows per bank, although other size RDRAM devices may be available. Depending on the size of the RDRAM device, each row (or page) typically has either 1 kilobyte or 2 kilobytes of memory storage capability. The Direct RDRAM devices are arranged in channels, with each channel currently capable of supporting up to 16 Direct RDRAM devices. One or more Direct RDRAM devices may be packaged in Rambus In-line Memory Modules (RIMMs). Multiple channels may be provided in a computer system to expand the memory capabilities of the system.
While Direct RDRAM and similar memory devices are theoretically capable of operating at very high speeds, they exhibit certain severe operating constraints that can significantly degrade performance. To achieve the high operational speeds, the memory devices have very precise timing requirements, with very little margin or tolerance for deviation. Parameters for read transactions will be discussed briefly to illustrate some of the timing issues.
As shown in FIG. 1, the Direct RDRAM couples to a memory controller (which includes a Rambus ASIC Cell or “RAC”) via two clock signal lines, three Row signal lines, five Column signal lines, and two data busses. The clock lines include a Clock-to-Master (CTM) line, and a Clock-from-Master (CFM) line that are used to synchronize signals to the memory controller and from the memory controller, respectively. The Row signal lines and Column signal lines form part of a control and address bus (RQ bus) that typically includes eight lines. The Row signal lines (ROW2 . . . ROW0) are used primarily to control row accesses in the memory, while the Column signal lines (COL4 . . . COL0) are used primarily to control column accesses. The data busses include a DQA data bus (DQA8 . . . DQA0) and a DQB data bus (DQB8 . . . DQB0), that couple to sense amps on opposite sides of the memory banks.
The three Row lines identify which of the 512 possible rows is addressed by presenting nine row bits (R8 . . . R0) in three subsequent half clock cycles (2^9 = 512), as shown in FIG. 2. The device row (DR) bits (DR3 . . . DR0) identify which of the 16 possible memory devices is targeted, while the five Bank row (BR) bits (BR4 . . . BR0) identify which of the 32 banks is targeted in that device. Similarly, and as shown in FIG. 3, the five Column lines identify which of the 64 possible columns is being addressed by presenting 7 column bits (C6 . . . C0) in two subsequent half cycles. The device column (DC) bits (DC4 . . . DC0) identify which of the memory devices is targeted, while the five Bank column (BC) bits (BC4 . . . BC0) identify which of the 32 banks is targeted.
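By way of illustration only, the relationship between the row-packet field widths described above and the number of devices, banks, and rows they can select may be modeled with the following Python sketch. The names ROW_FIELDS, field_capacity, pack, and unpack are hypothetical, and the packing order (device, then bank, then row) is an assumption made for the example; it does not represent the on-wire packet layout of FIG. 2.

```python
# Field widths (in bits) as described in the text: 4 device-row bits,
# 5 bank-row bits, 9 row bits. The ordering here is an assumption.
ROW_FIELDS = (("DR", 4), ("BR", 5), ("R", 9))


def field_capacity(bits):
    """Number of distinct values an n-bit field can select."""
    return 1 << bits


def pack(device, bank, row):
    """Pack (device, bank, row) into one integer, device most significant."""
    value = 0
    for (name, bits), field in zip(ROW_FIELDS, (device, bank, row)):
        if not 0 <= field < field_capacity(bits):
            raise ValueError(f"{name}={field} does not fit in {bits} bits")
        value = (value << bits) | field
    return value


def unpack(value):
    """Inverse of pack(): return (device, bank, row)."""
    fields = []
    for _name, bits in reversed(ROW_FIELDS):
        fields.append(value & ((1 << bits) - 1))
        value >>= bits
    return tuple(reversed(fields))


if __name__ == "__main__":
    for name, bits in ROW_FIELDS:
        print(name, bits, "bits ->", field_capacity(bits), "values")
    print(unpack(pack(3, 17, 300)))
```

Under these assumptions, the 4-bit, 5-bit, and 9-bit fields select among 16 devices, 32 banks, and 512 rows, respectively, matching the counts given above.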
Referring to FIG. 4, a read transaction is performed on a Direct RDRAM device by asserting an Activate command in a ROWA (row activate) packet on the Row signal lines. The Activate command identifies the device, bank and row address of the targeted memory location. A time period tRCD later, a Read command is issued in a Column operation (COLC) packet on the Column signal lines. The Read command identifies the device, bank, and column address of the targeted memory location. Thus, the Activate command and Read command in conjunction identify the specific memory location being accessed, with the Activate command identifying the row, and the Read command identifying the column.
A time period tCAC after the Read command, a read data dualoct (16 bytes) is returned by the targeted memory device. The time period tCAC includes one to five cycles of round-trip propagation delay on the channel. According to current Rambus specifications, the tCAC period may be programmed to a range of values that vary from 7 tCYCLE to 12 tCYCLE. The particular value selected for tCAC depends on the number of RDRAM devices on the channel and the RDRAM timing bin so that the round trip propagation delay is equalized for all memory devices. Thus, based on the programmed timing parameters, the memory controller expects that during read cycles, all memory devices will return read data within a specified number of clock cycles after the Read command is asserted. Failure to return data in accordance with these timing parameters will cause data corruption, and may result in failure of the memory system.
The above timing parameters for a read transaction are just one example of the critical nature of timing in a high speed memory device, where the delay of a few nanoseconds can result in poor performance. Unfortunately, high-speed memory devices such as Direct RDRAM have proven highly susceptible to temperature and other environmental conditions such as humidity. If such conditions change during operation, the round-trip propagation delay of the signals propagating between the memory controller and the memory devices will be affected. If the actual propagation delay varies from the programmed delay, data may be corrupted.
In an attempt to resolve operational problems with high speed memory devices such as RDRAM, the memory controller may be designed or programmed to perform certain calibration cycles on a periodic basis. Thus, for example, memory controllers used with Direct RDRAM memory devices perform current and temperature calibrations on a periodic basis. For current calibrations, a current calibration cycle is performed to every RDRAM device once every tCCTRL interval to maintain the IOL current output within its proper range. As shown in the example of FIG. 5, four Column extended operation (COLX) packets are asserted by the memory controller with a Calibrate (CAL) command. These Calibrate commands cause the RDRAM to drive four calibration packets Q(a0) a time period tCAC after the CAL command on the DQA4 . . . 3 and DQB4 . . . 3 wires. In addition, the TSQ bit of the INIT register is driven on the DQA5 wire during the same interval as the calibration packets. The TSQ bit indicates when a temperature trip point has been exceeded, as measured by temperature sensing circuitry. The last COLX packet from the memory controller includes a SAM command, concatenated with the last CAL command, that causes the RDRAM to sample the last calibration packet and adjust its IOL current value.
The Calibrate command must be sent on an individual basis to each RDRAM device so that calibration packets from other devices do not interfere with the calibration. Consequently, a current control transaction must be transmitted every tCCTRL/N period, where N represents the number of RDRAMs resident on the channel. After each current calibration transaction, the device field Da of the address a0 in the Calibrate command is incremented.
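The per-device scheduling just described may be sketched, for illustration only, as follows. The function name current_cal_schedule is hypothetical, and the tCCTRL value used in the example is an arbitrary placeholder rather than a figure taken from the Rambus specifications.

```python
from itertools import count, islice


def current_cal_schedule(t_cctrl_ns, num_devices):
    """Yield (start_time_ns, device_index) for successive current calibrations.

    One transaction is issued every t_cctrl_ns / num_devices, and the targeted
    device index is incremented (wrapping around) after each transaction, so
    every device is calibrated once per t_cctrl_ns.
    """
    if num_devices < 1:
        raise ValueError("at least one device is required")
    interval_ns = t_cctrl_ns / num_devices
    for k in count():
        yield k * interval_ns, k % num_devices


if __name__ == "__main__":
    # Placeholder values: tCCTRL = 100 ns, four devices on the channel.
    for start_ns, device in islice(current_cal_schedule(100.0, 4), 5):
        print(f"t = {start_ns:6.1f} ns  calibrate device {device}")
```

With four devices, the sketch issues a transaction every quarter of the tCCTRL interval and returns to device 0 on the fifth transaction.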
Temperature calibration similarly is conducted on a periodic basis. As shown in FIG. 6, the temperature calibration sequence is broadcast once every tTEMP interval to all the RDRAMs on the channel. The TCEN and TCAL are row opcode field commands in a ROW operation packet. These commands cause the slew rate of the output drivers to adjust for temperature drift. During the quiet interval, tTCQUIET, the devices being calibrated cannot be read, but can receive write transactions.
Thus, while Direct RDRAM is designed to calibrate memory devices based on current and temperature calibrations, these calibrations are performed on a rigid schedule to meet certain minimum timing requirements. In addition, these calibration cycles require long periods of idle time, during which no read cycle is permitted to the memory devices being calibrated. This idle time can add significant latency to any queued read cycles. Currently, the idle time for a Direct RDRAM temperature calibration cycle (the period defined for tTCQUIET) is a minimum of 350 ns for an 800 MHz memory device (which is 140 clock cycles).
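The arithmetic behind the quiet-interval figure may be checked with the short sketch below. The figures of 350 ns and 140 clock cycles imply a clock period of 2.5 ns, which corresponds to a 400 MHz clock with data transferred on both clock edges of an 800 MHz device. The function names and the tTEMP value of 100 ms are assumptions chosen only to illustrate the calculation.

```python
def cycles_to_ns(cycles, clock_mhz):
    """Convert a number of clock cycles to nanoseconds."""
    return cycles * 1000.0 / clock_mhz


def quiet_overhead(t_quiet_ns, t_temp_ns):
    """Fraction of time during which reads are blocked by temperature calibration."""
    return t_quiet_ns / t_temp_ns


if __name__ == "__main__":
    t_quiet_ns = cycles_to_ns(140, 400.0)  # 140 cycles at a 400 MHz clock
    print("tTCQUIET =", t_quiet_ns, "ns")
    # Assumed tTEMP of 100 ms (1e8 ns), for illustration only.
    print("read-blocked fraction =", quiet_overhead(t_quiet_ns, 1e8))
```

Although the blocked fraction is small on average, any read that arrives at the start of the quiet interval may be delayed by the full interval, which is the latency concern noted above.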
Because of the sensitivity of the high-speed memory devices, many memory controllers are implementing error checking and correction (ECC) logic to improve the reliability of the high-speed memory devices. The ECC logic may be used to monitor the number of soft and hard memory errors that occur during memory operations. In some instances, computer manufacturers may offer warranties which require the manufacturer to replace a memory module when the number of errors exceeds a predefined threshold. Because of the sensitivity of the high-speed memory modules, many of the memory errors may be due to clock skew errors that result from environmental conditions. Thus, memory modules may be replaced unnecessarily because of excessive errors that result from changing environmental conditions. In addition, the high occurrence of memory errors may be perceived by customers as indicating that the computer system is unreliable, due primarily to the sensitivity of the high-speed memory modules.
It would be desirable if a system could be developed that would provide greater flexibility in modifying timing parameters of memory components based on the occurrence of memory errors. It would also be advantageous if the memory controller were capable of making intelligent decisions regarding memory operating conditions based on the occurrence of memory errors. Despite the apparent advantages such a system would offer, to date no such system is available.
The present invention solves the deficiencies of the prior art by implementing an intelligent memory controller that monitors the number of memory errors detected by error checking and correction (ECC) logic. According to the preferred embodiment, the memory controller preferably includes the capability of adapting the operation of the memory system in response to the occurrence of an excessive number of soft memory errors by changing the frequency of the calibration cycles. If the number and/or frequency of soft memory errors is high, the memory controller increases the calibration frequency to minimize the number and frequency of memory errors. Conversely, if the number and frequency of memory errors is very low, the memory controller may decrease the frequency of the calibration cycles to minimize the idle time incurred while the calibration cycles are performed.
According to an exemplary embodiment of the present invention, the memory system includes a memory controller that receives signals from ECC logic that indicate the number of memory errors that occur during a defined period. The memory controller uses these input signals from the ECC logic to dynamically determine calibration frequencies, thus adapting quickly to any changes in the environment. If the number of memory errors is low, calibration may not be necessary, and therefore may be deferred to improve memory performance. Conversely, if the number of memory errors is high, a calibration may be scheduled. The calibration periods also may be varied depending on the number of errors over a particular time period.
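One possible way to vary the calibration period in response to the error count may be sketched as follows. This is a minimal illustration and not the implementation of the preferred embodiment: the class name CalibrationTuner, the halving and doubling policy, the thresholds, and the interval limits are all assumptions introduced for the example.

```python
class CalibrationTuner:
    """Adjust a calibration interval from the error count of each period.

    All numeric defaults are hypothetical. max_ms stands in for the longest
    interval the memory devices tolerate; min_ms bounds the idle time spent
    calibrating.
    """

    def __init__(self, interval_ms=100.0, min_ms=12.5, max_ms=100.0,
                 high_threshold=8, low_threshold=1):
        self.interval_ms = interval_ms
        self.min_ms = min_ms
        self.max_ms = max_ms
        self.high_threshold = high_threshold
        self.low_threshold = low_threshold

    def update(self, soft_errors):
        """Return the new interval given the errors seen in the last period."""
        if soft_errors >= self.high_threshold:
            # Many errors: calibrate more often (shorter interval).
            self.interval_ms = max(self.min_ms, self.interval_ms / 2)
        elif soft_errors <= self.low_threshold:
            # Few errors: calibrate less often to reduce idle time.
            self.interval_ms = min(self.max_ms, self.interval_ms * 2)
        return self.interval_ms


if __name__ == "__main__":
    tuner = CalibrationTuner()
    for errors in (10, 10, 3, 0, 0):
        print(errors, "errors ->", tuner.update(errors), "ms")
```

In this sketch an error count between the two thresholds leaves the interval unchanged, which avoids oscillating between settings when the error rate is moderate.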
In the event that multiple memory channels are implemented in the computer system, ECC logic may be provided with each channel to detect the number and frequency of memory errors. Each of the ECC logic devices provides an output signal to a control logic that indicates the number of errors detected by that ECC logic device. The control logic monitors these signals from the ECC devices, and modifies the calibration frequency for each memory channel, as necessary, to maintain the number of memory errors within an acceptable level.
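The per-channel monitoring performed by the control logic may be illustrated, under stated assumptions, by the sketch below. The class ChannelErrorMonitor, its thresholds, and the decision labels are hypothetical; the sketch shows only that each channel is evaluated independently from its own ECC error count at the end of each monitoring window.

```python
from collections import Counter


class ChannelErrorMonitor:
    """Accumulate ECC error reports per channel and decide per channel."""

    def __init__(self, num_channels, high_threshold=8, low_threshold=1):
        self.num_channels = num_channels
        self.high_threshold = high_threshold  # hypothetical value
        self.low_threshold = low_threshold    # hypothetical value
        self.counts = Counter()

    def report(self, channel, errors=1):
        """Record errors signaled by the ECC logic of one channel."""
        if not 0 <= channel < self.num_channels:
            raise ValueError(f"unknown channel {channel}")
        self.counts[channel] += errors

    def end_of_window(self):
        """Return {channel: decision} and reset the counters."""
        decisions = {}
        for channel in range(self.num_channels):
            errors = self.counts[channel]
            if errors >= self.high_threshold:
                decisions[channel] = "calibrate_more_often"
            elif errors <= self.low_threshold:
                decisions[channel] = "calibrate_less_often"
            else:
                decisions[channel] = "hold"
        self.counts.clear()
        return decisions


if __name__ == "__main__":
    monitor = ChannelErrorMonitor(num_channels=3)
    monitor.report(0, 10)
    monitor.report(1, 3)
    print(monitor.end_of_window())
```

Because the counters are kept per channel, a burst of errors on one channel changes the calibration decision for that channel only.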