1. Field of the Invention
This invention is related to providing a high speed bus for a memory system, and more specifically provides a memory system for high availability servers where the speed of transactions on the bus is increased by reducing the effective capacitance of the bus and where high availability features are enabled by the improved isolation between memory modules.
2. Description of the Related Art
As computers and their central processing units ("CPUs") become capable of executing instructions more rapidly, there is a concurrent need for increased processing speed of memory instructions. In performing a typical data read operation of a memory device, a memory controller (usually the CPU or, in larger systems, a dedicated memory controller) sends a read command to a particular memory chip. This command is propagated to the chip along one or more lines of a command bus. When received by the particular chip, the command causes the chip to locate and direct an output from its internal memory array onto a data bus, as a return data signal intended for the memory controller. The output then propagates along the data bus, which may or may not travel the same route as the command bus. In the example just given, there are three sources of time delay, including the propagation time of a read command from the controller to the chip, the time required for the chip to power its internal registers and to channel the proper output onto the data bus, and the time required for propagation of the output back to the controller.
Similarly, in performing a typical data write operation to a memory device, the memory controller sends a write command to a particular memory chip along with the data to be written. This command is propagated to the chip along one or more lines of a command bus, while the data is propagated to the chip along one or more lines of a data bus. When received by the particular chip, the command causes the chip to channel the data from the data bus to the specified location of its internal memory array. The data propagating along the data bus may or may not travel the same route as the command propagating along the command bus. In the example just given, there are three sources of time delay, including the propagation time of a write command from the controller to the chip, the time required for propagation of the data from the controller, and the time required for the chip to power its internal registers and to channel the data from the data bus.
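The delay sources listed for the read and write operations can be put into a small delay budget. The following is a minimal sketch, not part of the described system: the class name, field names, and nanosecond values are hypothetical, and the write model assumes that the command and the data leave the controller at the same time, which the description does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusTiming:
    """Hypothetical delay budget for one memory access, in nanoseconds."""
    command_flight_ns: float  # command bus: controller -> chip
    chip_access_ns: float     # chip-internal time to reach its memory array
    data_flight_ns: float     # data bus flight time (either direction)


def read_latency_ns(t: BusTiming) -> float:
    # The three read delays occur one after another, so they add.
    return t.command_flight_ns + t.chip_access_ns + t.data_flight_ns


def write_latency_ns(t: BusTiming) -> float:
    # Assumption: command and data are launched together, so only the
    # slower of the two flights delays the chip's internal write.
    return max(t.command_flight_ns, t.data_flight_ns) + t.chip_access_ns


# Illustrative values only; they are not taken from any real device.
timing = BusTiming(command_flight_ns=1.0, chip_access_ns=5.0, data_flight_ns=1.5)
print(read_latency_ns(timing), write_latency_ns(timing))
```

In this model, improving the chip-internal time alone leaves both flight terms untouched, which is the limitation the following paragraph describes.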
Typically, design efforts have focused primarily on improving internal routing and processing of instructions within memory chips. These design efforts, however, while continually providing more responsive memory devices, do not address the primary cause of propagation delay along the data bus, the inherent capacitance of the data bus. As a result, many systems are sending data over the data bus at rates far lower than the operating speeds of the CPUs.
The problem of inherent capacitance of the data bus is further explained with reference to FIGS. 1A and 1B. FIGS. 1A and 1B illustrate a data path within a memory system 100. The memory system 100 shown is configured for either an SDR (Single Data Rate) or a DDR (Double Data Rate) SDRAM memory system. The data path includes a memory controller 102, a motherboard 103, memory chips 104, memory modules 105, and a data bus 106. The data bus 106 includes board trace portions 107, module trace portions 108, connectors 109, and termination 110.
The memory controller is affixed to the motherboard and is electrically connected to the memory chips via the data bus such that the memory modules are connected in parallel. The memory chips are affixed to the memory modules. The board trace portion of the data bus is affixed to the motherboard and the module trace portion of the data bus is affixed to the memory modules. The connectors 109 electrically connect the board trace portions to the module trace portions and mechanically affix the memory modules to the motherboard.
FIG. 1B depicts the electrical equivalent 111 of the data path shown in FIG. 1A. For ease of reference, each electrical equivalent in FIG. 1B that represents a component shown in FIG. 1A is labeled with the reference numeral of the represented component with the suffix "A". It should be noted that the board trace portion 107A is made up of inductive and capacitive elements which together behave as a transmission line 112 having a set of impedance and transmission delay characteristics. Similarly, each of the module trace portions 108A is made up of inductive and capacitive elements which together behave as a transmission line 113 having its own set of impedance and transmission delay characteristics.
When properly terminated with a resistor 110A, the board trace portion 107A acts as a nearly perfect transmission line (not shown) without inherent capacitance and will not in and of itself limit the operating speed of the memory system. When combined with the module trace portions 108A, however, the module transmission lines 113 act as transmission line stubs coming off of the board trace portion 107A. These stubs together have a "comb filter" effect that includes significant signal reflections in the memory system, which decrease signal integrity. This "comb filter" effect imposes a load on the data bus and effectively breaks the board trace portion 107A into individual board trace portion transmission lines 113.
The load imposed by the "comb filter" effect limits the maximum transmission speed of data propagation in both the board trace portion 107A and the module trace portions 108A. The "comb filter" effect imposed by the stubs generally increases as the length of each of the module trace portions 108A increases. Similarly, the "comb filter" effect imposed by the stubs generally decreases as the length of each of the module trace portions 108A decreases. A second cause of the propagation delays for data signals sent from the memory controller 102A to the memory chips 104A is the inductive element 114 and capacitive element 115 associated with each memory chip. Together, the inductive and capacitive elements impose a capacitive load on the data bus, including both the module trace portions 108A and the board trace portion 107A. The load imposed by the "comb filter" effect and the capacitive load imposed by the memory chip elements together form the inherent distributed capacitance load on the memory bus.
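The dependence of the stub effect on stub length can be illustrated with the standard quarter-wave relation for an ideal open-ended stub, whose first notch falls at f = 1 / (4 * t_d), where t_d is the one-way delay of the stub. The sketch below is an idealization offered under stated assumptions: the function names and the 7 ns/m trace delay are hypothetical, and a real stub loaded by the capacitance of a memory chip resonates at a lower frequency than this open-ended estimate.

```python
def stub_delay_s(length_m: float, delay_s_per_m: float) -> float:
    """One-way propagation delay of a stub of the given length."""
    return length_m * delay_s_per_m


def first_notch_hz(length_m: float, delay_s_per_m: float) -> float:
    """First quarter-wave notch of an ideal open-ended stub: 1 / (4 * t_d)."""
    return 1.0 / (4.0 * stub_delay_s(length_m, delay_s_per_m))


# Assumed trace delay, for illustration only (not a measured value).
TRACE_DELAY_S_PER_M = 7e-9

for length_mm in (10, 25, 50):
    f = first_notch_hz(length_mm / 1000.0, TRACE_DELAY_S_PER_M)
    print(f"{length_mm} mm stub: first notch near {f / 1e9:.2f} GHz")
```

Longer stubs move the first notch down toward the signalling frequencies of the bus, which is consistent with the statement that the effect grows with the length of the module trace portions.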
Another common memory configuration for computer memory systems is the RAMBUS memory configuration. FIG. 2 shows a schematic diagram illustrating the electrical equivalent of the data path of a conventional RAMBUS memory system. The data path includes a memory controller 202, memory modules 205, and data bus 206. The data bus includes board trace portions 207, module trace portions 208, connectors 209, and termination resistors 210. Unlike the memory configuration shown in FIGS. 1A and 1B where the memory modules are connected in parallel, in the RAMBUS configuration shown in FIG. 2, the memory modules are connected in series. In addition, the connector inductive element 209 occurs twice as often as in the equivalent memory configuration shown in FIGS. 1A and 1B that has the same number of memory modules.
The board trace portion 207 is made of inductive and capacitive elements which together behave as a transmission line having a set of impedance and transmission delay characteristics. Similarly, each of the module trace portions 208 is made up of inductive and capacitive elements which together behave as a transmission line 213 having its own set of impedance and transmission delay characteristics. When combined with the module trace portions 208, however, the module transmission lines 213 act as transmission line stubs coming off of the board trace portion 207, decreasing signal speed and integrity.
Compared to the configuration shown in FIG. 2, the configuration shown in FIGS. 1A and 1B reduces the loading effects on the data bus due to the board trace portion. However, because the effective loading on the data bus due to the module trace portions 213 is increased in the configuration shown in FIG. 2, the bus impedance is not typically reduced. In fact, because the memory modules in the RAMBUS configuration are connected in series instead of in parallel, the effective loading on the data bus is typically increased substantially compared to the configuration shown in FIGS. 1A and 1B.
Typically the parallel configuration shown in FIGS. 1A and 1B is preferred to the RAMBUS configuration shown in FIG. 2, in part because of the lower comparative capacitive loading on the memory data bus. However, there are other problems with the RAMBUS configuration. One major problem is the lack of effective DIMM isolation. With increases in the number of DIMM modules connected to the data bus, the probability of DIMM failure increases. While the parallel configuration shown in FIGS. 1A and 1B provides some DIMM isolation, the serial nature of the RAMBUS configuration effectively provides no DIMM isolation. Thus, if a single DIMM module fails with a stuck output bit, for example, the entire RAMBUS memory system fails. Similarly, a connector failure in the serial RAMBUS configuration will result in failure of the memory system. Further, if a RAMBUS module is removed, it causes a bus disconnection. Because of these potential failures, the RAMBUS configuration is not a preferred choice for the high availability systems that are becoming an increasingly popular choice among business customers.
Other memory configurations attempt to solve the problem of inherent capacitance in the memory bus in several ways. One solution is to provide series resistors on the module trace portion of the data bus in order to electrically separate the module trace portion from the board trace portion of the bus. This technique has been successfully used for frequencies of up to 66 MHz, but has not been very successful at higher frequencies. Another solution is to provide FET switches on the motherboard that break the data bus into sections. For example, a switch multiplexor has been used to separate a set of four memory modules into two electrically independent groups of two modules. This approach creates two smaller memory buses, each presenting less inherent capacitance than the original larger bus. Each of these smaller buses, however, still presents an inherent capacitive load on the data bus, and the switch itself adds a capacitive load, so signal propagation speed remains limited.
Another solution to the problem of the inherent capacitance in the memory bus is shown and described with reference to FIGS. 3A and 3B. FIG. 3A is a side view of a switch controlled memory module configuration described in a related patent application having the title "Capacitance Reducing Memory System Device, and Method" having Ser. No. 08/960,940 and filing date of Oct. 30, 1997. FIG. 3B is a schematic diagram illustrating the electrical equivalent of the switch controlled memory module configuration shown in FIG. 3A. For clarity, the electrical equivalents of items shown in FIG. 3B are marked with the same reference numerals as the items in FIG. 3A with an added "A" suffix.
Referring to FIG. 3A, the memory devices 322 and switches 329 are preferably affixed to removable memory modules 324 that allow the memory system configuration to be easily changed by simply adding modules or by replacing some or all of the modules. Each of the memory modules 324 is mechanically affixed to a main board 325 by a connector 326. The connector provides all the electrical connections between the memory controller and the memory devices. The electrical connections include interconnects between the portion of the data bus on the main board 327 and the portion of the data bus on the module 328.
Referring to FIG. 3B, when a switch 329A is in an open position (terminals 335 and 336 electrically decoupled) the memory device 322A associated with the open switch is decoupled from the data bus as is the portion of the data bus between the switch and the memory device. This means that no data can be sent or received by the memory device, or memory devices, that have been electrically decoupled from the data bus. It also means that the portion of the data bus between the switch and the memory device is decoupled from the data bus and does not add to the stub length of module portion 328A. Further, the capacitive load of the memory devices 322A which have been decoupled from the data bus as a result of the switch being open will no longer contribute to the overall capacitive load on the data bus (as seen by the memory controller and any coupled memory devices) thus increasing the data transfer speed between the memory controller and the coupled memory device.
The board portion 327A includes a series of transmission lines 333. The module portions 328A each include a transmission line 334 that forms a transmission line stub coming off of board portion 327A. Each stub thus formed creates a "comb filter" effect on the data bus that places a load on the data bus, including board portion 327A and module portion 328A. The load created by this "comb filter" effect is usually proportional both to the number of module portions 328A attached to the board portion 327A and to the length of each of the module portions 328A. Compared to the memory configurations shown in FIGS. 1A, 1B and 2, the memory configuration shown in FIGS. 3A and 3B with a FET switch on each memory module helps decrease the capacitive loading due to the memory modules by eliminating the capacitive loading of the memory devices that are decoupled or not electrically coupled to the data bus. This helps to reduce the comb filter effect, thereby increasing the data transfer speed of the data bus.
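The effect of opening the per-module switches on the device loading can be sketched with a simple lumped model. This is an illustration under stated assumptions rather than a circuit simulation: the function name, the 20 pF device value, and the 2 pF switch value are hypothetical, and the model ignores the board traces and stubs, which remain on the bus in this configuration.

```python
from typing import Sequence


def data_bus_load_pf(device_caps_pf: Sequence[float],
                     switch_closed: Sequence[bool],
                     switch_cap_pf: float = 0.0) -> float:
    """Lumped device load on the bus for a given set of switch positions.

    Assumption: every switch contributes its own capacitance whether it is
    open or closed, and a device contributes only when its switch is closed.
    """
    if len(device_caps_pf) != len(switch_closed):
        raise ValueError("one switch position is required per device")
    load = 0.0
    for cap, closed in zip(device_caps_pf, switch_closed):
        load += switch_cap_pf
        if closed:
            load += cap
    return load


# Four modules with hypothetical 20 pF devices and 2 pF switches.
caps = [20.0, 20.0, 20.0, 20.0]
print(data_bus_load_pf(caps, [True, True, True, True], switch_cap_pf=2.0))
print(data_bus_load_pf(caps, [True, False, False, False], switch_cap_pf=2.0))
```

With only the addressed module coupled, the device term shrinks to a single device, while the switch term and the unmodelled trace loading stay in place.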
Although the configuration shown in FIGS. 3A and 3B does improve isolation compared to the embodiment shown in FIG. 2, it still does not provide adequate isolation to prevent system failure in the event of a connector failure. In addition, the embodiment shown in FIGS. 3A and 3B does not provide the ability to perform simultaneous writes to two or more memory modules. Both of these features are desirable in high availability computer systems.
However, a problem with the embodiment shown in FIGS. 3A and 3B is that, although it reduces the comb filter effect compared to the embodiment shown in FIGS. 1A and 1B, the trace to the active DRAM in the embodiment shown in FIGS. 3A and 3B still forms a stub on the data bus. A further problem with the embodiment shown in FIGS. 3A and 3B is that it does not significantly reduce the capacitive loading due to the board portions 327A of the memory system. Oftentimes it is this loading 327A that becomes the most significant factor for signal delay and not the capacitive loading of the memory devices, which is significantly reduced by the configuration of FIGS. 3A and 3B. The capacitive loading due to the board 327A becomes especially problematic as the number of boards and DIMMs in the computer system is increased. Because customer demand for increased memory in high end servers has increased, one sometimes sees the anomalous behavior of a large or high end server having lower memory system performance than a small or low end server. This decreased performance is often compensated for by adding large CPU caches. However, the addition of large CPU caches can significantly increase system cost.
A memory system is needed that significantly minimizes the capacitive loading due to board traces while still minimizing the effects of capacitive loading due to the memory devices, that provides high availability features, and that provides improved isolation.
The present invention provides a memory configuration that minimizes the capacitive loading due to board traces while still minimizing the effects of capacitive loading due to the memory devices, provides high availability features, and provides improved isolation. The memory configuration includes a memory controller and a single central 1:N switch that is used to connect the memory controller to the N memory modules in a manner that reduces the capacitive loading on the data bus. The memory configuration includes a memory controller, a central switch, a data bus that is electrically coupled to the memory controller and the central switch, and a plurality of N memory modules, where each of the plurality of N memory modules is electrically connected radially to the central switch means by a separate memory module bus.
In the preferred memory configuration, the central switch acts as a switch to all of the N memory modules. The central switch is physically located on the motherboard, preferably central to and between the memory modules. The memory configuration effectively results in a point to point bus between the memory controller and the memory device on the memory module. The memory configuration essentially eliminates the impedances due to the board traces between the memory modules that are not electrically connected to the data bus. The elimination of these intermodule impedances from the configuration means that the board trace to the active DRAM of the memory module does not form a stub on the bus, thereby eliminating the comb filter effect.
Because the capacitive loading effects of the board traces between memory modules are effectively eliminated, multiple memory modules can be added to the system without decreasing the speed of the data bus. This is especially critical with today's increasing memory needs. Thus, according to the configuration of the present invention, the system memory size can be substantially increased without decreasing the data bus speed.
The point to point bus provided by the memory configuration makes the memory system design easier since transmission line reflections are significantly reduced. Further, the "stubless" point to point arms of this memory configuration allow much higher data transfer rates because capacitive loading is significantly reduced. Because the comb filter effect is eliminated, the system can be run close to the maximum frequency of the memory module. Thus, the point to point bus configuration allows clocking at "close to" the core DDR frequency. Alternatively, if an SDR frequency is used, the margins could be increased significantly at a fixed speed.
The central 1:N switch preferably includes a data selection circuit and a decoding circuit. The data selection circuit in combination with the decoding circuit helps choose which memory modules will be accessed. The data selection circuitry preferably includes a transceiver circuit which determines the direction of data flow and selects which memory channel is active.
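The cooperation of the decoding circuit and the data selection circuit can be sketched behaviorally in software. The following is a minimal model under stated assumptions, not the circuit itself: the class and method names are hypothetical, and the address map (equal-sized, contiguous address ranges per module) is an assumption, since the summary does not specify how addresses map to channels.

```python
from enum import Enum
from typing import List, Tuple


class Direction(Enum):
    READ = "module_to_controller"
    WRITE = "controller_to_module"


class CentralSwitch:
    """Behavioral model of a 1:N switch with decode and channel selection."""

    def __init__(self, n_modules: int, module_size: int) -> None:
        if n_modules < 1 or module_size < 1:
            raise ValueError("n_modules and module_size must be positive")
        self.n_modules = n_modules
        self.module_size = module_size  # addresses per module (assumed equal)

    def decode(self, address: int) -> Tuple[int, int]:
        """Decoding step: map a system address to (channel, offset)."""
        if not 0 <= address < self.n_modules * self.module_size:
            raise ValueError("address outside the populated range")
        return divmod(address, self.module_size)

    def select(self, address: int,
               direction: Direction) -> Tuple[List[bool], Direction, int]:
        """Selection step: enable exactly one channel and fix the direction."""
        channel, offset = self.decode(address)
        enables = [i == channel for i in range(self.n_modules)]
        return enables, direction, offset


switch = CentralSwitch(n_modules=4, module_size=1024)
print(switch.select(2050, Direction.WRITE))
```

Only one enable is asserted for a normal access, which corresponds to the point to point connection between the controller and the single active module.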
The memory configuration provides improved isolation. Typically, a problem on one of the modules or with a module connector will not affect the main data bus. For example, in the case where two connector pins short together, although the memory module with the failed connector will not be available to the memory system, the remaining memory modules will remain unaffected by the connector short. Further, the memory configuration of the present invention helps to isolate the memory modules from each other, so that with the appropriate support, the memory boards can be easily hot swapped since a module inserted on one arm of the central switch will not disturb the other arms, thus adding to the high availability features of the memory system. Hot swapping is feasible in part because a higher level of redundancy is possible since each module is located on an independent arm or channel of the data bus.
In addition, the memory configuration allows flexibility in implementing redundant memory schemes. The isolation provided by the memory module configuration allows for simultaneous memory operations (read, write) that are not possible in serial or parallel buses without a substantial performance impact. The memory configuration allows the system to write identical data simultaneously to more than one memory module at a time. This is useful when mirroring memory, when rebuilding data on a "spare" module in a redundant system, or alternatively when performing memory initialization (such as ECC memory initialization) where identical data is written to multiple memory channels simultaneously. Also, for higher performance applications, if registers are used in the central switch, one can read from more than one memory module at a time to the central switch, and also operate at a higher clock frequency since the controller to memory chip delay is cut in two. The CMOS SSTL-2 central switch also restores signal levels for data bus signals that pass through it.
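The mirrored write and the per-channel isolation described above can be sketched with a small behavioral model. This is an illustration under stated assumptions: the class name, the `failed` set, and the dictionary-backed modules are hypothetical, and the loop writes the channels one after another in software, whereas the described hardware would drive the enabled channels at the same time.

```python
from typing import Dict, Iterable, List, Set


class RadialMemory:
    """Behavioral model of N memory modules on independent switch arms."""

    def __init__(self, n_modules: int) -> None:
        self.modules: List[Dict[int, int]] = [dict() for _ in range(n_modules)]
        self.failed: Set[int] = set()  # channels taken out of service

    def write(self, channels: Iterable[int], offset: int, value: int) -> List[int]:
        """Write the same value to every healthy channel that is enabled.

        Returns the channels actually written; a failed channel is skipped
        and does not prevent the write to the others.
        """
        written = []
        for channel in channels:
            if channel in self.failed:
                continue
            self.modules[channel][offset] = value
            written.append(channel)
        return written

    def read(self, channel: int, offset: int) -> int:
        if channel in self.failed:
            raise RuntimeError(f"channel {channel} is out of service")
        return self.modules[channel][offset]


memory = RadialMemory(n_modules=4)
memory.failed.add(1)                      # e.g. a shorted connector on arm 1
print(memory.write([0, 1, 2], 16, 0xAB))  # mirrored write skips the bad arm
print(memory.read(2, 16))
```

The same `write` call with all channels enabled corresponds to the initialization case, where identical data is sent to every memory channel.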
A further understanding of the nature and advantages of the present invention may be realized with reference to the remaining portions of the specification and the attached drawings.