The inexorable increase of CPU speed during the past two decades has pushed memory systems into faster and wider implementations. It is clear that increased CPU clocking speed alone cannot provide for quicker software execution times. Memory systems should be designed to deliver data to the CPUs at their native rates or risk forfeiting the benefit of the CPU's increased performance capabilities.
There are two fundamental system approaches to increasing memory system performance. First, increasing the internal access speed of memories has been an ongoing activity for the past decade, particularly in regard to dynamic random access memories (DRAMs), the memory technology most often used to implement the main operating storage in computers and other consumer electronics devices. Second, increasing the width of the data bus to the memory system also provides extra bandwidth.
In today's environment, both approaches, increasing DRAM speed and widening the data bus, are being implemented. The Double-Data-Rate 2 (DDR-2) specification, for example, is a follow-on specification to DDR-1 and provides for data delivery rates up to 800 megabits per second (Mb/s) per pin. DDR-2 is projected to double the performance of memory systems and imposes modest modifications to both the DRAMs and the DRAM controllers. Alternatively, some system designers have resorted to dual memory controllers to double the bandwidth, as in the prior-art system of FIG. 1. With multiple controllers, each can operate independently, thereby providing additional bandwidth. However, the use of multiple controllers, while allowing for higher bandwidth, also requires many more I/O signals and more printed circuit board (PCB) space.
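As a rough illustration of how per-pin data rate, bus width and controller count combine, the peak-bandwidth arithmetic can be sketched as follows. The 64-bit data bus width is an assumption chosen for illustration (it is not stated above), and the function name is hypothetical. The result is a theoretical peak, not a sustained rate.

```python
def peak_bandwidth_gbytes(rate_mbps_per_pin, bus_width_bits, controllers=1):
    """Theoretical peak bandwidth in gigabytes per second (10^9 bytes/s).

    rate_mbps_per_pin: data rate per data pin, in megabits per second
    bus_width_bits:    number of data pins per controller (assumed value)
    controllers:       number of independent memory controllers
    """
    total_mbits_per_s = rate_mbps_per_pin * bus_width_bits * controllers
    return total_mbits_per_s / 8 / 1000  # bits -> bytes, mega -> giga


# 800 Mb/s per pin is the DDR-2 figure cited above; 64 bits is assumed.
single = peak_bandwidth_gbytes(800, 64)
dual = peak_bandwidth_gbytes(800, 64, controllers=2)
print(f"one controller:  {single:.1f} GB/s")
print(f"two controllers: {dual:.1f} GB/s")
```

Under these assumptions, adding a second controller doubles the peak figure, but it also doubles the number of data pins that must be routed, which is the I/O and board-space cost noted above.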
FIG. 2 illustrates a prior-art multi-drop memory system of a type that is prevalent in modern processing systems. In essence, memory elements are tied to a common bus that terminates at a memory controller. The signal routing allows for stubs, that is, signal paths that are tapped off from the main signal path. These stubs make it very convenient to design and implement removable memory modules such as single inline memory modules (SIMMs) or dual inline memory modules (DIMMs). However, these stubs also create signal transmission problems, especially at higher frequencies. In fact, as signaling frequencies have progressed into the hundreds of megahertz, the signal degradation caused by these stubs has become very pronounced.
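One first-order way to see why stubs matter more as frequency rises is the quarter-wave model: an unterminated stub reflects energy back onto the main path, and the reflection cancels the signal most strongly at the frequency where the stub is a quarter wavelength long. The sketch below is only an idealized estimate. The stub lengths and the effective dielectric constant of 4.0 are assumptions chosen for illustration, the function name is hypothetical, and the capacitive loading of real memory devices (which lowers the resonance further) is ignored.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0
METERS_PER_INCH = 0.0254


def stub_quarter_wave_hz(stub_length_in, eps_eff=4.0):
    """First null frequency of an ideal open stub: f = v / (4 * length).

    stub_length_in: stub length in inches (assumed value)
    eps_eff:        effective dielectric constant of the trace (assumed value)
    """
    velocity = SPEED_OF_LIGHT_M_PER_S / math.sqrt(eps_eff)
    return velocity / (4.0 * stub_length_in * METERS_PER_INCH)


for length_in in (2.0, 1.0, 0.5):
    f_ghz = stub_quarter_wave_hz(length_in) / 1e9
    print(f"{length_in:.1f} in stub: first null near {f_ghz:.2f} GHz")
```

In this simplified model, halving the stub length doubles the frequency of the first null, which is consistent with the practice of shortening stubs to reach higher signaling rates.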
In modern multi-drop memory systems, the length of the stubs is reduced to a minimum and enhanced I/O (input/output) electronics are provided in both the controller and the memory elements to achieve higher signal frequencies. FIG. 3, for example, illustrates a prior-art memory system that includes sophisticated timing and control circuitry in both the memory controller and DIMM-mounted memory devices. As shown, a CPU 20 connects to a memory controller 22 via a front side bus 21 (i.e., having address and data paths as shown). The memory controller 22 contains a data channel, an address decoder and multiplexer, as well as generators for clocks and memory timing. The resulting memory interface signals 23, 24 and 25 connect to DRAM chips on DIMM modules 26 via electrical paths typically routed through printed circuit board traces, DIMM sockets and DIMM PCB substrates. In this implementation, signals 23, 24 and 25 generated or received by the memory controller 22 are directly connected to individual memory chips. In order to boost signaling rates and thereby achieve higher memory bandwidth, relatively complex timing circuits (e.g., delay locked loops) are typically provided in both the memory controller and each of the memory chips to recover timing information from source-synchronous strobe signals (e.g., BYTE Strobe). Even with such timing circuitry and the cost penalty it imposes, multiple instances of the memory controller and DIMMs are often required to satisfy the bandwidth requirements of modern data processing applications.
FIG. 4 illustrates a view of a prior-art memory system showing a memory controller, a channel and a memory element. For most systems utilizing a memory system, such as a desktop computer, the distance between the controller and the memory is kept to a minimum, thereby minimizing signal distortion on the signal channel. For a typical desktop computer, this distance ranges between 6 and 8 inches. For earlier computer systems (circa the early 1990s), where the frequency of signals was less than 100 MHz, primitive signal path structures (vias, through-hole connectors and single-ended transmission) did not seriously degrade the communication between the controller and memory. The memory controller and memory elements could utilize simple, straightforward I/O drivers. As semiconductor improvements became available, it was possible to increase both the density and speed of both the controller and memory elements, shifting the performance bottleneck to the interconnecting channel. That is, as illustrated in FIG. 5, the physical channel that allowed signal transmission with simple I/O in the hundred megahertz range exhibits relatively poor high-frequency response (e.g., due to the more pronounced effects of capacitance, inductance, loss, impedance mismatch, etc.) and therefore became inadequate as on-chip frequencies entered the gigahertz range. Consequently, taking advantage of the added transistors made available by shrinking process technologies, engineers designed more sophisticated I/O drivers and receivers. These I/O cells, in the form of SERDES (SERializers/DESerializers), clock data recovery (CDR) circuits, pre-emphasizers, encoders, deskewers, and so forth, have made it possible to push the speed of signaling up into the gigahertz range while still utilizing conventional channel structures. Unfortunately, such sophisticated I/O cells add significant design and manufacturing expense and therefore drive up system cost.
Such I/O cells also tend to consume substantial additional power, reducing thermal overhead within the memory devices and controller and driving up operational cost.
Another problem facing designers of modern memory systems is that the reduced supply voltages necessitated by shrinking process technologies are increasingly insufficient to drive signals across the lossy channel. That is, due to the losses incurred in the channel at higher frequencies, I/O drivers have been forced to stay at higher voltages (and therefore slower speeds) in order to maintain signal margins. Designers of memory system components are thus increasingly faced with a mismatch between the voltage needs of the internal core logic of a semiconductor device and the more demanding voltage requirements of its I/O circuitry.