1. Field of the Invention
This invention relates generally to memory systems and memory interconnections in electronic systems. More particularly, the invention relates to high speed interconnection of daisy-chained memory chips.
2. Description of the Related Art
Modern computer systems typically are configured with a large amount of memory in order to provide data and instructions to one or more processors in the computer systems.
Historically, processor speeds have increased more rapidly than memory access times to large portions of memory, in particular, DRAM (Dynamic Random Access Memory). Memory hierarchies have been constructed to reduce the performance mismatches between processors and memory. For example, most modern processors are constructed having an L1 (Level 1) cache, constructed of SRAM (Static Random Access Memory) on a processor semiconductor chip. L1 cache is very fast, providing reads and writes in only one or several cycles of the processor. However, L1 caches, while very fast, are also quite small, perhaps 64 KB (kilobytes) to 256 KB. An L2 (Level 2) cache is often also implemented on the processor chip. L2 cache is typically also of SRAM design, although some processors utilize DRAM design. The L2 cache is typically several times larger in number of bytes than the L1 cache, but is slower to read or write. Some modern processor chips also contain an L3 (Level 3) cache. L3 cache is capable of holding several times more data than the L2 cache. L3 cache is sometimes constructed with DRAM design. L3 cache in some computer systems is implemented on a separate chip or chips from the processor, and is coupled to the processor with wiring on a printed wiring board (PWB) or a multi chip module (MCM). Main memory of the computer system is typically large, often many GB (gigabytes), and is typically implemented in DRAM.
Main memory is typically coupled to a processor with a memory controller. The memory controller receives load (read) commands and store (write) commands from the processor and services those commands, reading data from main memory or writing data to main memory. Typically, the memory controller has one or more queues (e.g., read queues and write queues). The read queues and write queues buffer information (e.g., commands, addresses, data) so that the processor can have multiple read and/or write requests in progress at a given time.
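By way of illustration only, the queue-based buffering described above can be sketched in a few lines of Python. The class name, method names, and the writes-before-reads servicing order are assumptions made for this sketch and are not taken from any particular memory controller; main memory is modeled as a simple dictionary.

```python
from collections import deque


class MemoryController:
    """Toy model: buffers load and store commands in separate queues."""

    def __init__(self):
        self.read_queue = deque()    # pending load addresses
        self.write_queue = deque()   # pending (address, data) stores
        self.main_memory = {}        # stand-in for DRAM: address -> data

    def load(self, address):
        """Queue a load (read) command from the processor."""
        self.read_queue.append(address)

    def store(self, address, data):
        """Queue a store (write) command from the processor."""
        self.write_queue.append((address, data))

    def service_one(self):
        """Service one queued command.

        Writes are drained first here; that ordering is an arbitrary
        simplification for the sketch, not a recommended policy.
        """
        if self.write_queue:
            address, data = self.write_queue.popleft()
            self.main_memory[address] = data
            return ("store", address, data)
        if self.read_queue:
            address = self.read_queue.popleft()
            return ("load", address, self.main_memory.get(address))
        return None  # nothing pending
```

Because commands are only queued by `load` and `store`, the processor in this model can issue several requests before any of them is serviced, which is the purpose of the queues.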
In various implementations, signaling between the memory controller and the memory chips comprises multidrop connections. That is, a pin on the memory controller connects directly to a plurality of memory chip pins (e.g., DRAM chip input or output or common I/O connection). It will be understood that typically one memory chip is placed on one module, so the connection to a particular memory chip includes a module pin plus the chip pin. Occasionally, several memory chips are placed on a single module, which creates multiple drops even on a single module.
Another approach uses point to point interconnections between the memory controller and a buffer chip, the buffer chip being associated with a number of memory chips and accessing (writing/reading) to/from those associated chips when the buffer chip receives an address on the point to point interconnect from the memory controller. If the address received does not address the memory chips associated with the buffer chip, the buffer chip re-drives the command/address, and perhaps data, to another buffer chip.
FIG. 1 illustrates such a prior art memory structure. Memory controller 12 is coupled to a first point to point interconnection 18A, comprising “M” bits, to a first buffer chip 20A. First point to point interconnection 18A carries address and command information. Memory controller 12 is coupled to a second point to point interconnection 19A, comprising “N” bits, to the first buffer chip 20A. Buffer chip 20A is mounted on a carrier 16A. Also shown mounted on carrier 16A are eight memory chips 14. Buffer chip 20A, as described above, receives address and command information on first point to point interconnect 18A. If buffer chip 20A determines that the address received addresses data in an address space of carrier 16A, buffer chip 20A drives address and control information on multidrop interconnection 21A. Data is typically sent on multiple, point to point interconnections between buffer chip 20A and memory chips 14 as shown on point to point connections 22 (four such point to point connections are referenced with numeral 22; for simplicity, others are not explicitly referenced). If, however, buffer chip 20A determines that the address received on first point to point interconnect 18A does not address the address space of carrier 16A, buffer chip 20A retransmits the address and command on point to point interconnect 18B to a second buffer chip 20B. Buffer chip 20B is mounted on carrier 16B and is coupled to memory chips 14 on carrier 16B. If buffer chip 20B determines that the address is not for an address space on carrier 16B, buffer chip 20B further re-drives the address and command on point to point interconnect 18C to a third buffer chip (not shown). If buffer chip 20B determines that the address is for the address space on carrier 16B, buffer chip 20B drives address and control information on multidrop interconnection 21B.
Data is sent, as described above, on point to point interconnections 22 between buffer chip 20B and memory chips 14 on carrier 16B (as before, four point to point connections 22 are shown referenced). Thus, the address and command data is “daisy-chained” from one buffer chip 20 to another, with the appropriate buffer chip reading or writing data from/onto point to point interconnects 19 (shown as 19A-19C in FIG. 1). A problem with this approach is that buffer chips are required. Buffer chip 20 takes up area on carrier 16, and dissipates power. In electronic packaging and system design, area and power consumption are typically desired to be minimized. Buffer chips also add cost to a memory system. Yet another problem in this implementation is that a first period of time (one or more cycles) is used to drive the address and command to a buffer chip and a second period of time (one or more cycles) is then used to drive the address on a carrier (e.g., carrier 16). Driving signals on carrier interconnect, such as copper wiring on a printed wiring board (PWB), requires significant area on the buffer chip for the off chip driver, and associated ESD (electrostatic discharge) circuitry. Ensuring that the chip-module-carrier-module-chip path is operational, and providing for diagnosis of faulty signaling paths, also often requires that some or all pins be driven by a common I/O circuit that can both drive and receive, thus increasing the size and complexity of the circuitry that drives (or receives).
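As an illustrative sketch only, the daisy-chained decode-or-re-drive behavior of the buffer chips of FIG. 1 can be modeled as follows. The `BufferChip` class, its `base` and `size` fields (a contiguous address range per carrier), and the hop counter are hypothetical and are introduced purely for this example; the memory chips on a carrier are modeled as a dictionary.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BufferChip:
    """Toy model of a buffer chip serving one carrier's address space."""
    base: int                                   # first address on this carrier (assumed)
    size: int                                   # number of addresses on this carrier (assumed)
    cells: dict = field(default_factory=dict)   # stand-in for the carrier's memory chips
    next_chip: Optional["BufferChip"] = None    # next buffer chip in the chain

    def handle(self, command, address, data=None, hops=0):
        """Service the command locally or re-drive it to the next chip.

        Returns (hops, value): how many times the command was re-driven,
        and the data read (None for a write).
        """
        if self.base <= address < self.base + self.size:
            if command == "write":
                self.cells[address] = data
                return hops, None
            return hops, self.cells.get(address)
        if self.next_chip is None:
            raise ValueError("address not mapped by any carrier in the chain")
        # Address is not on this carrier: retransmit to the next buffer chip.
        return self.next_chip.handle(command, address, data, hops + 1)


# Two carriers chained together, loosely mirroring 20A -> 20B.
chip_b = BufferChip(base=0x100, size=0x100)
chip_a = BufferChip(base=0x000, size=0x100, next_chip=chip_b)
chip_a.handle("write", 0x150, data=42)   # re-driven once, stored on the second carrier
```

Each re-drive in this model corresponds to an additional period of one or more cycles before the command reaches the carrier that services it.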
Current memory systems comprise one or more data busses that carry write data from the memory controller to the memory chips for storing in arrays in the memory chips. The data busses also carry read data to the memory controller from the memory chips. The data busses of current memory systems have a fixed bandwidth for write data versus read data. Different workloads in computer systems require different read bandwidths versus write bandwidths. For example, a numerically intensive workload is best served with a read bandwidth approximately the same as a write bandwidth. In contrast, a commercial database workload requires more read bandwidth than write bandwidth. Current memory systems are unable to accommodate different read versus write bandwidths other than by running at least some workloads inefficiently.
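The inefficiency of a fixed read versus write bandwidth can be illustrated with a small calculation. The following is a minimal sketch under stated assumptions: read and write traffic are assumed to move concurrently over separate, fixed portions of the bus, bandwidths are in arbitrary bytes per cycle, and the workload sizes are hypothetical numbers chosen only for the example.

```python
def fixed_split_time(read_bytes, write_bytes, read_bw, write_bw):
    """Cycles to finish when reads and writes run concurrently at fixed,
    separate bandwidths; the slower direction determines the total."""
    return max(read_bytes / read_bw, write_bytes / write_bw)


def ideal_time(read_bytes, write_bytes, total_bw):
    """Cycles needed if the total bandwidth could be apportioned to
    match the workload's read/write mix exactly."""
    return (read_bytes + write_bytes) / total_bw


def efficiency(read_bytes, write_bytes, read_bw, write_bw):
    """Fraction of the ideal throughput achieved by the fixed split."""
    total_bw = read_bw + write_bw
    return ideal_time(read_bytes, write_bytes, total_bw) / fixed_split_time(
        read_bytes, write_bytes, read_bw, write_bw
    )


# A bus split evenly between reads and writes (4 bytes/cycle each).
balanced = efficiency(400, 400, read_bw=4, write_bw=4)     # equal read and write traffic
read_heavy = efficiency(600, 200, read_bw=4, write_bw=4)   # three times more reads than writes
```

Under these assumptions the balanced workload uses the evenly split bus fully, while the read-heavy workload leaves part of the write bandwidth idle and completes more slowly than the same total bandwidth would allow if it could be reapportioned.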
Therefore, there is a need for further improvement in a fast and efficient memory system.