Semiconductor memories can generally be divided into two broad categories based on their volatility, that is, whether they lose or retain data in the absence of supplied power. Volatile memory can be sub-classified based on the method of data retention used, dynamic or static. Dynamic volatile memory, known as Dynamic Random-Access Memory (DRAM), requires periodic refreshing to retain data, while static volatile memory, known as Static Random-Access Memory (SRAM), does not. Non-volatile memory can be sub-classified based on how many times the memory can be written: memory that can be written to only once versus memory that can be rewritten. One-time write memories include mask programmable and fuse programmable Read-Only Memory (ROM), and rewritable memories include Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memory, and Ferroelectric Random-Access Memory (FeRAM), among others. The different types of semiconductor memory are summarized in J. M. Rabaey et al., Digital Integrated Circuits: A Design Perspective. Prentice Hall, Upper Saddle River, N.J., 2nd edition, 2003.
DRAM is the densest, and therefore the least expensive per bit, of all types of semiconductor memory that can be written during system operation. It offers good read and write performance, but consumes a fairly substantial amount of power compared with other memories. SRAM is more expensive per bit than DRAM, but offers the best performance of any rewritable semiconductor memory. Non-volatile memories, in general, have reasonably high density, and often have similar read performance to DRAM; however, most presently available rewritable non-volatile memories have poor write performance. Furthermore, they are only capable of a limited number of rewrites, which prevents them from competing with DRAM as an inexpensive, well-performing memory for computing systems.
A wealth of further information on the various types of semiconductor memories and their operation can be found in K. Itoh, VLSI Memory Chip Design. Springer-Verlag, Berlin, 2001.
DRAM
DRAMs are inexpensive in terms of cost per bit. At the same time, DRAMs offer relatively low latency access combined with a good throughput rate. These characteristics are the result of a design philosophy that focuses first on creating a memory that is as dense as possible, and then on encapsulating it in an architecture that maximizes performance.
The salient characteristic of a modern DRAM is the use of the one transistor and one capacitor (1T1C or sometimes just 1T) unit storage cell. This cell was disclosed by Robert H. Dennard of IBM in 1968. The 1T1C cell can be designed in an extremely small area on an integrated circuit, allowing a large number of cells on a single chip, so that the resulting product has a low production cost per bit. The challenge in using the 1T1C cell is operating it in a large scale array with good performance. This requires an elaborate organization scheme that is described later in this background.
The 1T1C cell stores a bit by representing that bit as a charge stored across the cell capacitor. A positive charge represents a logic “high,” while a negative charge represents a logic “low.” The access transistor functions as a switch, controlled by the wordline, that couples the cell capacitor to the bitline.
To store a bit in a cell, a “high” voltage level is applied to the wordline connected to the desired cell. At the same time, the voltage level to be stored is applied to the bitline that is connected to the desired cell. With the access transistor active, the cell capacitor is charged from the bitline with a “high” or “low” charge. The access transistor is then deactivated and the stored charge remains on the cell capacitor.
To read a bit from the cell, the capacitive bitline is left floating at a precharge voltage (usually Vdd/2), and the access transistor is activated. If a “high” voltage level is stored in the cell, then the bitline voltage will increase; if a “low” voltage level is stored, then the bitline voltage will decrease. Either way, a sense amplifier on the bitline detects the change in voltage and amplifies this voltage to a full logic “high” or “low” level. Because the access transistor is still active (conducting) when this amplification occurs, the cell that was read is restored to its full original logic level. Once this restoration is complete, the access transistor is deactivated, and a new operation can begin.
The extent to which the bitline voltage increases or decreases during a read operation is determined by capacitive charge sharing. Both the bitline and memory cell have a fixed capacitance, with the bitline capacitance normally being five to ten times larger than that of the memory cell. When the access transistor is activated for a read operation, the charge on the cell capacitor is shared with the charge on the bitline to generate a change in bitline voltage given by dV=(Vcell−Vpre)×(Cs÷(Cb+Cs)), where dV is the change in bitline voltage due to charge sharing, Vcell is the stored cell voltage, Vpre is the bitline precharge voltage, Cs is the cell capacitance, and Cb is the bitline capacitance. Because the bitline capacitance is much larger than the cell capacitance, the value of dV is normally relatively small. For that reason, the sense amplifier has to be very sensitive to small changes in voltage in order to adequately detect stored logic levels.
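The charge-sharing relation above can be checked numerically with a short Python sketch. The supply voltage, cell capacitance, and the 8:1 capacitance ratio are illustrative assumptions chosen to fall within the five-to-ten-times range stated above; they are not values taken from any particular device.

```python
def bitline_swing(v_cell, v_pre, c_cell, c_bitline):
    """Change in bitline voltage from charge sharing:
    dV = (Vcell - Vpre) * Cs / (Cb + Cs)."""
    return (v_cell - v_pre) * c_cell / (c_bitline + c_cell)


# Illustrative values only (assumptions, not from the source text).
VDD = 1.8                 # supply voltage, volts
V_PRE = VDD / 2           # bitline precharge voltage
C_CELL = 30e-15           # cell capacitance Cs, farads
C_BITLINE = 8 * C_CELL    # bitline capacitance Cb, eight times Cs

dv_high = bitline_swing(VDD, V_PRE, C_CELL, C_BITLINE)  # cell stores "high"
dv_low = bitline_swing(0.0, V_PRE, C_CELL, C_BITLINE)   # cell stores "low"

print(f"dV for stored high: {dv_high * 1e3:+.1f} mV")
print(f"dV for stored low:  {dv_low * 1e3:+.1f} mV")
```

Under these assumed values the bitline moves by roughly 100 mV in either direction, a small fraction of the 1.8 V supply, which illustrates why the sense amplifier must resolve small voltage differences.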
DRAM operation is thoroughly described in B. Keeth and R. J. Baker, DRAM Circuit Design: A Tutorial, IEEE Press, Piscataway, N.J., 2001.
DRAM Architecture and Organization
Modern DRAMs are partitioned into core and periphery regions. The core region consists of the memory cells along with supporting circuits that are repeated at a frequency equal to an integer multiple of that of the bitlines or wordlines. The periphery region consists of control circuitry, I/O pads, data buffers, synchronization circuitry, voltage conversion circuitry, and other circuits whose functions relate directly to the specific architecture in which they are employed. The core region is sub-organized into an array region, which contains the memory cells themselves, and another region that contains sense amplifiers, hierarchical wordline drivers, and bitline twist strips.
The memory array is a two-dimensional array of memory cells, with wordlines running parallel in one dimension (normally referred to as the “X” dimension) and bitlines running parallel in the other dimension (normally referred to as the “Y” dimension), such that wordlines and bitlines are perpendicular to each other. Due to this organization, the group of cells connected to a single wordline is often referred to as a “row,” and the group of cells connected to a single bitline or a bitline pair is often referred to as a “column.”
There are two bitline structures used in modern DRAMs. These are the open bitline structure, originally introduced in Karl Stein et al., Storage array and sense/refresh circuit for single-transistor memory cells, IEEE Journal of Solid-State Circuits, SC-7(5):336-340, October 1972, and the folded bitline structure, disclosed by Robert F. Harland in 1977. Every commercial DRAM produced today uses one of these two bitline organizations, or else a direct variant or a hybrid of the two.
The open bitline organization, also referred to as “crosspoint” organization, is a simple scheme in which a memory cell resides at every intersection of a wordline and a bitline. In an open bitline structure, each bitline within an array is independent of its neighbors and is connected to a separate sense amplifier. The folded bitline organization, on the other hand, is a scheme in which a memory cell resides only at every second intersection between a wordline and a bitline. Each pair of adjacent bitlines in a folded array is connected to a single sense amplifier.
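The difference between the two organizations can be illustrated with a small Python sketch that counts cells and sense amplifiers for a toy array. The function names are hypothetical, and the checkerboard placement used for the folded case is only one assumed way of satisfying "every second intersection"; real layouts may arrange the cells differently.

```python
def has_cell(wordline, bitline, organization):
    """Return True if a memory cell sits at this wordline/bitline crossing."""
    if organization == "open":
        return True                           # a cell at every intersection
    if organization == "folded":
        return (wordline + bitline) % 2 == 0  # assumed checkerboard placement
    raise ValueError(f"unknown organization: {organization}")


def count_cells(n_wordlines, n_bitlines, organization):
    """Count the cells in an array of the given size."""
    return sum(
        has_cell(w, b, organization)
        for w in range(n_wordlines)
        for b in range(n_bitlines)
    )


def count_sense_amplifiers(n_bitlines, organization):
    """One amplifier per bitline (open) or per adjacent bitline pair (folded)."""
    return n_bitlines if organization == "open" else n_bitlines // 2


for org in ("open", "folded"):
    print(org, count_cells(4, 4, org), count_sense_amplifiers(4, org))
```

For the same 4-by-4 grid of lines, the open organization holds 16 cells and the folded organization holds 8, which reflects the density cost of leaving every second intersection empty.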
The memory core is organized hierarchically, with sub-arrays as the base elements. The capacitance of a bitline needs to be minimized so that a sufficiently strong signal is developed during a read. Furthermore, the wordline and bitline capacitances both need to be minimized so as to reduce power consumption and improve performance. To accomplish this, the memory array is broken into sub-arrays that each have their own supporting circuitry. A typical sub-array in a 1-Gb DRAM with folded bitlines has 256 cells per bitline and 512 cells per “sub-wordline”, with a few additional lines of both types to provide static redundancy and to avoid photolithographic problems. The term “sub-wordline” is used to describe wordline segments so as to distinguish them from global wordlines that run across multiple sub-arrays.
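As a rough worked example, the following Python sketch estimates the size of such a sub-array and the number of sub-arrays needed for a 1-Gb device. It assumes that, in a folded array, a cell occupies every second wordline/bitline intersection, that 1 Gb means 2^30 bits, and that the redundant and dummy lines are ignored; the resulting figures are estimates under those assumptions, not values stated in the cited references.

```python
CELLS_PER_BITLINE = 256        # from the text
CELLS_PER_SUB_WORDLINE = 512   # from the text
CHIP_BITS = 2**30              # assumption: 1 Gb = 2^30 bits

# Assumption: with folded bitlines only every second intersection holds a
# cell, so each line crosses twice as many lines as it has cells.
sub_wordlines = 2 * CELLS_PER_BITLINE      # wordlines crossed by one bitline
bitlines = 2 * CELLS_PER_SUB_WORDLINE      # bitlines crossed by one wordline

cells_per_sub_array = sub_wordlines * bitlines // 2
sub_arrays = CHIP_BITS // cells_per_sub_array

print(f"{sub_wordlines} sub-wordlines x {bitlines} bitlines")
print(f"{cells_per_sub_array} cells per sub-array")
print(f"{sub_arrays} sub-arrays for a 1-Gb device")
```

Under these assumptions each sub-array stores 256 Kb, so several thousand sub-arrays, each with its own supporting circuitry, are needed to build the full device.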
Mated with each sub-array is a block of sense amplifiers, and, when a hierarchical wordline scheme is used, a local row decoder block. Each of these sub-array units with their surrounding logic is repeated numerous times in two dimensions to form a block of the core. Each core block then has associated column decode, global row decode, and control logic involved in controlling data flow and moving data from the core to the periphery.
Sensing Techniques
One of the most defining aspects of a DRAM design is the sensing technique used to read (and restore) data from a memory cell. This section briefly presents the most relevant sensing techniques.
The earliest DRAMs used three-transistor memory cells with separate bitlines for reading and writing, and employed single-ended (non-differential) sense amplifiers, as described in W. M. Regitz and J. A. Karp, Three-transistor-cell 1024-bit 500-ns MOS RAM, IEEE Journal of Solid-State Circuits, SC-5(5):181-186, October 1970. The sense amplifier design that supported the two-bitline columns consisted of four transistors: one for precharging the “write” bitline, a column enable transistor, a bias transistor, and a sensing transistor whose gate connected to the “read” bitline. When a cell was selected and a signal was transferred to the “read” bitline, the sense transistor would drive a near-zero current to an output pin to indicate a logic “high” or a 400-μA current to indicate a logic “low.” Today, single-ended sensing is rarely used, since it does not provide sufficient performance for most modern memories.
The most popular sensing technique in modern DRAMs is differential voltage sensing. The advantages of using differential sense amplifiers are that they are non-inverting, they reject common mode noise, they are insensitive to process variations, they are very sensitive to small signals, and they can operate very quickly. Furthermore, by using positive feedback in a differential amplifier, reading and restoring are merged into a single operation, and performance is improved.
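The common mode noise rejection mentioned above can be illustrated with a minimal Python sketch that compares an idealized single-ended decision against an idealized differential one. The voltage levels, the noise magnitude, and the function names are illustrative assumptions; real amplifiers are analog circuits, not simple comparisons.

```python
def single_ended_sense(v_bitline, v_reference):
    """Compare one bitline against a fixed reference voltage."""
    return 1 if v_bitline > v_reference else 0


def differential_sense(v_bitline, v_complement):
    """Compare a bitline against its complementary (reference) bitline."""
    return 1 if v_bitline > v_complement else 0


# Illustrative values only (assumptions, not from the source text).
V_PRE = 0.9      # precharge voltage, volts
SIGNAL = 0.1     # bitline swing produced by a stored "high"
NOISE = -0.15    # noise coupled equally onto both bitlines

bitline = V_PRE + SIGNAL + NOISE   # bitline carrying the cell signal
complement = V_PRE + NOISE         # complementary bitline sees the same noise

print("single-ended:", single_ended_sense(bitline, V_PRE))
print("differential:", differential_sense(bitline, complement))
```

In this sketch the single-ended decision misreads the stored "high" because the noise pushes the bitline below the fixed reference, while the differential decision is unaffected because the same noise appears on both inputs.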
Another sensing technique called “direct sensing” separates the read (output) lines from the write (input) lines, as described in K. Itoh, VLSI Memory Chip Design, Springer-Verlag, Berlin, 2001. It uses a conventional differential voltage sense amplifier to detect the stored cell voltage; however, it decouples the I/O lines from the bitlines, allowing significantly faster sensing at the expense of additional area.
Another sensing technique called “time multiplexed sensing” allows performance to be sacrificed for a substantial gain in density and some improvement in noise rejection. In time multiplexed sensing, one differential sense amplifier services multiple bitline pairs. Each pair is sensed (and restored) in sequence, and the result of each sense operation is transferred out of the memory core via a global bitline or data bus. This sharing of a single sense amplifier between a group of bitlines significantly reduces the sense amplifier area in a DRAM, which in turn reduces the area of the DRAM chip as a whole. This technique is described in T. Hasegawa et al., An experimental DRAM with a NAND-structured cell, IEEE Journal of Solid-State Circuits, 28(11):1099-1104, November 1993.
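The sharing of one amplifier among several bitline pairs can be sketched in Python as follows. The function names, the voltages, and the idealized comparison standing in for the amplifier are hypothetical and serve only to show the sequencing and the area trade-off.

```python
def sense_pair(v_bitline, v_complement):
    """Idealized differential sense of a single bitline pair."""
    return 1 if v_bitline > v_complement else 0


def time_multiplexed_read(bitline_pairs):
    """Sense each pair in sequence with one shared amplifier and collect
    the results as they would appear on the global data bus."""
    data_bus = []
    for v_bitline, v_complement in bitline_pairs:
        data_bus.append(sense_pair(v_bitline, v_complement))  # one sense cycle
    return data_bus


def sense_amplifier_count(n_pairs, pairs_per_amplifier):
    """Amplifiers needed when each one services several pairs (rounded up)."""
    return -(-n_pairs // pairs_per_amplifier)


# Four bitline pairs sharing one amplifier (illustrative voltages).
pairs = [(1.0, 0.9), (0.8, 0.9), (0.95, 0.9), (0.85, 0.9)]
bits = time_multiplexed_read(pairs)

print(bits)
print(sense_amplifier_count(512, 1), sense_amplifier_count(512, 4))
```

With four pairs per amplifier, the number of amplifiers for 512 pairs drops from 512 to 128, at the cost of four sequential sense cycles instead of one.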