1. Field of the Invention
The present invention is directed to integrated circuit memory design and, more particularly, to dynamic random access memory (DRAM) designs.
2. Description of the Background
1. Introduction
Random access memories (RAMs) are used in a large number of electronic devices from computers to toys. Perhaps the most demanding applications for such devices are computer applications in which high density memory devices are required to operate at high speeds and low power. To meet the needs of varying applications, two basic types of RAM have been developed. The dynamic random access memory (DRAM) is, in its simplest form, a capacitor in combination with a transistor which acts as a switch. The combination is connected across a digitline and a predetermined voltage with a wordline used to control the state of the transistor. The digitline is used to write information to the capacitor or read information from the capacitor when the signal on the wordline renders the transistor conductive.
In contrast, a static random access memory (SRAM) is comprised of a more complicated circuit which may include a latch. The SRAM architecture also uses digitlines for carrying
information to and reading information from each individual memory cell and wordlines to carry control signals.
There are a number of design tradeoffs between DRAM and SRAM devices. Dynamic devices must be periodically refreshed or the data stored will be lost. SRAM devices tend to have faster access times than similarly sized DRAM devices. SRAM devices tend to be more expensive than DRAM devices because the simplicity of the DRAM architecture allows for a much higher density memory to be constructed. For those reasons, SRAM devices tend to be used as cache memory whereas DRAM devices tend to be used to provide the bulk of the memory requirements. As a result, there is tremendous pressure on producers of DRAM devices to produce higher density devices in a cost effective manner.
2. DRAM Architecture
A DRAM chip is a sophisticated device which may be thought of as being comprised of two portions: the array, which is comprised of a plurality of individual memory cells for storing data, and the peripheral devices, which are all of the circuits needed to read information into and out of the array and support the other functions of the chip. The peripheral devices may be further divided into data path elements, address path elements, and all other circuits such as voltage regulators, voltage pumps, redundancy circuits, test logic, etc.
A. The Array
Turning first to the array, the topology of a modern DRAM array 1 is illustrated in FIG. 1. The array 1 is comprised of a plurality of cells 2 with each cell constructed in a similar manner. Each cell is comprised of a rectangular active area, which in FIG. 1 is a N+ active area. A dotted box 3 illustrates where one transistor/capacitor pair is fabricated while a dotted box 4 illustrates where a second transistor/capacitor pair is fabricated. A wordline WL1 runs through dotted box 3, and at least a portion of where that wordline overlays the N+ active area is where the gate of the transistor is formed. To the left of the wordline WL1 in dotted box 3, one terminal of the transistor is connected to a storage node 5 which forms the capacitor. The other terminal of the capacitor is connected to a cell plate. To the right of the wordline WL1, the other terminal of the transistor is connected to a digitline D2 at a digitline contact 6. The transistor/capacitor pair in dotted box 4 is a mirror image of the transistor/capacitor pair in dotted box 3. The transistor within dotted box 4 is connected to its own wordline WL2 while sharing the digitline contact 6 with the transistor in the dotted box 3.
The wordlines WL1 and WL2 may be constructed of polysilicon while the digitline may be constructed of polysilicon or metal. The capacitors may be formed with an oxide-nitride-oxide-dielectric between two polysilicon layers. In some processes, the wordline polysilicon is silicided to reduce the resistance which permits longer wordline segments without impacting speed.
The digitline pitch, which is the width of the digitline plus the space between digitlines, dictates the active area pitch and the capacitor pitch. Process engineers adjust the active area width and the resulting field oxide width to maximize transistor drive and minimize transistor-to-transistor leakage. In a similar manner, the wordline pitch dictates the space available for the digitline contact, transistor length, active area length, field poly width, and capacitor length. Each of those features is closely balanced by process engineers to maximize capacitance and yield and to minimize leakage.
B. The Data Path Elements
The data path is divided into the data read path and the data write path. The first element of the data read path, and the last element of the data write path, is the sense amplifier. The sense amplifier is actually a collection of circuits that pitch up to the digitlines of a DRAM array. That is, the physical layout of each circuit within the sense amplifier is constrained by the digitline pitch. For example, the sense amplifiers for a specific digitline pair are generally laid out within the space of four digitlines. One sense amplifier for every four digitlines is commonly referred to as quarter pitch or four pitch.
The circuits typically comprising the sense amplifier include isolation transistors, circuits for digitline equilibration and bias, one or more N-sense amplifiers, one or more P-sense amplifiers, and I/O transistors for connecting the digitlines to the I/O signal lines. Each of those circuits will be discussed.
Isolation transistors provide two functions. First, if the sense amplifiers are positioned between and connected to two arrays, they electrically isolate one of the two arrays. Second, the isolation transistors provide resistance between the sense amplifier and the highly capacitive digitlines, thereby stabilizing the sense amplifier and speeding up the sensing operation. The isolation transistors are responsive to a signal produced by an isolation driver. The isolation driver drives the isolation signal to the supply potential and then drives the signal to a pumped potential which is equal to the value of the charge on the digit lines plus the threshold voltage of the isolation transistors.
The purpose of the equilibration and bias circuits is to ensure that the digitlines are at the proper voltages to enable a read operation to be performed. The N-sense amplifiers and P-sense amplifiers work together to detect the signal voltage appearing on the digitlines in a read operation and to locally drive the digitlines in a write operation. Finally, the I/O transistors allow data to be transferred between digitlines and I/O signal lines.
After data is read from an mbit and latched by the sense amplifier, it propagates through the I/O transistors onto the I/O signal lines and into a DC sense amplifier. The I/O lines are equilibrated and biased to a voltage approaching the peripheral voltage Vcc. The DC sense amplifier is sometimes referred to as the data amplifier or read amplifier. The DC sense amplifier is a high speed, high gain, differential amplifier for amplifying very small read signals appearing on the I/O lines into full CMOS data signals input to an output data buffer. In most designs, the array sense amplifiers have very limited drive capability and are unable to drive the I/O lines quickly. Because the DC sense amplifier has a very high gain, it amplifies even the slightest separation in the I/O lines into full CMOS levels.
The read data path proceeds from the DC sense amplifier to the output buffers either directly or through data read multiplexers (muxes). Data read muxes are commonly used to accommodate multiple part configurations with a single design. For an x16 part, each output buffer has access to only one data read line pair. For an x8 part, the eight output buffers each have two pairs of data lines available thereby doubling the quantity of mbits accessible by each output. Similarly, for a x4 part, the four output buffers have four pairs of datalines available, again doubling the quantity of mbits available for each output.
The final element in the read data path is the output buffer circuit. The output buffer circuit consists of an output latch and an output driver circuit. The output driver circuit typically uses a plurality of transistors to drive an output pad to a predetermined voltage, Vccx or ground, typically indicating a logic level 1 or logic level 0, respectively.
A typical DRAM data path is bidirectional, allowing data to be both read from and written to the array. Some circuits, however, are truly bidirectional, operating the same regardless of the direction of the data. An example of such bidirectional circuits is the sense amplifiers. Most of the circuits, however, are unidirectional, operating on data in only a read operation or a write operation. The DC sense amplifiers, data read muxes, and output buffer circuits are examples of unidirectional circuits. Therefore, to support data flow in both directions, unidirectional circuits must be provided in complementary pairs, one for reading and one for writing. The complementary circuits provided in the data write path are the data input buffers, data write muxes, and write driver circuits.
The data input buffers consist of both nMOS and pMOS transistors, basically forming a pair of cascaded inverters. Data write muxes, like data read muxes, are often used to extend the versatility of a design. While some DRAM designs connect the input buffer directly to the write driver circuits, most architectures place a block of data write muxes between the input buffers and the write drivers. The muxes allow a given DRAM design to support multiple configurations, such as x4, x8, and x16 parts. For x16 operation, each input buffer is muxed to only one set of data write lines. For x8 operation, each input buffer is muxed to two sets of data write lines, doubling the quantity of mbits available to each input buffer. For x4 operation, each input buffer is muxed to four sets of data writelines, again doubling the number of mbits available to the remaining four operable input buffers. As the quantity of input buffers is reduced, the amount of column address space is increased for the remaining buffers.
A given write driver is generally connected to only one set of I/O lines, unless multiple sets of I/O lines are fed by a single write driver via additional muxes. The write driver uses a tri-state output stage to connect to the I/O lines. Tri-state outputs are necessary because the I/O lines are used for both read and write operations. The write driver remains in a high impedance state unless the signal labeled “write” is high, indicating a write operation. The drive transistors are sized large enough to insure a quick, efficient, write operation.
The remaining element of the data write path is, as mentioned, the bidirectional sense amplifier which is connected directly to the array.
C. The Address Path Elements
Up to this point we have been discussing data paths. The movement of data into or out of a particular location within the array is performed under the control of address information. We next turn to a discussion of the address path elements.
Since the 4 Kb generation of DRAMs, DRAMs have used multiplexed addresses. Multiplexing in DRAMs is possible because DRAM operation is sequential. That is, column operations follow row operations. Thus, the column address is not needed until the sense amplifiers for an identified row have latched, and that does not occur until sometime after the wordline has fired. DRAMs operate at higher current levels with multiplexed addressing, because an entire page (row address) is opened with each row access. That disadvantage is overcome by the lower packaging costs associated with multiplexed addresses. Additionally, because of the presence of the column address strobe signal (CAS*), column operation is independent of row operation, enabling a page to remain open for multiple, high-speed, column accesses. That page mode type of operation improves system performance because column access time is much shorter than row access time. Page mode operation appears in more advanced forms, such as extended data out (EDO) and burst EDO (BEDO), providing even better system performance through a reduction in effective column access time.
The address path for a DRAM can be broken into two parts: the row address path and the column address path. The design of each path is dictated by a unique set of requirements. The address path, unlike the data path, is unidirectional. That is, address information flows only into the DRAM. The address path must achieve a high level of performance with minimal power and die area, just like every other aspect of DRAM design. Both paths are designed to minimize propagation delay and maximize DRAM performance.
The row address path encompasses all of the circuits from the address input pad to the wordline driver. Those circuits generally include the row address input buffers, CAS before RAS counter (CBR counter), predecode logic, array buffers, redundancy logic (treated separately hereinbelow), row decoders, and phase drivers.
The row address buffer consists of a standard input buffer and the additional circuits necessary to implement functions required for the row address path. The CBR counter consists of a single inverter and a pair of inverter latches coupled to a pair of complementary muxes to form a one bit counter. All of the CBR counters from each row address buffer are cascaded together to form a CBR ripple counter. By cycling through all possible row address combinations in a minimum of clock pulses, the CBR ripple counter provides a simple means of internally generating refresh addresses.
There are many types of predecode logic used for the row address path. Predecoded address lines may be formed by logically combining (AND) addresses as shown in Table 1.
TABLE 1Predecoded address truth tablePR01RA <0>RA <1>PR01 (n)PR01 <0>PR01 <1>PR01 <2><3>0001000101010001200101130001The remaining addresses are identically coded except for RA<12>, which is essentially a “don't care”. Advantages to predecoded addresses include lower power due to fewer signals making transitions during address changes and higher efficiency because of the reduced number of transistors necessary to decode addresses. Predecoding is especially beneficial in redundancy circuits. Predecoded addresses are used throughout most DRAM designs today.
Array buffers drive the predecoded address signals into the row decoders. In general, the buffers are no more than cascaded inverters, but in some cases they may include static logic gates or level translators, depending upon the row decoder requirements.
Row decoders must pitch up to the mbit arrays. There are a variety of implementations, but however implemented, the row decoder essentially consists of two elements: a wordline driver and an address decoder tree. With respect to the wordline driver, there are three basic configurations: the NOR driver, the inverter (CMOS) driver, and the bootstrap driver. Just about any type of logic may be used for the address decoder tree. Static logic, dynamic logic such as precharge and evaluate logic, pass gate logic, or some combination thereof may be provided to decode the predecoded address signals.
Additionally, the drivers and associated decode trees can be configured either as local row decodes for each array section or as global row decodes that drive a multitude of array sections.
The wordline driver in the row decoder causes the wordline to fire in response to a signal called PHASE. Essentially, the PHASE signal is the final address term to arrive at the wordline driver. Its timing is carefully determined by the control logic. PHASE cannot fire until the row addresses are set up in the decode tree. Normally, the timing of PHASE also includes enough time for the row redundancy circuits to evaluate the current address. The phase driver can be composed of standard static logic gates.
The column address path consists of the input buffers, address transition detection (ATD) circuits, predecode logic, redundancy logic (discussed below), and column decoders. The column address input buffers are similar in construction and operation to the row address input buffers. The ATD circuit detects any transition that occurs on an address pin to which the circuit is dedicated. ATD output signals from all of the column addresses are routed to an equilibration driver circuit. The equilibration driver circuit generates a set of equilibration signals for the DRAM. The first of these signals is Equilibrate I/O (EQIO) which is used in the arrays to force equilibration of the I/O lines. The second signal generated by the equilibration driver is called Equilibrate Sense Amps (EQSA). That signal is generated from address transitions occurring on all of the column addresses, including the least significant address.
The column addresses are fed into predecode logic which is very similar to the row address predecode logic. The address signals emanating from the predecode logic are buffered and distributed throughout the die to feed the column decoders.
The column decoders represent the final elements that must pitch up to the array mbits. Unlike row decoder implementation, though, column decoder implementation is simple and straightforward. Static logic gates may be used for both the decode tree elements and the driver output. Static logic is used primarily because of the nature of column addressing. Unlike row addressing, which occurs once per RAS* cycle with a modest precharge period until the next cycle, column addressing can occur multiple times per RAS* cycle. Each column is held open until a subsequent column appears. In a typical implementation, the address tree consists of combinations of NAND or NOR gates. The column decoder output driver is a simple CMOS inverter.
The row and column addressing scheme impacts the refresh rate for the DRAM. Normally, when refresh rates change for a DRAM, a higher order address is treated as a “don't care” address, thereby decreasing the row address space, but increasing the column address space. For example, a 16 Mb DRAM bonded as a 4 Mb x4 part could be configured in several refresh rates: 1K, 2K, and 4K. Table 1 below shows how row and column addressing is related to those refresh rates for the 16 Mb example. In this example, the 2K refresh rate would be more popular because it has an equal number of row and column addresses, sometimes referred to as square addressing.
TABLE 2Refresh rate versus row and column addressesRefreshRowColumnRateRowsColumnsAddressesAddresses4K4096102412102K2048204811111K102440961012
D. Other Circuits
Additional circuits are provided to enable various other features. For example, circuits to enable test modes are typically included in DRAM designs to extend test capabilities, speed component testing, or subject a part to conditions that are not seen during normal operation. Two examples are address compression and data compression which are two special test modes usually supported by the design of the data path. Compression test modes yield shorter test times by allowing data from multiple array locations to be tested and compressed on-chip, thereby reducing the effective memory size. The costs of any additional circuitry to implement test modes must be balanced against cost benefits derived from reductions in test time. It is also important that operation in test mode achieve 100% correlation to operation of non-test mode. Correlation is often difficult to achieve, however, because additional circuitry must be activated during compression, modifying the noise and power characteristics on the die.
Additional circuitry is added to the DRAM to provide redundancy. Redundancy has been used in DRAM designs since the 256 Kb generation to improve yield. Redundancy involves the creation of spare rows and columns which can be used as a substitute for normal rows and columns, respectively, which are found to be defective. Additional circuitry is provided to control the physical encoding which enables the substitution of a usable device for a defective device. The importance of redundancy has continued to increase as memory density and size have increased.
The concept of row redundancy involves replacing bad wordlines with good wordlines. The row to be repaired is not physically replaced, but rather it is logically replaced. In essence, whenever a row address is strobed into a DRAM by RAS*, the address is compared to the addresses of known bad rows. If the address comparison produces a match, then a replacement wordline is fired in place of the normal (bad) wordline. The replacement wordline can reside anywhere on the DRAM. Its location is not restricted to the array that contains the normal wordline, although architectural considerations may restrict its range. In general, the redundancy is considered local if the redundant wordline and normal wordline must always be on the same subarray.
Column redundancy is a second type of repair available in most DRAM designs. Recall that column accesses can occur multiple times per RAS* cycle. Each column is held open until a subsequent column appears. Because of that, circuits that are very different from those seen in the row redundancy are used to implement column redundancy.
The DRAM circuit also carries a number of circuits for providing the various voltages used throughout the circuit.
3. Design Considerations
U.S. patent application Ser. No. 08/460,234, entitled Single Deposition Layer Metal Dynamic Random Access Memory, filed 17 Aug. 1995 and assigned to the same assignee as the present invention is directed to a 16 Meg DRAM. U.S. patent application Ser. No. 08/420,943, entitled Dynamic Random Access Memory, filed 4 Jun. 1995 and assigned to the same assignee as the present invention is directed to a 64 Meg DRAM. As will be seen from a comparison of the two aforementioned patent applications, it is not a simple matter to quadruple the size of a DRAM. Quadrupling the size of a 64 Meg DRAM to a 256 Meg DRAM poses a substantial number of problems for the design engineer. For example, to standardize the part so that 256 Meg DRAMs from different manufacturers can be interchanged, a standard pin configuration has been established. The location of the pins places constraints on the design engineer with respect to where circuits may be laid out on the die. Thus, the entire layout of the chip must be reengineered so as to minimize wire runs, eliminate hot spots, simplify the architecture, etc.
Another problem faced by the design engineer in designing a 256 Meg DRAM is the design of the array itself. Using prior art array architectures does not provide sufficient space for all of the components which must pitch up to the array.
Another problem involves the design of the data path. The data path between the cells and the output pads must be as short as possible so as to minimize line lengths to speed up part operation while at the same time present a design which can be manufactured using existing processes and machines.
Another problem faced by the design engineer involves the issue of redundancy. A 256 Meg DRAM requires the fabrication of millions of individual devices, and millions of contacts and vias to enable those devices to be interconnected. With such a large number of components and interconnections, even a very small failure rate results in a certain number of defects per die. Accordingly, it is necessary to design redundancy schemes to compensate for such failures. However, without practical experience in manufacturing the part and learning what failures are likely to occur, it is difficult to predict the type and amount of redundancy which must be provided.
Another problem involves latch-up in the isolation driver circuit when the pumped potential is driven to ground. Latch-up occurs when parasitic components give rise to the establishment of low-resistance paths between the supply potential and ground. A large amount of current flows along the low-resistance paths and device failure may result.
Designing the on-chip test capability also presents problems. Test modes, as opposed to normal operating modes, are used to test memory integrated circuits. Because of the limited number of pins available and the large number of components which must be tested, without some type of test compression architecture, the time which each DRAM would have to spend in a test fixture would be so long as to be commercially unreasonable. It is known to use test modes to reduce the amount of time required to test the memory integrated circuit, as well as to ensure that the memory integrated circuit meets or exceeds performance requirements. Putting a memory integrated circuit into a test mode is described in U.S. Pat. No. 5,155,704, entitled “Memory Integrated Circuit Test mode Switching” to Walther et al. However, because the test mode operates internal to the memory, it is difficult to determine whether the memory integrated circuit successfully completed one or more test modes. Therefore, a need exists for providing a solution to verify successful or unsuccessful execution of a test mode. Furthermore, it would be desirable that such a solution have minimal impact with respect to additional circuitry. Certain test modes, such as the all row high test mode, must be rethought with respect to a part as large as a 256 Meg chip because the current required for such a test would destroy power transistors servicing the array.
Providing power for a chip as large as a 256 Meg DRAM also presents its own set of unique problems. Refresh rates may cause the power needed to vary greatly. Providing voltage pumps and generators of sufficient size to provide the necessary power may result in noise and other undesirable side effects when maximum power is not required. Additionally, reconfiguring the DRAM to achieve a usable part in the event of component failure may result in voltage pumps and generators ill sized for the smaller part.
Even something as basic as powering up the device must be rethought in the context of such a large and complicated device as a 256 Meg DRAM. Prior art timing circuits use an RC circuit to wait a predetermined period of time and then blindly bring up the various voltage pumps and generators. Such systems do not receive feedback and, therefore, are not responsive to problems during power up. Also, to work reliably, such systems are conservative in the event some voltage pumps or generators operated more slowly than others. As a result, in most cases, the power up sequence was more time consuming than it needed to be. In a device as complicated as a 256 Meg DRAM, it is necessary to ensure that the device powers up in a manner that permits the device to be properly operated in a minimum amount of time.
All of the foregoing problems are superimposed upon the problems which every memory design engineer faces such as satisfying the parameters set for the memory, e.g., access time, power consumption, etc., while at the same time laying out each and every one of millions of components and interconnections in a manner so as to maximize yield and minimize defects. Thus, the need exists for a 256 Meg DRAM which overcomes the foregoing problems.