This invention generally relates to embedded dynamic random access memory (DRAM), and more particularly to a cell structure formed by two transistors and two capacitors for use in a system-on-chip embedded DRAM.
For several decades, one-transistor DRAMs have been the dominant choice for high-density, low-cost semiconductor memory in computing systems. Recently, advances in miniaturization have allowed a processor to be integrated on the same chip as the DRAM. Embedding a DRAM on the same chip as the processor not only reduces packaging cost but also significantly increases the available processor-to-memory bandwidth. Because of the smaller memory cell size, embedded DRAMs can be approximately three to six times denser than embedded static random access memories (eSRAMs), while operating at lower power dissipation and with a soft-error rate improved by a factor of 1000.
Conventional embedded DRAMs typically employ the one-transistor, one-capacitor dynamic random access memory cell (1T 1C DRAM) used in commodity DRAMs. FIGS. 1A and 1B, respectively, show a transistor level schematic and a cross-section thereof. FIG. 1A illustrates two DRAM cells 10A and 10B, each DRAM cell consisting of an NMOS transistor 11A (or 11B) and a capacitor 12A (or 12B). Because of its simplicity, the DRAM cell is as small as one-sixth to one-tenth the size of an SRAM cell having six transistors. The capacitor 12A (or 12B) makes use of a trench structure coupled to storage node 13A (or 13B), as illustrated in FIG. 1B, which shows a cross-sectional view of the same cells depicted at the transistor level in FIG. 1A. Alternatively, a planar or stack capacitor structure may be used in the cell; these structures are well known in the art and, as such, will not be discussed further. When wordline WLA (or WLB) is activated, NMOS transistor 11A (or 11B) couples capacitor 12A (or 12B) to bitline BLA (or BLB) through bitline contact 14A (or 14B). This creates a small voltage change on the bitline due to a charge sharing effect between capacitor 12A (or 12B) and BLA (or BLB). Note that charge sharing destroys the data bit in capacitor 12A (or 12B) (destructive read). The second bitline of the pair (i.e., BLB) remains at the pre-charge voltage and is used as the reference bitline. Bitline sense amplifier 15 is coupled to the bitline pair for reading the data bit and writing it back to the capacitor. The small voltage difference created on the bitline pair (BLA and BLB) is amplified by the sense amplifier. When the NMOS column switches (16A and 16B) are enabled by column select signal CSL, the sense amplifier also drives the data line pair (DLA and DLB). The sense amplifier drives the bitline pair according to the sensing result, allowing the read data bit to be rewritten to capacitor 12A (or 12B) (write back).
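The charge sharing effect described above follows from charge conservation between the cell capacitor and the bitline capacitance. The following is a minimal sketch of that relation only. The function name, the capacitance and supply values, and the choice of a half-supply pre-charge level are illustrative assumptions and are not taken from this description.

```python
def bitline_swing(v_cell, v_precharge, c_cell, c_bitline):
    """Voltage change seen on the bitline after the cell capacitor shares charge with it."""
    # Charge conservation: the settled voltage is the capacitance-weighted average
    # of the cell voltage and the bitline pre-charge voltage.
    v_final = (c_cell * v_cell + c_bitline * v_precharge) / (c_cell + c_bitline)
    return v_final - v_precharge


# Illustrative values only (assumptions, not from the text).
C_CELL = 30e-15      # cell capacitor, farads
C_BITLINE = 120e-15  # bitline capacitance, farads
VDD = 1.2            # supply voltage, volts
V_PRE = VDD / 2      # assumed half-supply pre-charge level

swing_one = bitline_swing(VDD, V_PRE, C_CELL, C_BITLINE)   # cell stores a high level
swing_zero = bitline_swing(0.0, V_PRE, C_CELL, C_BITLINE)  # cell stores a low level
print(f"stored high: {swing_one * 1e3:+.0f} mV, stored low: {swing_zero * 1e3:+.0f} mV")
```

Because the bitline capacitance is typically much larger than the cell capacitance, the resulting swing is only a fraction of the supply voltage, which is why a sense amplifier and a reference bitline are needed.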
During a write mode operation, the bitline pair BLA and BLB is driven by the data line pair (DLA and DLB) through the NMOS column switches (16A and 16B) to either low and high or, vice versa, to high and low, depending on the data pattern. Typically, a write mode is enabled after a read operation, because only the selected cells are in the write mode, while the data bits of the other cells are destroyed when wordline WL is activated (also referred to as a destructive write). The destroyed data bits must be written back by the sense amplifier simultaneously with the write data bits (read modified write). The destructive read followed by a write back, and the read modified write caused by the destructive write, require a longer cycle time than that of an SRAM cell. Compared with a similar operation in a commodity DRAM, the performance improvement of conventional embedded DRAMs is negligible, being limited by the read modified write operation. However, because of their inherent high density, they are successfully employed for graphic applications.
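The read modified write sequence described above can be illustrated with a small behavioral model. This is a sketch under simplifying assumptions: a row is modeled as a Python list of bits, a destroyed cell is represented by None, and the function names are hypothetical. It does not model timing or analog behavior.

```python
def activate_row(row):
    """Activate the wordline: sense amplifiers latch every bit, and the cells lose their data."""
    latched = list(row)
    for i in range(len(row)):
        row[i] = None  # charge has been shared onto the bitline; cell content is destroyed
    return latched


def read_modified_write(row, column, bit):
    """Write one bit into a row while restoring all other bits of that row."""
    latched = activate_row(row)  # destructive read of the whole row
    latched[column] = bit        # data lines override the sense amplifier on the selected column
    row[:] = latched             # write back the new bit together with the restored bits
    return row


row = [1, 0, 1, 1]
read_modified_write(row, column=1, bit=1)
print(row)  # [1, 1, 1, 1]
```

Even though only one column is written, every cell on the activated wordline passes through the sense and write back steps, which is the source of the longer cycle time noted above.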
To enhance their advantage, embedded DRAMs have adopted architectural changes relative to their stand-alone counterparts in order to improve bandwidth, latency, and memory cycle time. Because the I/O width of the embedded DRAM in an embedded system can be very large, the page mode operation that is commonly used in commodity DRAMs does not improve its performance. Instead, improved random access time (or latency) and cycle time (or address bandwidth) are paramount to boosting the system performance.
Random access performance improvement was first addressed by utilizing a short bitline and wordline array, also referred to as a micro-cell architecture. The micro-cell architecture is discussed in great detail in the article by T. Kimura et al., 64 Mb 6.8 ns random row access DRAM macro for ASICs, published in ISSCC Digest of Technical Papers, pp. 420-421, 1999. In order to further improve the random access performance, 2-port memory cells have been proposed and successfully implemented, as will be described hereinafter.
FIG. 2 shows a transistor level schematic for a conventional 2-port dynamic memory cell. It consists of two NMOS switching transistors 21 and 22, and one capacitor 23 (the combination being referred to as a 2T 1C cell). The gates of NMOS switching transistors 21 and 22 are coupled to two separate wordlines WL0 and WL1. By activating both WL0 and WL1, the memory cells coupled to WL0 and to WL1, respectively, can be simultaneously read or written through the corresponding bitlines BL0 and BL1. This simultaneous read or write feature allows the two ports of the 2-port memory cell to be used in an interleaved manner, thereby halving the cycle time. Alternatively, one of the two ports may be dedicated to refresh, so that the refresh operation is completely hidden. The 2T 1C dual port cell is particularly useful for network applications because its random access cycle time is faster than that of a 1T 1C DRAM cell. A dual-port function is also an important feature for cache applications. Details of the 2-port memory cell and structure are disclosed in the article by Y. Agata et al., An 8-ns Random Cycle Embedded RAM Macro with Dual-port Interleaved DRAM Architecture, published in the IEEE Journal of Solid-State Circuits, vol. 35, No. 11, pp. 1668-1672, November 2000.
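The halving of the cycle time through port interleaving can be sketched with a simple scheduling model. The model assumes that each port needs a fixed cycle time between its own accesses, that accesses rotate across the ports, and that the ports are staggered evenly. The function name and the 16 ns per-port cycle are illustrative assumptions, and conflicts between ports that address the same cell are ignored.

```python
def issue_times(n_accesses, port_cycle_ns, n_ports):
    """Start time of each access when accesses rotate across evenly staggered ports."""
    # Each port becomes available at a different offset within one port cycle.
    port_free_at = [k * port_cycle_ns / n_ports for k in range(n_ports)]
    starts = []
    for i in range(n_accesses):
        port = i % n_ports                            # alternate between the ports
        start = port_free_at[port]
        starts.append(start)
        port_free_at[port] = start + port_cycle_ns    # port is busy for one full cycle
    return starts


single_port = issue_times(4, 16.0, 1)  # one port: a new access every 16 ns
dual_port = issue_times(4, 16.0, 2)    # two interleaved ports: a new access every 8 ns
print(single_port)
print(dual_port)
```

In this model each individual port still runs at its full cycle time; the effective cycle time seen by the system is halved only because consecutive accesses alternate between the two ports.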
System level integration is known to be an important requirement for constructing a system-on-chip (SOC) with embedded DRAMs. In a true system-on-chip design, the graphic memory, network memory and cache memory need to be integrated on the same chip, which requires a process technology that is compatible with the various kinds of memories. It is not known in the art how to enable process compatibility when integrating a 1T 1C cell and a 2T 1C cell on a single chip. The existing 2-port memory cell successfully improves the random access performance. However, the 2-port memory cell creates an incompatibility with existing 1T 1C memory cells, which limits the use of the 2-port memory cell to systems that do not also use a 1T 1C memory cell.