The present invention is generally directed to non-volatile memory (NVM) NAND architecture design. In particular, this invention provides several novel VSL-based NVM NAND concurrent design methods that aim to greatly improve the read and write speed, power consumption and data reliability of an extremely high-density NAND without changing the existing NAND cell and process technology.
Electrically erasable and programmable NAND, NOR, EEPROM and the like are among the most popular NVMs. In particular, NAND is used in high volume in cellular phones, digital cameras, personal digital assistants, mobile computing devices, tablets, SSDs, desktop computers and other emerging wearable devices.
Typically, the mainstream 2D nLC NAND flash memories utilize a 2-poly NMOS memory cell with a floating gate that is provided above and insulated from a channel region in a triple-P-well within a deep-N-well on top of a common P-substrate across the cell array region. The floating gate is made of a poly-silicon material (so-called poly1) and is positioned above and between the N-active source and drain regions. A control gate is made of another poly-silicon material (poly2) and is provided over and insulated from the poly1 floating gate. The threshold voltage (Vtn) of each nLC flash cell is controlled by the amount of charge retained on the poly1 floating gate layer. In other words, an nLC cell's Vtn is the minimum control gate voltage, e.g., the VWL (Vg) voltage, that must be biased with respect to its source node voltage, Vs, to turn on the cell and allow a current IDS to flow between its drain (at voltage Vd) and source (at voltage Vs), so that the Vtn check condition Vgs=Vg−Vs>Vtn, or Vgs−Vtn>0, is met. Conversely, when Vgs−Vtn<0, the selected flash cell does not conduct current. In that case the cell is verified as not being in the current Vtn state; it should be in a Vtn+1 state, e.g., one of the higher charge states with a larger Vtn.
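The Vtn check condition described above can be sketched as a short Python function. This is only an illustrative sketch: the function name `cell_conducts` and the example voltages are hypothetical and are not taken from this specification.

```python
def cell_conducts(vg: float, vs: float, vtn: float) -> bool:
    """Return True when Vgs - Vtn > 0, i.e. the cell turns on and conducts."""
    vgs = vg - vs  # gate voltage measured with respect to the source node
    return vgs - vtn > 0


# Hypothetical voltages in volts, chosen only for illustration.
print(cell_conducts(vg=2.0, vs=0.0, vtn=1.5))  # Vgs - Vtn = +0.5: conducts
print(cell_conducts(vg=2.0, vs=0.0, vtn=2.5))  # Vgs - Vtn = -0.5: does not conduct
print(cell_conducts(vg=2.0, vs=1.0, vtn=1.5))  # a higher Vs lowers Vgs: does not conduct
```

The third call shows that, for a fixed gate voltage, the outcome of the check also depends on the source node voltage Vs, since only the difference Vg−Vs enters the condition.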
Throughout this specification, an nLC NAND flash cell used to store two ranges of charges is referred to as a 1-bit, 2-state (Vtn, where n=1) SLC cell; a cell used to store four ranges of charges is referred to as a 2-bit, 4-state (Vtn, where n=2) MLC cell; a cell used to store eight ranges is referred to as a 3-bit, 8-state (Vtn, where n=3) TLC cell; and a cell used to store sixteen ranges of charges is referred to as a 4-bit, 16-state (Vtn, where n=4) XLC cell. When the floating gate of a NAND cell is used to store more than 16 ranges of charges, such as 256 states (Vtn, where n=8), the cell is referred to as an 8-bit analog cell.
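The naming convention above follows from the fact that an n-bit cell needs 2 to the power n charge ranges. A minimal Python sketch of that relation is given here; the names `CELL_TYPES` and `num_states` are hypothetical helpers introduced only for illustration.

```python
# Bits per cell mapped to the cell names used in this specification.
CELL_TYPES = {1: "SLC", 2: "MLC", 3: "TLC", 4: "XLC", 8: "8-bit analog"}


def num_states(bits_per_cell: int) -> int:
    """Number of charge ranges (Vt states) needed to store n bits in one cell."""
    return 2 ** bits_per_cell


for n, name in CELL_TYPES.items():
    print(f"{name}: {n}-bit, {num_states(n)}-state")
```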
As a result, in a NAND nLC read or any verification operation, each Vtn state of each accessed nLC (MLC or TLC) cell can be fully distinguished by determining at which Vgs on WLn the cell conducts current, provided there is no Yupin coupling interference between adjacent wordlines (WLs) and bit lines (BLs). Note that verification includes program-verify and erase-verify operations.
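One way to picture this state determination is a sweep over increasing read voltages that stops at the first voltage at which the cell conducts. The Python sketch below assumes such a sweep; the function `read_state`, the read levels and the cell Vt values are all hypothetical and serve only to illustrate the idea, not to describe an actual sensing circuit.

```python
from typing import Sequence


def read_state(cell_vt: float, read_levels: Sequence[float], vs: float = 0.0) -> int:
    """Return the index of the first read level at which the cell conducts.

    A cell that conducts at no level is assigned the highest state.
    """
    for state, vg in enumerate(sorted(read_levels)):
        if (vg - vs) - cell_vt > 0:  # Vgs - Vtn > 0: the cell conducts
            return state
    return len(read_levels)


# Hypothetical MLC example: three read levels separate four Vt states.
MLC_READ_LEVELS = [0.0, 1.0, 2.0]

for vt in (-1.0, 0.5, 1.5, 2.5):
    print(vt, "-> state", read_state(vt, MLC_READ_LEVELS))

# A hypothetical coupling-induced Vt shift of 0.2 V moves a cell at 0.9 V
# across the 1.0 V read level, so it is read as the next higher state.
print(read_state(0.9, MLC_READ_LEVELS), read_state(0.9 + 0.2, MLC_READ_LEVELS))
```

The last line illustrates why the determination is only unambiguous when no coupling interference shifts the apparent Vt of the accessed cell.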
Unfortunately, a typical NAND array is usually formed in a very compact memory matrix to minimize die size. In All-bitline (ABL) or non-ABL NAND read and verification operations, a plurality of NAND cells on one physical WLn, one cell per string, are selected simultaneously. This means only one cell is read out from each long and compact NAND string, which comprises a plurality of NAND cells laid out with a very tight 1-lambda (1λ) BL width and 1λ spacing in the X-direction and a very tight 1λ WL width and 1λ spacing in the Y-direction.
When NAND technology scales below 30 nm, or even down to the 10 nm-class range, the floating-gate Vt coupling interference between adjacent BLs and WLs becomes very severe. These are the well-known Yupin BL-BL and WL-WL cell coupling effects. The Yupin coupling effect causes an nLC data reliability issue of unintentional erroneous reading, which is undesired but in fact unavoidable.
For example, at the typical NAND technology node of 30 nm, the total Yupin coupling effect between two adjacent WLs and two adjacent BLs is less than 30% on average. At the 20 nm node, the total Yupin coupling effect increases to about 35% on average. By extrapolation, the total Yupin coupling effect will increase further to more than 40% on average if isolation techniques are not improved.
A typical NAND cell that suffers from the Yupin coupling effect is referred to as a “Victim cell”, i.e., the BLn cell in WLn, while the cells that generate the Yupin coupling effects are referred to as “Aggressor cells”, i.e., the two BLn−1 and BLn+1 cells in WLn and the three BLn−1, BLn and BLn+1 cells in each of WLn−1 and WLn+1. Usually, one Victim cell is surrounded by eight Aggressor cells in a 2D NAND array but by twenty-six Aggressor cells in a 3D NAND array.
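The counts of eight and twenty-six Aggressor cells are what one obtains by taking every immediately adjacent cell, diagonals included, around the Victim cell. The Python sketch below assumes that reading; the function `aggressor_offsets` is a hypothetical helper, and the offsets are simply index steps of −1, 0 or +1 along each array axis.

```python
from itertools import product


def aggressor_offsets(dims: int) -> list:
    """All neighbour offsets around a victim cell in a dims-dimensional array.

    Each axis offset is -1, 0 or +1; the all-zero offset is the victim
    itself and is excluded.
    """
    return [off for off in product((-1, 0, 1), repeat=dims) if any(off)]


print(len(aggressor_offsets(2)))  # 2D array: (BL, WL) offsets
print(len(aggressor_offsets(3)))  # 3D array: one additional axis
```

For the 2D case the offsets cover the two neighbours on the same WL and the three neighbours on each of the two adjacent WLs, which matches the grouping given in the text.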
Ultimately, in 2D NAND, each nLC Victim cell is surrounded by eight Aggressor cells, each with 2^n possible Vtn values. In other words, the total number of combinations of Yupin coupling effect is 8×2^n. However, if the Yupin coupling effects of the four diagonal Aggressor cells are not significant, and the coupling effect of the WLn−1 cell is taken care of during WLn program because WLn−1 is programmed before WLn, then the number of combinations of significant Yupin coupling effect can be reduced to 3×2^n, arising from three Aggressor cells: the two cells of BLn−1 and BLn+1 in WLn and the one cell of BLn in WLn+1.
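For concrete cell types, the counts above can be evaluated with a short Python sketch. It follows the counting used in the text, which multiplies the number of Aggressor cells by the 2^n Vtn values each of them can hold; the function name `coupling_combinations` is hypothetical.

```python
def coupling_combinations(num_aggressors: int, bits_per_cell: int) -> int:
    """Aggressor count times the 2^n Vt values each aggressor can hold."""
    return num_aggressors * 2 ** bits_per_cell


for name, n in (("MLC", 2), ("TLC", 3)):
    full = coupling_combinations(8, n)     # all eight 2D Aggressor cells
    reduced = coupling_combinations(3, n)  # only the three significant ones
    print(f"{name}: 8 x 2^{n} = {full}, 3 x 2^{n} = {reduced}")
```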
In summary, for both NAND read and verify operations, compensation of a cell's Vtn is required to offset the Yupin coupling effect and fix error-correcting code (ECC) errors.
Although plenty of Vtn compensation techniques have been disclosed in the prior art in past years, all of them are more like “Collective Vt-compensation” or “Pseudo Individual Vt-compensation (PIC)” solutions that rely on a cell's VWL-based or VBL-based Vt-offset scheme. None of them is truly based on “Real Individual Vt-compensation (RIC)”, which the present invention refers to as the VSL-based Vt-offset compensation scheme.
For example, in a conventional mainstream NAND memory block circuit of 2D array architecture, each NAND block typically is made of a plurality of NAND strings whose individual drain nodes are connected to a plurality of bit lines (BLs), which can be divided into an Even BL group (BLe) and an Odd BL group (BLo), and whose source nodes are connected to one common source line (CSL). The gates of a plurality of NAND cells (plus some dummy cells) in each string are respectively connected to different WLs. Each NAND string includes one top big select NMOS transistor gated to a DSL line and one bottom big select NMOS transistor gated to an SSL line. Additionally, dummy cells and regular NAND cells are formed in series with these two select transistors. The dummy cells are formed at both ends of each string, near the top and bottom big select transistors, to avoid the gate-induced-drain-leakage (GIDL) effect that results in higher Vt of the regular cells of the top and bottom WLs.
In such a NAND block structure, all BLe and BLo are laid out with tight 1λ width and 1λ spacing as metal lines at the M1 level, running in parallel in the Y-direction and perpendicular to all CSLs, which are laid out as different metal lines at the M0 level (M0 being lower than M1) in the X-direction. There is no individual SL formed for each individual BL of each NAND string.
One method of programming and reading nLC cells in this conventional NAND array is referred to as ABL program and read, in which all nLC NAND cells in all strings of each selected physical WLn are programmed and read at the same time, an advantage that comes at the expense of a 2-fold PB size. One bit of PB is connected to one corresponding bit of nLC cell formed in each physical WLn.
Another method of program and read based on the above conventional NAND array is Odd/Even-BL or SBL (Shielded BL) read and program-verify. In this method, only one half of the nLC cells at each physical WLn, i.e., the interleaved cells of either the Odd-BL group or the Even-BL group, are selectively programmed and read at the same time, with the benefit of a PB just one half the size of that of the ABL method mentioned earlier. One bit of PB is connected to two bits of nLC cells of two BLs through one Odd/Even column decoder. However, this is not a perfect BL-shielding method because the BL-BL coupling effect still occurs, and it incurs penalties of 2-fold latency of read and program-verify operations, 2-fold Vpass and Vread WL gate disturbance that degrades the P/E endurance cycle data reliability of NAND products, and 2-fold power consumption of read, program and verify, due to the two half-page-size access operations. On the other hand, although the ABL method has superior nLC performance and reliability over the Odd/Even-BL approach, it has a penalty of 2× area size in PB.
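The trade-off between the two methods can be summarized in a small Python sketch that counts only PB bits and accesses per physical WLn. It is a simplified model under stated assumptions: the class `PageAccess`, the functions `abl` and `odd_even`, and the bit-line count used in the example are hypothetical, and the sketch does not model coupling, disturbance or actual timing.

```python
from dataclasses import dataclass


@dataclass
class PageAccess:
    pb_bits: int   # PB bits needed for one physical WLn
    accesses: int  # accesses needed to cover every cell on that WLn


def abl(num_bls: int) -> PageAccess:
    """ABL: one PB bit per BL, all cells on the WLn accessed at once."""
    return PageAccess(pb_bits=num_bls, accesses=1)


def odd_even(num_bls: int) -> PageAccess:
    """Odd/Even-BL: one PB bit shared by two BLs, two half-page accesses."""
    return PageAccess(pb_bits=num_bls // 2, accesses=2)


NUM_BLS = 16  # hypothetical, deliberately tiny bit-line count
print("ABL:     ", abl(NUM_BLS))
print("Odd/Even:", odd_even(NUM_BLS))
```

In this model the 2-fold latency, disturbance and power penalties of the Odd/Even-BL method all stem from the doubled access count, while the 2× PB area penalty of ABL stems from the doubled PB bit count.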
In another example, U.S. Pat. No. 5,734,609 disclosed a non-mainstream paired 2D NAND string in which the BL node of each Even/Odd string is connected in a zigzag way to the corresponding SL node of the next adjacent Odd/Even string. Two different metal lines are used for two adjacent BLs running in parallel in the Y-direction, and they are fully symmetrical in terms of layout and electrical operation. There is no common horizontal SL metal line running in the X-direction in each NAND block. Each NAND string is formed to have its individual BL and uses the physically adjacent BL as its individual SL. However, this is still not a perfect SBL scheme that guarantees BL-coupling-free operation. Each NAND string is larger than the mainstream NAND string of the previous example because one extra big 1-poly Depletion-type select transistor is added to the left string and another big Depletion-type NMOS select transistor is added to the right string. These paired Depletion-type NMOS transistors form a pair of Odd and Even select transistors, which are laid out with the same large channel length and size as the regular Enhancement-type select transistor.
In yet another example, U.S. Pat. No. 8,695,943 disclosed a non-mainstream NAND scheme in which BL and SL lines are also laid out in parallel in the Y-direction but are not connected in a zigzag way between the drain and source nodes of two physically adjacent strings, and no horizontal SLs are required. Again, each NAND string is larger than one made with the mainstream NAND-string scheme because one extra big 2-poly floating-gate device is added to each even string and a similar big 2-poly floating-gate device is added to each odd string. Each of these added 2-poly floating-gate devices is laid out with the same big channel length as a 1-poly Enhancement-type select transistor. The read and verify operations of this NAND string are much the same as in the previous example, but with the disadvantage of requiring additional erase, program and verification operations on these large select transistors. The interleaved BL and SL lines are both formed with only one metal layer. As a result, BL-BL coupling cannot be avoided, and the quality and yield of the preferred ABL nLC program would be highly jeopardized.
In still another example, U.S. Pat. No. 7,499,329 disclosed another non-mainstream NAND array in which both BL and SL are also laid out in parallel in the Y-direction and connected in a zigzag way between the drain and source nodes of two physically adjacent paired strings, and each BL line is shared by one pair of Odd and Even strings through proper logic selection of the SELECT lines. The BL and SL lines are interleaved and formed with only one tight-pitch metal layer. Again, the disadvantage of this array is that two extra large 1-poly Enhancement-type select transistors have to be added to each pair of strings. As a result, there is no perfect SBL effect, BL-BL coupling cannot be avoided, and the quality and yield of the preferred ABL nLC program would be highly jeopardized.
In summary, there is a strong need to improve the NAND array architecture without using extra large string-select transistors of any sort, and to provide a plurality of separate BL and SL lines in parallel, without any common SL in the selected NAND block, by using each adjacent BL as an individual SL biased with an individual VSL, so that the preferred VSL-based Vt-compensation can be implemented. Further, it is desired that a Fine program and an alternating-WL program be applied together with the VSL-offset mixed scheme to produce final narrow-Vt program states for more reliable read and verification. As a result, the improvement should allow batch-based multiple All-BL (ABL) and All-Vtn-Program (AnP) program, read, and verify operations to be performed in the same NAND plane, dramatically reducing latency, power consumption, and the number of row-decoders and PBs needed, so that fewer erroneous readings occur without the need for sophisticated ECC schemes and algorithms, giving lower Read latency and power consumption.