NAND memory is well known as one of the most popular and cheapest NVM memories in the art, providing the desired in-system or in-circuit repeatedly electrically programmable and erasable functions with only a single low-voltage Vdd supply. As of 2013, NAND memory is available in 2D and 3D types, both of which achieve the same highest density of up to 128 Gb MLC per die. Micron's 128 Gb MLC 2D NAND flash is produced using the most advanced 16 nm (1y-node) planar NAND cell to solve the scalability challenge below 2×nm, while Samsung's 128 Gb MLC 3D NAND flash uses a less-advanced 3×nm node to achieve the same high density, but faces a different kind of technology challenge, such as the aspect ratio involved in making stacked 24-layer cells. From the perspective of the overall NAND industry, the consensus is that 2D NAND flash can be further scaled down to 10 nm (1z-node) to achieve a NAND density of nearly 1 Tb MLC, while 3D NAND flash can be scaled down to at most around the 2×nm node to achieve a density above 1 Tb MLC at a lower die cost, without the costly 10 nm lithography fab facilities required for 2D NAND flash design.
Currently, when NAND is compared to other NVM memories such as OTP, ROM, EEPROM and NOR, 2D NAND flash memory has achieved the highest capacity of 128 Gb and the best scalability, with the smallest feature size of the 16 nm node in 2013. The mainstream standalone 2D NAND in mass production is mainly based on a 2-poly floating-gate NMOS cell device, while 3D NAND flash uses charge-trapping MONOS-type 3D NAND cells. One common feature of 2D and 3D NAND is that both use the same FN-channel tunneling Program scheme to increase the NAND cell's Vt.
So far, the product design specs of both 2D and 3D NAND remain largely the same: Block Erase and Block Erase-Verify, Page Read, and Page Program-Verify. In addition, both 2D and 3D NAND apply a compatible 20V VPP to the NAND cell's gate with respect to the cell's Vss channel for the same FN-channel Program operation. In terms of the preferred Erase operation, some new 3D NAND flash memories use the GIDL effect to generate a hot-hole current to perform the Block Erase operation, rather than the conventional FN-channel tunneling scheme used in 2D NAND flash design. However, some 3D NAND technologies still employ the FN-channel tunneling scheme to perform the same Block Erase operation.
Both 2D and 3D NAND memory arrays comprise a plurality of NAND Blocks connected by a plurality of long global metal1 GBL lines and are organized in a matrix with a plurality of rows and columns. Each Block further comprises a plurality of NAND strings. Each string further comprises a plurality of NMOS NAND cells connected in series and sandwiched between two NMOS 1-poly string-select transistors, one MS located at the string top and one MG at the string bottom. The number of 2D NAND flash cells in one string can be 8, 16, 32, 64, 128 or an arbitrary integer, depending on the NAND specs. As of 2013, however, the maximum number of 3D NAND cells in one 3D NAND string is 24, as published for Samsung's 128 Gb MLC 3D NAND product at the 2013 Flash Summit held in San Jose, Calif., USA.
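As an illustration of the array hierarchy described above, the following minimal Python sketch models a Block as a list of strings, each string being a series chain of cells between the MS and MG select transistors. The class and function names (NandString, NandBlock, make_block) are hypothetical and chosen only for this sketch; they are not taken from any NAND specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NandString:
    """One NAND string: series cells between two string-select transistors."""
    num_cells: int  # e.g. 8, 16, 32, 64 or 128 for 2D; 24 for the 2013 3D part

    def devices(self) -> List[str]:
        # Top select transistor (MS), the series cells, bottom select (MG).
        return ["MS"] + [f"cell{i}" for i in range(self.num_cells)] + ["MG"]


@dataclass
class NandBlock:
    """One Block: a plurality of NAND strings (hypothetical container)."""
    strings: List[NandString] = field(default_factory=list)


def make_block(num_strings: int, cells_per_string: int) -> NandBlock:
    return NandBlock([NandString(cells_per_string) for _ in range(num_strings)])


block = make_block(num_strings=4, cells_per_string=64)
chain = block.strings[0].devices()
print(len(block.strings), len(chain))  # 4 strings, 66 devices per string
```

A 64-cell string thus contains 66 series devices in this model, the two extra devices being the MS and MG select transistors.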
In summary, all the conventional NAND designs provide only one or multiple Block-based Erase and Block-Verify operations, together with one-WL-based ABL or HBL Page Program, Page Program-Verify and Page Read operations. None of the existing 2D or 3D NAND designs can provide one or multiple dispersed sub-Block Erase and Erase-Verify operations, or Program, Program-Verify, Erase-Verify and Read operations on two or more dispersed WLs and Blocks. Typically, in both 2D and 3D NAND designs, each NAND cell has several storage-type options that include SLC (1 bit per physical cell), MLC (2 bits per physical cell), TLC (3 bits per physical cell), XLC (4 bits per physical cell), and even analog storage that stores more than 4 bits per physical NAND cell. Taking the conventional 2-poly floating-gate 2D NAND array as an example, the array comprises a plurality of vertical NAND Blocks, each further comprising a plurality of 2N 64T vertical NAND strings with a common source line CSL. When a Program operation is performed in NAND memory, the spec indicates that it is performed in units of one physical page, one physical WL or a partial page, referred to as ABL Program, or of one-half partial WL, referred to as HBL Program. In other words, one physical WL or page can be divided into 2, 4, 8 or more logic partial WLs, partial pages or sectors in the same NAND plane.
However, as the NAND market demands higher memory density at lower cost, further scaling of the technology node below 20 nm with the higher data compression of TLC is required. As a result, more severe AC BL-BL capacitance coupling and DC cell Vt coupling effects dramatically increase the Program latency from 250 μs for SLC Program to almost 3 ms for TLC Program, and the Read latency from 20 μs for SLC Read and Verify to about 1 ms for TLC Read and Verify operations. Thus, reducing the page Program, Program-Verify and Read latencies in both 2D and 3D NAND designs has become urgently required. Unlike the prior-art program scheme, in order to reduce the lengthy Program and Read latencies more effectively and flexibly, this invention proposes a hierarchical NAND array that allows Program and Read operations to be performed on more than one physical WL or page simultaneously, with Read and Program-Verify operations based on smaller Sub-segments, larger Segments or even the largest HGs in one NAND plane. In conventional designs, concurrent Program of multiple physical WLs or pages in one NAND plane is still prohibited. Similarly, concurrent multiple-WL Read and Program-Verify operations are not allowed in either 2D or 3D NAND designs as of 2013, since the debut of NAND in 1988.
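To put the latency figures quoted above in proportion, the short Python sketch below computes the slowdown of TLC relative to SLC. The constants are the values stated in the text ("almost 3 ms" and "about 1 ms" are taken as 3000 μs and 1000 μs for this arithmetic); the function name slowdown is hypothetical.

```python
# Latency figures quoted in the text, in microseconds.
SLC_PROGRAM_US = 250
TLC_PROGRAM_US = 3000  # "almost 3 ms", rounded for this illustration
SLC_READ_US = 20
TLC_READ_US = 1000     # "about 1 ms", rounded for this illustration


def slowdown(slow_us: int, fast_us: int) -> float:
    """Ratio of the slower latency to the faster one."""
    return slow_us / fast_us


program_slowdown = slowdown(TLC_PROGRAM_US, SLC_PROGRAM_US)
read_slowdown = slowdown(TLC_READ_US, SLC_READ_US)
print(program_slowdown, read_slowdown)  # 12.0 50.0
```

Under these rounded figures, TLC Program is roughly 12 times slower than SLC Program, and TLC Read and Verify roughly 50 times slower than SLC Read and Verify.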
Details of performing Program operations in the conventional NAND are now illustrated. First, in terms of Program unit size, the so-called All-BL (ABL) one-WL Program means that the program size is one full physical page or WL and the Program operation is performed in 1 cycle. Another scheme, named Odd/Even page Program, is performed in units of two logic pages (i.e., two halves of a whole physical page). Programming the whole physical page requires 2 cycles of such Odd and Even half-page (HBL) Program operations. Additionally, the Partial WL Program operation can be performed on Odd and Even logic pages in each divided partial WL. Each partial WL Program is performed in 1 cycle, so four partial WLs or 4 sectors require a total of 4 program cycles. Currently, the 1-cycle ABL and 2-cycle HBL program operations per physical WL are the mainstream program schemes, and each has its own pros and cons. Program operations of 4 or more cycles are subject to more restrictions due to reliability concerns about the programmed page data.
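The cycle counts described above can be summarized in a small Python sketch. The scheme labels "ABL", "HBL" and "PARTIAL" and the function name program_cycles are hypothetical names used only for this illustration.

```python
def program_cycles(scheme: str, partial_wls: int = 1) -> int:
    """Program cycles needed to cover one physical WL under each scheme."""
    if scheme == "ABL":
        return 1  # whole physical page programmed in one cycle
    if scheme == "HBL":
        return 2  # Odd half-page cycle plus Even half-page cycle
    if scheme == "PARTIAL":
        if partial_wls < 1:
            raise ValueError("partial_wls must be at least 1")
        return partial_wls  # one cycle per partial WL or sector
    raise ValueError(f"unknown scheme: {scheme}")


print(program_cycles("ABL"), program_cycles("HBL"), program_cycles("PARTIAL", 4))
```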
The iterative page Program bias conditions include setting a selected WL to Vpgm. All gates of the NAND cells in the one selected page of one selected Block are commonly coupled to Vpgm=15V to 25V using the Incremental-Step-Pulse-Programming (ISPP) scheme, with ΔVpgm=0.15V to 0.5V for the different SLC, MLC, TLC and XLC programs. This iterative Vpgm voltage is always presented to the one selected WL during each ISPP program and is globally coupled directly from the output of one selected XT of one selected Block-decoder, without being latched locally in the poly2 parasitic capacitor of the selected WL in each plane, regardless of whole or partial WL or Page Program operations in the prior art.
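The ISPP voltage staircase described above can be sketched as follows. This is a minimal illustration that simply enumerates the pulse amplitudes between the quoted Vpgm limits for a given ΔVpgm; the function name ispp_pulses is hypothetical, voltages are held in integer millivolts to avoid floating-point drift, and an actual device would normally stop as soon as Program-Verify passes rather than sweep the whole range.

```python
from typing import List


def ispp_pulses(v_start_mv: int = 15000,
                v_max_mv: int = 25000,
                step_mv: int = 500) -> List[int]:
    """Return the Vpgm amplitude (in mV) of each successive ISPP pulse."""
    if step_mv <= 0:
        raise ValueError("step_mv must be positive")
    if v_start_mv > v_max_mv:
        raise ValueError("v_start_mv must not exceed v_max_mv")
    return list(range(v_start_mv, v_max_mv + 1, step_mv))


coarse = ispp_pulses(step_mv=500)  # 0.5 V steps
fine = ispp_pulses(step_mv=150)    # 0.15 V steps
print(len(coarse), coarse[-1])     # 21 25000
print(len(fine), fine[-1])         # 67 24900
```

A smaller ΔVpgm gives finer Vt placement at the cost of more pulses, which is consistent with the longer program times of the higher bit-per-cell storage types.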
One option of the page Program bias conditions also includes setting the 63 non-selected WLs (assuming each Block contains 64 pages) to Vpass, with GSL set to Vss and SSL set to Vdd or Vinh. The voltages of the 63 unselected pages or WLs, the SSL line and the GSL line of a selected NAND Block are respectively coupled from the outputs of 63 unselected global XTs, one global SSLp and one global GSLp, without being latched locally in the poly2 parasitic capacitors of the 63 unselected WLs, SSL and GSL lines, regardless of whole or partial WL or Page Program operations in the prior art.
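The bias conditions above can be tabulated per line of a selected Block, as in the following Python sketch. The function name program_bias and the default voltage values are illustrative assumptions for one program pulse, not figures mandated by any specification; the SSL line is set to Vdd here, whereas the text also allows Vinh.

```python
from typing import Dict


def program_bias(selected_wl: int,
                 num_wls: int = 64,
                 vpgm: float = 20.0,   # illustrative Vpgm for one pulse
                 vpass: float = 10.0,  # illustrative Vpass
                 vdd: float = 1.8,
                 vss: float = 0.0) -> Dict[str, float]:
    """Gate bias of every WL, SSL and GSL in the selected Block."""
    if not 0 <= selected_wl < num_wls:
        raise ValueError("selected_wl out of range")
    bias = {f"WL{i}": (vpgm if i == selected_wl else vpass)
            for i in range(num_wls)}
    bias["SSL"] = vdd  # top string-select gate on
    bias["GSL"] = vss  # bottom string-select gate off
    return bias


bias = program_bias(selected_wl=31)
unselected = [name for name, v in bias.items()
              if name.startswith("WL") and v == 10.0]
print(bias["WL31"], len(unselected), bias["SSL"], bias["GSL"])
```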
This HV Vpass voltage is always presented on the 63 non-selected XT bus lines and passed to the 63 non-selected WLs through the selected Block-decoder during each ISPP SLC program time of about 250 μs. Over the whole 250 μs SLC Program, the Vpass voltage is repeatedly coupled to the parasitic poly capacitors of the 63 non-selected WLs during each program pulse, but is switched to a VREAD of about 6V during the Program-Verify operation that immediately follows each Program operation.
Another option of the page Program bias conditions includes setting the channel voltage of a selected NAND cell in a selected page to 0V. This 0V is coupled from the corresponding digital bit data=0 in the PB (Page Buffer). For a 1-metal BL NAND structure, the 0V is coupled from the latch circuit of the top PB through the selected metal1 GBL lines, and then to the channels of the NAND cells through the selected NAND strings by coupling SSL to Vdd. The advantage of Program BL=0V is that no GBL precharge current is required for the selected programmed cells.
Furthermore, in the prior art, the page Program bias conditions include setting the channel voltages of unselected NAND cells in a selected page to a Vinhibit voltage (program-inhibit, Vinhibit≧7V). The initial Vinhibit is Vdd-Vt, but it is finally boosted to a targeted value greater than 7V for automatic program-inhibit by ramping up the selected WL gate voltage from an initial Vpass voltage to a Vpgm voltage. The resulting high gate coupling effect boosts the initial floating channel voltage of Vdd-Vt, about 1V at a Vdd of 1.8V, to above 7V for the unselected NAND cells in the same selected page, WL or partial WL. This is referred to as the Self-Boosting (SB) effect. During the lengthy page Program operation for a whole conventional NAND array, the Vinhibit and Vss program voltages are always presented to the channels of the selected NAND cells, and Vdd and Vss are presented to the GBL lines between the selected Block and the Static PB on either the array top or bottom of the selected page. Thus, the GBL lines are fully occupied by the first NAND Program operation for a long time, during which a second Read or Program operation in the same NAND plane cannot be performed, because data contention would occur if the second operation took up the same selected GBL lines. This is why simultaneous Read and Program operations on more than one WL are inhibited in the same NAND plane of the conventional NAND designs.
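The Self-Boosting effect described above can be approximated with a first-order capacitive-coupling estimate, sketched below in Python. This model is an assumption introduced here for illustration only: it treats the floating channel as rising by a fixed coupling ratio times the average WL gate voltage of the string, and it ignores leakage and junction effects. The coupling ratio of 0.8, the Vpass of 10V, the Vpgm of 20V and the function name boosted_channel_voltage are all illustrative assumptions rather than values given for this mechanism in the text.

```python
from typing import Sequence


def boosted_channel_voltage(v_init: float,
                            wl_voltages: Sequence[float],
                            coupling_ratio: float) -> float:
    """First-order estimate: V_ch = V_init + ratio * mean(WL gate voltages)."""
    if not wl_voltages:
        raise ValueError("wl_voltages must not be empty")
    if not 0.0 <= coupling_ratio <= 1.0:
        raise ValueError("coupling_ratio must be within [0, 1]")
    mean_wl = sum(wl_voltages) / len(wl_voltages)
    return v_init + coupling_ratio * mean_wl


# 63 unselected WLs at an assumed Vpass of 10 V, one selected WL at 20 V.
wl = [10.0] * 63 + [20.0]
v_ch = boosted_channel_voltage(v_init=1.0, wl_voltages=wl, coupling_ratio=0.8)
print(v_ch > 7.0)  # True under these assumed values
```

Under these assumed values the floating channel starts at about 1V (Vdd-Vt) and ends above the 7V program-inhibit threshold quoted in the text.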
In a NAND Program scheme, the low-current FN channel tunneling effect is used to increase the NAND cell's Vt from the E state (erased state) to one of 3 program states, namely the A, B or C state, for MLC storage, one of 7 program states for TLC storage, and one of 15 program states for XLC storage. The prior-art schemes for generating the boosted Program-inhibit voltage in NAND include SB, LSB and EASB, among others.
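The state counts quoted above follow from the number of bits stored per cell: n bits require 2^n Vt states, one of which is the erased E state. A minimal Python sketch, with hypothetical names STORAGE_BITS, total_states and program_states:

```python
# Bits per physical cell for each storage type named in the text.
STORAGE_BITS = {"SLC": 1, "MLC": 2, "TLC": 3, "XLC": 4}


def total_states(bits: int) -> int:
    """Number of distinct Vt states needed to store `bits` bits."""
    return 2 ** bits


def program_states(bits: int) -> int:
    """Programmed states, i.e. all states except the erased E state."""
    return total_states(bits) - 1


counts = {name: program_states(bits) for name, bits in STORAGE_BITS.items()}
print(counts)  # {'SLC': 1, 'MLC': 3, 'TLC': 7, 'XLC': 15}
```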
In a typical SLC NAND program operation, a high step-rising program voltage, Vpgm from 15V to 25V, is applied to one selected WL, while a fixed Vpass voltage of about 10V is applied to the remaining 63 non-selected WLs in the selected strings, with the gate of the bottom string-select transistor connected to Vss and the gate of the top string-select transistor connected to Vdd. As a result, the 63 NAND cells in the same string are in the conduction state while the string's bit line is grounded for the Program operation. Electrons from the channels of the selected NAND cells are injected into the corresponding floating-gate layers, Poly1, and the NAND cell threshold voltages, Vts, are raised from an erased Vt0, E-state (with a negative value), to a first programmed state Vt1, A-state (with a positive value). More information about the programming methods can be found in prior art patents such as U.S. Pat. No. 6,859,397, titled “Source Side Boosting Technique for Non-volatile Memory;” U.S. Pat. No. 6,917,542, titled “Detecting Over Programmed Memory;” and U.S. Pat. No. 6,888,758, titled “Programming Non-Volatile Memory.”
In many cases, the Vpgm pulse is applied to a selected WLn of the NAND in association with several fixed MHV pass-WL voltages, such as Vpass1, Vpass2 and others, applied to the non-selected WLn−1 and WLn+1 and the rest of the WLs in the selected NAND strings of the selected Blocks. A series of step-rising Vpgm pulses (referred to as the programming gate pulses) is applied to the selected WL, denoted WLn, one pulse per program iteration. Between successive rising Vpgm pulses, single or multiple program-verify pulses, similar to a Read operation, are applied to determine whether the selected NAND cell(s) in each selected page or WL have been programmed to the desired programmed Vtn values. The programmed Vtn values are determined by the type of storage, such as SLC (1 bit per cell), MLC (2 bits per cell), TLC (3 bits per cell), XLC (4 bits per cell) or analog storage (more than 4 bits per cell).
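The alternation of program pulses and program-verify steps described above can be sketched as a simple loop. The cell behavior here is a toy assumption made for illustration: each pulse is assumed to raise the cell Vt by exactly one step, which real cells only approximate. The names verify and program_with_verify and all numeric values are hypothetical; voltages are in integer millivolts.

```python
from typing import Tuple


def verify(vt_mv: int, target_mv: int) -> bool:
    """Program-Verify: passes once the cell Vt has reached the target."""
    return vt_mv >= target_mv


def program_with_verify(vt_mv: int,
                        target_mv: int,
                        step_mv: int = 300,
                        max_pulses: int = 40) -> Tuple[int, int]:
    """Apply pulses, verifying before each one, until the target Vt is met."""
    pulses = 0
    while not verify(vt_mv, target_mv):
        if pulses >= max_pulses:
            raise RuntimeError("program failed: pulse budget exhausted")
        vt_mv += step_mv  # toy model: one pulse shifts Vt by one step
        pulses += 1
    return pulses, vt_mv


# Erased cell at an assumed -2.0 V programmed to an assumed 1.0 V target.
pulses, final_vt = program_with_verify(vt_mv=-2000, target_mv=1000)
print(pulses, final_vt)  # 10 1000
```

In this sketch a cell that already passes verify receives no further pulses, which mirrors the per-cell program-inhibit applied once a cell reaches its target state.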
Since the Program-Verify operation is similar to the regular Read operation, the precharge cycle and discharge cycle for the N GBLs, 64 WLs, SSL, GSL and 64 XT lines of each Block are the same as for Read. Therefore, during each Program-Verify cycle, all vertical GBL lines in the cell array and the set of vertical XT, SSLp and GSLp lines in the X-decoder area are fully occupied with the desired voltages and cannot be used for other concurrent operations in the same plane of the conventional NAND designs, regardless of SLC, MLC, TLC, XLC or even analog NAND Program operations.
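The exclusive occupation of the GBL and XT lines described above can be modeled as a single shared resource per plane, as in the following Python sketch. The class name NandPlane and its methods are hypothetical and only illustrate the one-operation-at-a-time constraint of the conventional designs; they do not model the proposed hierarchical array.

```python
from typing import Optional


class NandPlane:
    """Conventional plane: one operation at a time holds the GBL/XT lines."""

    def __init__(self) -> None:
        self.busy_with: Optional[str] = None

    def start(self, op: str) -> bool:
        """Try to start an operation; refuse if the lines are occupied."""
        if self.busy_with is not None:
            return False  # a second operation would cause data contention
        self.busy_with = op
        return True

    def finish(self) -> None:
        """Release the GBL/XT lines when the current operation completes."""
        self.busy_with = None


plane = NandPlane()
first = plane.start("Program")         # accepted
second = plane.start("Read")           # rejected while Program is running
plane.finish()
third = plane.start("Program-Verify")  # accepted after the release
print(first, second, third)            # True False True
```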
In summary, the key operations of the conventional NAND, such as Read, Program, Program-Verify and Erase-Verify, can only be performed one at a time to prevent data contention on the multiple XT bus lines and GBL lines. However, as NAND technology is continuously scaled down below 2×nm and the density increases to 1 Tb, the above high latency, high power consumption and low flexibility for multiple simultaneous operations in the same NAND plane become unacceptable for fast NAND memory system applications.
As a result, there is a strong market need to reduce Read, Verify and Program latency and power consumption, and to provide more flexibility for sub-Segment-based or Segment-based multiple-WL simultaneous Program and Program-Verify/Read operations in the same and different NAND planes, regardless of 2D or 3D NAND flash designs. Accordingly, it is preferred to have a universal 2-level BL hierarchical array structure on which more than one of the preferred concurrent 2D NAND operations of Read, Program, Program-Verify and Erase-Verify, or concurrent 3D NAND operations of Read, Program, Program-Verify, Erase-Verify and Erase, can be performed in the same and different NAND planes, regardless of different Erase schemes in 2D NANDs or even concurrent 2D NANDs.