This invention relates generally to a memory architecture for improving the NAND Read operation. More specifically, various embodiments of the present invention provide a NAND Read scheme that drastically reduces the high WL disturbance, high BL-precharge power consumption, and high latency in an extremely high-density NAND (HiNAND) memory array, regardless of data storage type (SLC, MLC, TLC, XLC, or Analog storage), cell structure (2-poly floating-gate NAND cell or 1-poly charge-trapping MONOS or SONOS cell), cell polarity (NMOS or PMOS), or process technology (2D or 3D NAND).
Nonvolatile memory (NVM) is well known in the art. NVMs that today provide in-system or in-circuit repeatedly electrically programmable and erasable functions include three major standalone types, namely EEPROM, NOR flash, and NAND flash memory, and one embedded flash (eFlash) memory. These four NVMs are based on different technologies.
The EEPROM is suitable for the Byte-alterable Data storage with the highest density below 4 Mb at 0.13 μm node. The NOR flash is suitable for the block-alterable Code storage with the highest density below 8 Gb at 45 nm node. The eFlash is suitable for the page-alterable Code storages with the highest density below 64 Mb at 65 nm node. Lastly, NAND flash is suitable for the sector-alterable Data storage with the highest density below 256 Gb at 19 nm node in MLC storage.
Currently, NAND flash memory has achieved the highest scalability and density and the smallest feature size, reaching the 1× nm node in 2012. The mainstream standalone NAND in mass production is mainly based on a 2-poly floating-gate NMOS device, which employs FN channel-erase and FN channel-program schemes that require about 20V but draw extremely low current. The NAND flash cell array comprises a plurality of NAND Strings organized in a matrix with a plurality of rows and columns. Each String further comprises a plurality of NMOS NAND cells connected in series and sandwiched between two NMOS 1-poly String-select transistors located at the top and bottom of the NAND String. The number of NAND flash cells in one String can be 8, 16, 32, 64, 128, or an arbitrary integer, depending on the NAND density requirement and application.
Each NAND cell has several different types of storages that include SLC (1 bit per cell), MLC (2 bits per cell), TLC (3 bits per cell), XLC (4 bits per cell) and even Analog storage that stores more than 4 bits.
Prior-art NAND has three key operations: Erase, Program, and Read. Erase and Program each require a verification step, which, like Read, reads out the selected cells' Vt after the operation to ensure that the desired data, i.e. the NAND cells' threshold voltage (Vt) states, have been accurately erased or programmed into the selected NAND flash cells at the right locations in accordance with the desired values, storage types, and timing specifications.
In this specification, a 2-poly NAND array comprising a plurality of 32T NAND Strings is used as an example to describe both the conventional NAND Read operation and a HiNAND Read operation based on the present invention, although other String sizes (8, 16, 64, 128, etc.) can be applied. The FN-channel scheme is commonly used for both Erase and Program operations on the NAND cells in a selected WL. In a typical NAND Program operation, a high step-rising program voltage, Vpgm, from 15V to 25V is applied to the one selected WLn, while a Vpass(program) of around 10V is applied to the remaining 31 non-selected WLs in the selected Strings, with the gate of the bottom String-select transistor connected to Vss and the gate of the top String-select transistor connected to Vdd.
As a result, the 31 non-selected NAND cells in the same String are in the conduction state while the String's bit line is grounded. Electrons from the selected NAND cells' channels are injected into the floating-gate layer, Poly1, and the NAND cells' threshold voltage, Vt, is raised from an erased value Vt0 (E-state), which is negative, to a desired positive value Vt1, which is referred to as a programmed state (A-state).
Similarly, when programming a 1-poly NAND flash device using the same FN-channel scheme, a similar step-rising high program voltage, Vpgm, is applied to the selected 1-poly NAND cell's control gate with its bit line grounded. This Vpgm voltage can be the same as or lower than the Vpgm applied to the above 2-poly NAND flash memory, depending on the coupling ratio from the Poly gate to the Nitride layer. Electrons from the 1-poly NAND cell's channel are then injected into its charge-trapping layer. The step-rising Vpgm voltage is typically set between 15V and 25V with a typical increment, ΔVpgm, ranging from 0.2V to 0.4V.
More information about the programming methods can be found in U.S. Pat. No. 6,859,397, titled “Source Side Boosting Technique for Non-volatile Memory;” and U.S. Pat. No. 6,917,542, titled “Detecting Over Programmed Memory;” and U.S. Pat. No. 6,888,758, titled “Programming Non-Volatile Memory,” which are incorporated as references herein.
In many cases, the Vpgm pulse applied to the selected WLn is accompanied by several MHV pass-WL voltages, Vpass(program), such as Vpass1, Vpass2, and others, applied to the non-selected WLn−1 and WLn+1 and to the rest of the WLs in the selected NAND strings of the selected blocks.
A series of Vpgm pulses (referred to as the programming gate pulses) of increasing magnitude is applied to WLn. Between successive step-rising Vpgm pulses, a set of one or more program-verify pulses, similar to a Read operation, is applied to determine whether the selected NAND cell(s) in the selected page or WL have been programmed to the desired Vtn values. The programmed Vtn values are determined by the storage type, such as SLC (1 bit per cell), MLC (2 bits per cell), TLC (3 bits per cell), XLC (4 bits per cell), or Analog storage (more than 4 bits per cell).
Since the Program-Verify operation is like the regular Read operation, it uses the same BL-precharge and discharge cycles. Therefore, during each Program-Verify cycle, a NAND flash memory has to precharge the large capacitance of all the long BLs from Vss to VBL. As a result, a large BL precharge current occurs and a large Vpass(read) WL disturbance of about 6V is induced on the NAND cells. In addition, the Program-Verify cycle has the same long latency as Read because the discharge process starts from a high VBL value, which ranges from 0.8V to Vdd in today's NAND designs.
If any of the selected NAND cells have reached their targeted programmed Vts as determined in the Program-Verify step, further programming has to be inhibited on those NAND cells to avoid over-programming them into the next higher, wrong Vt state. For those NAND cells whose Vt has not reached the desired value after the Program-Verify operation, the Vpgm pulses continue to be applied in the selected page or WL, together with a Vpass of 10V or other HV on the non-selected WLs. The program and verify pulses are applied iteratively to those cells until the desired Vts are reached. Once all NAND cells in the selected page have been programmed successfully into the desired Vt states, the Program and Program-Verify operations of the selected page are stopped. The Program and Program-Verify operations then continue on the remaining pages, in the preferred sequence from the String bottom to the String top, in the selected strings of the selected blocks of the NAND memory. As the Program and Program-Verify operations repeat, the BL precharge current and the Vpass WL-induced disturbance are multiplied.
Typically, each NAND String physically comprises 16, 32, 64, or even 128 WLs. Relative to SLC, the number of pages, and hence the density, is doubled for MLC, tripled for TLC, and quadrupled for XLC.
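The iterative Program and Program-Verify sequence described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the circuit-level behavior of any real device: the cell model (Vt simply follows Vpgm minus a per-cell offset), the function name `ispp_program`, the erased Vt of -2.0V, and the per-cell offsets are all hypothetical. Only the 15V to 25V Vpgm range and the 0.2V step come from the text. Voltages are expressed in millivolts so that the arithmetic is exact.

```python
def ispp_program(targets_mv, offsets_mv, vpgm_start_mv=15000,
                 vpgm_max_mv=25000, step_mv=200, vt_erased_mv=-2000):
    """Toy step-rising program loop with verify and inhibit.

    targets_mv: desired Vt per cell (mV).
    offsets_mv: hypothetical per-cell offset; the assumed cell model is
                Vt = Vpgm - offset (a simplification, not device physics).
    Returns (final Vts, number of Vpgm pulses, True if all cells passed).
    """
    vts = [vt_erased_mv] * len(targets_mv)
    inhibited = [False] * len(targets_mv)
    pulses = 0
    for vpgm in range(vpgm_start_mv, vpgm_max_mv + 1, step_mv):
        pulses += 1
        # Program pulse: only cells that have not yet passed verify are programmed.
        for i, offset in enumerate(offsets_mv):
            if not inhibited[i]:
                vts[i] = max(vts[i], vpgm - offset)
        # Program-Verify: cells at or above target are inhibited from now on.
        for i, target in enumerate(targets_mv):
            if vts[i] >= target:
                inhibited[i] = True
        if all(inhibited):
            break
    return vts, pulses, all(inhibited)


# Two cells with the same 1.0V target but different (hypothetical) program speeds.
vts, pulses, ok = ispp_program([1000, 1000], [15000, 16000])
```

In this toy run the faster cell passes verify after 6 pulses and is inhibited, so it does not overshoot while the slower cell needs 11 pulses. Every one of those verify steps would incur one BL precharge and one Vpass exposure, which is the multiplication effect the text describes.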
A multi-state NAND memory device stores multiple bits of data per NAND cell by differentiating multiple distinct valid Vtn distributions separated by some preferred forbidden ranges such as ΔVtn. Each distinct Vtn has a distribution between Vtnmax and Vtnmin. Each ΔVtn is defined to be a value of Vtnmin of a higher-level state minus the Vtnmax of a lower-level Vtn state. Each Vtn is defined corresponding to a predetermined value for the set of data bits encoded in NAND device.
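The definition of ΔVtn above can be written out directly. The sketch below computes the forbidden gap between each pair of adjacent Vt distributions as Vtnmin of the higher state minus Vtnmax of the lower state. The function name and the four example MLC distributions are hypothetical values chosen only for illustration; they are not taken from the text.

```python
def forbidden_gaps_mv(distributions_mv):
    """Return the gap (mV) between each pair of adjacent Vt distributions.

    distributions_mv: list of (vt_min, vt_max) tuples, ordered from the
    lowest state to the highest. A negative gap means two states overlap.
    """
    gaps = []
    for (_, lower_max), (upper_min, _) in zip(distributions_mv, distributions_mv[1:]):
        gaps.append(upper_min - lower_max)
    return gaps


# Hypothetical MLC distributions for the E, A, B, and C states (assumed values).
MLC_EXAMPLE_MV = [(-3000, -700), (700, 1200), (2000, 2500), (3300, 4000)]
gaps = forbidden_gaps_mv(MLC_EXAMPLE_MV)
```

A gap that shrinks toward zero leaves no margin for placing a read voltage between the two states, which is one way to see why reliability degrades as more states are packed into the same Vt window.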
As the number of bits of data per NAND cell is increased from SLC to MLC, TLC, and XLC, the number of valid Vtn states is increased from 2 to 4, 8, and 16, respectively. As a result, the NAND data capacity is drastically increased and the die cost is greatly reduced.
There is a tradeoff, however. As the number of bits stored in each NAND cell increases, the programming time also increases and the NAND cell's data reliability degrades accordingly. In some applications, the increased programming time and the lower data reliability cannot be accepted.
This invention, however, focuses on the NAND Read and Program-Verify disturbance and the BL precharge current consumption issues.
Why does a regular NAND Read operation consume so much power? This is due to the fundamentals of the NAND sensing scheme. Firstly, to read or verify a stored Vt from a selected NAND cell in the selected NAND string and selected page (WLn), today's NAND scheme first pre-charges each BL from Vss to a value of VBL, because the resistance of the NAND string is larger than 1 MΩ when the sensing current is less than 1 μA. The NAND cell's Vt is then determined by sensing the final VBL voltage for the VRD applied to the selected WL. If VBL is discharged to Vss, the NAND cell's Vt is lower than VRD. If VBL is not discharged and retains the initial pre-charged value, the NAND cell's Vt is higher than the VRD applied to the selected WL. To obtain the right NAND Vt, the Read operation has to continue until a new value of VRD is found that discharges the initial VBL to Vss.
In the prior-art NAND array architecture, each BL is connected to a plurality of NAND Strings by a long metal line, such as a metal1 line, running in the Y-direction with a capacitance of around 3-5 pF. The VBL value ranges from 0.8V to Vdd. A lower VBL value, fewer BLs, and a shorter BL metal1 line would reduce the per-BL and total BL capacitances, thus reducing the per-BL and total BL pre-charge currents in the NAND Read and Verification cycles.
In one option of the conventional NAND array, the physical BLs in one physical page are divided into two sub-pages, the odd page and the even page, each with half of the BLs. In another option of the conventional NAND, all the BLs in the NAND array are read out simultaneously without being divided into odd and even pages, for less WL disturbance; this is referred to as All-BL Read or All-BL Verify. The total number of BLs in each physical NAND page today can be as high as 8 KB, i.e. 65,536 BLs, not including the spare BLs for ECC syndrome byte storage. For example, a NAND page of 512B requires an extra 16B, so that the regular NAND data and the ECC spare data are stored in the 512B and 16B areas, respectively. In NAND densities above 16 Gb, each BL is connected to a plurality of NAND Strings by a metal line, such as a metal1 line, with a capacitance of around 3-5 pF. In NAND technology nodes above 4× nm, the BL capacitance is mainly attributed to two factors. The first factor is the area where the BL metal1 overlaps the flash cells and Strings, which are formed in a triple P-well within the Deep N-well on top of the P-substrate. The second factor is the N+/P junction capacitance at each NAND String's contact area. Since only one contact is shared by two long NAND Strings in the Y-direction, the number of N+/P contacts is small and their capacitance is much smaller than that of the long BL metal1 line. Thus, the N+/P capacitance is negligible compared to that of the BL metal1.
When scaling down below the 3× nm node, the BL metal1 proximity effect in the NAND flash array layout induces a significant parasitic capacitance between adjacent BLs. As a result, in an nGb-density NAND array, each long metal1 BL bears a large capacitance comprising both the overlap capacitance between metal1 and the flash cells over the P-substrate and the parasitic BL-to-BL coupling capacitance due to the BL proximity effect.
During a NAND Read, Program-Verify, or Erase-Verify operation, all BLs, or one-half of the BLs (the odd or the even BLs), have to be pre-charged from Vss to a desired high value ranging from 0.8V to Vdd. This precharge operation is performed on up to 8 KB of BLs, as defined by one physical page. The total precharge current = N × Vdd × CAPBL / pre-charge time, where N is the total number of BLs. The value of N is 65,536 for an 8 KB page and CAPBL is 3 pF. The typical BL precharge time is designed to be around 10 μs today. As a result, a total peak BL precharge current of more than 100 mA occurs in each Read or Verification cycle. This huge BL pre-charge current needs to be reduced to extend battery life in battery-driven handheld mobile NAND applications.
Secondly, Read-induced or Verify-induced WL disturbance is frequently encountered during regular Read and Verify operations in prior-art NAND flash memory and worsens with repeated Read and Verify operations. Repeated reading and verification alone can quickly corrupt the data stored in each NAND flash cell. Although the corrupted data can be fixed by a more sophisticated ECC algorithm in the flash controller, the NAND flash memory is rendered useless when the Read-induced and Verify-induced errors exceed the ECC's correction capability.
Why do regular Read and Verify operations induce a severe WL-disturbance issue in prior-art NAND flash memory? It is again due to the unique series structure of each NAND String. To read the selected cell of each String, the non-selected NAND cells, which store different Vts, all have to be in the conduction state. To ensure that all 31 non-selected NAND cells in a 32T NAND string conduct, the 31 non-selected WLs have to be coupled to an MHV Vpass(read) voltage of around 6.0V, which is higher than the maximum stored Vt of around 4.0V. This Vpass(read) voltage on the top gate couples to the poly1 floating gate or the nitride charge-trapping layer and attracts electrons from the NAND cell's channel. This acts like a soft write of the NAND cell. As a result, each NAND cell's Vt gradually increases as the number of Read cycles increases.
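The precharge-current expression above is easy to evaluate numerically. The sketch below is a rough estimate only: charge divided by precharge time gives the average current over the precharge window, while the instantaneous peak depends on the driver and is not modeled here. The function name and the two supply values (1.8V and 3.3V) are assumptions for illustration; the BL count, the 3-5 pF capacitance range, and the 10 μs precharge time come from the text.

```python
def average_precharge_current(n_bl, v_bl, cap_bl_farads, t_precharge_s):
    """Average current (A) needed to charge n_bl bit lines to v_bl volts.

    Implements N x V x C / t. This is the average over the precharge
    window, not the instantaneous peak.
    """
    return n_bl * v_bl * cap_bl_farads / t_precharge_s


N_BL = 8 * 1024 * 8  # an 8 KB page has 65,536 bit lines

# Lower-end case: 3 pF per BL, assumed Vdd of 1.8V, 10 us precharge.
low_case = average_precharge_current(N_BL, 1.8, 3e-12, 10e-6)

# Upper-end case: 5 pF per BL, assumed Vdd of 3.3V, 10 us precharge.
high_case = average_precharge_current(N_BL, 3.3, 5e-12, 10e-6)

print(f"low case:  {low_case * 1e3:.1f} mA")
print(f"high case: {high_case * 1e3:.1f} mA")
```

Under these assumptions the average comes out at tens of mA in the low case and just above 100 mA in the high case, and the result scales linearly with each of N, VBL, and CAPBL. Halving the number of BLs precharged (odd/even scheme) or lowering VBL reduces the current proportionally.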
Take the Read operation of a NAND array comprising a plurality of 32T strings as an example. Only the one selected WL is coupled to one or several preferred VRD voltages, while the remaining 31 non-selected WLs and the top and bottom string-select NMOS transistors are coupled to an MHV voltage, Vpass(read). MHV stands for a medium high voltage of around 6.0V.
For example, in SLC Read and Verify operations, only one VRD value, VRD=0V, is needed to distinguish the erased state of a negative Vt0≦−0.7V, storing data “1”, from the programmed state of a positive Vt1≧1.0V, storing data “0”. In this SLC Read operation, only one BL-precharge cycle is required, and thus one huge BL precharge current occurs.
For an MLC Read operation, three VRD values, VRD1=0V, VRD2=1.5V, and VRD3=3.0V, are required to distinguish one erased state and three programmed states, namely Vt0 (E-state, 11), Vt1 (A-state, 01), Vt2 (B-state, 00), and Vt3 (C-state, 10), where Vt0<Vt1<Vt2<Vt3. In this MLC Read operation, still only one BL-precharge cycle is required when the three step-rising VRD voltages are used to read the four logic data values from the stored Vts. In other words, an MLC Read operation consumes the same single BL pre-charge current as an SLC Read operation.
Similarly, for a TLC Read operation, there are seven VRD values to distinguish one erased negative-Vt state, Vt0, from the remaining seven programmed positive-Vt states, Vt1, Vt2, Vt3, Vt4, Vt5, Vt6, and Vt7. A similar step-rising VRD can be applied to the selected WL, and still only one BL-precharge current occurs in the whole TLC Read cycle.
Additionally, for an XLC Read operation, there are fifteen VRD values to distinguish one erased negative-Vt state, Vt0, from the remaining fifteen programmed positive-Vt states, Vt1, Vt2, Vt3, Vt4, Vt5, Vt6, Vt7, Vt8, Vt9, Vt10, Vt11, Vt12, Vt13, Vt14, and Vt15. A similar step-rising VRD can be applied to the selected WL, and still only one BL-precharge current occurs in the whole XLC Read cycle.
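The step-rising MLC read can be sketched as a simple decode loop. The sketch assumes an idealized cell that conducts, and therefore discharges its BL, whenever the applied VRD exceeds its Vt; the first VRD step at which the BL discharges identifies the state. The three VRD values and the four state codes are taken from the text, while the function name `mlc_read` and the idealized conduction rule are illustrative assumptions. Voltages are in millivolts.

```python
VRD_MV = [0, 1500, 3000]                 # VRD1, VRD2, VRD3 from the text
STATE_CODES = ["11", "01", "00", "10"]   # E, A, B, C states from the text


def mlc_read(vt_mv):
    """Decode an MLC cell by stepping VRD upward on the selected WL.

    Idealized model (assumption): the cell conducts and the BL discharges
    as soon as VRD is higher than the cell's Vt.
    """
    for step, vrd in enumerate(VRD_MV):
        if vt_mv < vrd:
            # BL discharged at this step, so the cell is in state `step`.
            return STATE_CODES[step]
    # BL never discharged: Vt is above the highest VRD, i.e. the C-state.
    return STATE_CODES[-1]


for vt in (-700, 1000, 2000, 3500):
    print(vt, "mV ->", mlc_read(vt))
```

Because the BL only needs to be charged once before the VRD staircase starts, the loop reflects why a single BL precharge can serve all three comparisons.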
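The VRD counts quoted for SLC, MLC, TLC, and XLC all follow from the number of bits per cell: n bits need 2^n distinct Vt states and one fewer read voltage to separate them. A short sketch, with hypothetical helper names:

```python
STORAGE_BITS = {"SLC": 1, "MLC": 2, "TLC": 3, "XLC": 4}


def vt_states(bits_per_cell):
    """Number of distinct Vt states needed to store the given bits per cell."""
    return 2 ** bits_per_cell


def read_levels(bits_per_cell):
    """Number of VRD values needed to separate all Vt states."""
    return vt_states(bits_per_cell) - 1


for name, bits in STORAGE_BITS.items():
    print(f"{name}: {vt_states(bits)} states, {read_levels(bits)} VRD values")
```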
All 31 non-selected WLs and the two String-select transistors of the selected NAND Strings are driven with an MHV voltage, Vpass(read). The reason is to turn these 31 NAND flash cells into the conduction state to allow accurate Vt differentiation of the one selected NAND cell in the selected WL. Since the maximum programmed Vtn is set to be around 4.0V, a Vpass(read) of 6.0V is required to turn the maximum-Vt NAND cells into the conduction state and meet a Read latency spec that is typically set to 20 μs. Additionally, the identical Vpass(read) voltage is coupled to the gates of the top and bottom String-select transistors to further reduce their turn-on resistance, so that a faster BL discharge, and thus a faster Read, can be achieved.
Although the Vpass(read) of 6.0V in the NAND Read operation is lower than the Vpass(program) of 10V, these 31 non-selected NAND cells still suffer soft programming because the 6.0V gate voltage is coupled to the floating-gate layer. The positive floating-gate voltage attracts electrons in the NAND cell's channel and injects them across the thin tunnel oxide layer.
As a result, with repeated NAND Reads, more and more electrons are gradually injected into the floating gates of all 32 NAND cells in the selected NAND String. Thus, the Vts of the 32 NAND cells are unintentionally increased after each NAND Read.
The most severe soft writing due to Vpass(read) happens to the NAND cells with the lowest Vt states, such as the E-state with Vt≦−0.7V and the A-state with Vt of about 0.7V.
When scaling below 3× nm, the BL proximity effect results in severe BL coupling noise in NAND Read operations that use a BL-precharge followed by a BL-discharge scheme, because the BL pull-up device is shut off after the BL-precharge period. In the subsequent sensing phase, the discharging BLs couple to the adjacent non-discharged BLs and pull them toward ground, which results in faulty reading. To prevent this adjacent-BL coupling noise, each physical page of the conventional NAND is divided into two sub-pages, the odd and even sub-pages. During an odd-page read, there are two options for the even-page BLs. In the first option, the even-page BLs are precharged to a VBL voltage, which avoids the Vpass(read) WL-induced disturbance through the WL-coupling self-boosting effect in the NAND cell channels of the even page, at the expense of consuming one ½-page BL precharge current. In the second option, the even-page BLs are reset to Vss, which avoids the BL precharge current on the even page, at the expense of one ½-page Vpass(read) WL-induced disturbance.
For each Read operation, a predetermined VRD is applied to the selected WL and a WL-pass voltage, Vpass, ranging from 5V to 7V is applied to the unselected N−1 WLs to turn the N−1 NAND cells into the conduction state, so that the On or Off state of the selected NAND cell can be accurately distinguished. A single VRD value of 0V is used for SLC reading, three distinct VRD values of 0V, 1.5V, and 3V for MLC reading, seven distinct VRD values for TLC reading, and 15 distinct VRD values for XLC reading.
Lastly, one drawback of the prior-art NAND is that the average SLC page-read latency is about 20 μs, which is too slow compared with today's fast random NOR read latency of 100 ns at Gb density. Since each Read operation is from a NAND String, all the non-selected cells in the non-selected WLs or pages suffer one Vpass WL disturbance per Read. MLC suffers a 3 times longer delay of about 60 μs, TLC a 7 times longer delay of about 140 μs, and XLC a 15 times longer delay of about 300 μs. As a result, the Vpass WL disturbance becomes a more severe issue in NAND memory that stores more bits per cell. In addition, each Read of the NAND programmed states A, B, and C would consume one high BLn precharge current.
In summary, the BL pre-charge operation consumes too much power in the prior-art NAND Read operation, regardless of SLC, MLC, TLC, or XLC storage and regardless of the All-BL Read or the odd and even BL Read scheme.
Furthermore, today's NAND flash market strongly demands more MLC, TLC, and XLC storage than SLC to further reduce the die cost by ½, ⅓, or even ¼. A larger number of P/E endurance cycles, i.e. a longer lifespan, through less Read disturbance, and a lower Read latency for superior performance, are also very important.
Thus, there is an urgent need to reduce the Read disturbance, power consumption, and Read latency in conventional NAND strings and arrays.
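The requirement that Vpass(read) exceed the highest stored Vt follows from the series structure of the String: the BL can discharge only if every cell in the chain conducts. The sketch below models this with an idealized on/off rule (a cell conducts when its gate voltage is higher than its Vt). The function name and the on/off simplification are assumptions for illustration; the 6.0V Vpass(read), the 4.0V maximum Vt, and the 32-cell String come from the text.

```python
def bl_discharges(cell_vts, selected, vrd, vpass=6.0):
    """Return True if the series String conducts and the BL can discharge.

    cell_vts: Vt (V) of each cell in the String.
    selected: index of the cell being read; its gate gets vrd.
    All other cells get vpass on their gates.
    Idealized rule (assumption): a cell conducts when gate voltage > Vt.
    """
    for i, vt in enumerate(cell_vts):
        gate = vrd if i == selected else vpass
        if gate <= vt:
            return False  # one non-conducting cell blocks the whole String
    return True


# 32T String: every cell at the maximum Vt of 4.0V except an erased cell at index 5.
string_vts = [4.0] * 32
string_vts[5] = -0.7

erased_read = bl_discharges(string_vts, selected=5, vrd=0.0)              # Vpass = 6.0V
blocked_read = bl_discharges(string_vts, selected=5, vrd=0.0, vpass=3.5)  # Vpass too low
```

With Vpass at 6.0V the erased cell is sensed correctly, while a Vpass below the maximum Vt leaves the String open and makes the erased cell look programmed. The cost of the higher Vpass is the soft-programming stress on the 31 pass cells.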
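The latency figures above scale with the number of VRD comparisons per read. As a rough sketch, assuming each VRD level costs one SLC-equivalent sensing period of 20 μs (the scaling the text applies), the per-page latency can be tabulated as follows. The function name is hypothetical.

```python
SLC_LATENCY_US = 20    # average SLC page-read latency from the text
NOR_LATENCY_NS = 100   # random NOR read latency from the text


def page_read_latency_us(bits_per_cell, slc_latency_us=SLC_LATENCY_US):
    """Read latency assuming one SLC-equivalent sensing period per VRD level."""
    vrd_levels = 2 ** bits_per_cell - 1
    return slc_latency_us * vrd_levels


latencies = {name: page_read_latency_us(bits)
             for name, bits in (("SLC", 1), ("MLC", 2), ("TLC", 3), ("XLC", 4))}

# Ratio of SLC NAND page-read latency to random NOR read latency.
nand_to_nor_ratio = SLC_LATENCY_US * 1000 // NOR_LATENCY_NS
```

Under this assumption the pass cells are also held at Vpass for proportionally longer per page read, which is consistent with the text's point that the disturbance grows with the number of bits per cell.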