Single Bit and Multi-Bit Flash Memory Cells
Flash memory devices have been known for many years. Typically, each memory cell within a flash memory device stores one bit of information. The traditional way to store a bit in a flash memory cell has been by supporting two states of the memory cell. One state represents a logical “0” and the other state represents a logical “1”.
In a flash memory cell, the two states are implemented by having a floating gate situated above the cell's channel (the area connecting the source and drain elements of the cell's transistor), and having two valid states for the amount of charge stored within the floating gate. Typically, one state is with zero charge in the floating gate and is the initial unwritten state of the cell after being erased (commonly defined to represent the “1” state) and another state is with some amount of negative charge in the floating gate (commonly defined to represent the “0” state). Having negative charge in the gate causes the threshold voltage of the cell's transistor (i.e. the voltage that has to be applied to the transistor's control gate in order to cause the transistor to conduct) to increase. Now it is possible to read the stored bit by checking the threshold voltage of the cell—if the threshold voltage is in the higher state then the bit value is “0” and if the threshold voltage is in the lower state then the bit value is “1”. Actually there is no need to accurately read the cell's threshold voltage—all that is needed is to correctly identify in which of the two states the cell is currently located. For that purpose it is enough to make a comparison against a reference voltage value that is in the middle between the two states, and thus to determine if the cell's threshold voltage is below or above this reference value.
FIG. 1A shows graphically how this works. Specifically, FIG. 1A shows the distribution of the threshold voltages of a large population of cells. Because the cells in a flash device are not exactly identical in their characteristics and behavior (due, for example, to small variations in impurity concentrations or to defects in the silicon structure), applying the same programming operation to all the cells does not cause all of the cells to have exactly the same threshold voltage. (Note that, for historical reasons, writing data to a flash memory is commonly referred to as “programming” the flash memory.) Instead, the threshold voltage is distributed in a manner similar to that shown in FIG. 1A. Cells storing a value of “1” typically have a negative threshold voltage, such that most of the cells have a threshold voltage close to the value shown by the left peak of FIG. 1A, with some smaller numbers of cells having lower or higher threshold voltages. Similarly, cells storing a value of “0” typically have a positive threshold voltage, such that most of the cells have a threshold voltage close to the value shown by the right peak of FIG. 1A, with some smaller numbers of cells having lower or higher threshold voltages.
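The single-comparison read described above can be sketched as follows. This is only an illustration: the function name, the choice of 0.0 as the reference voltage, and the sample threshold voltages are assumptions made for the example and are not taken from any particular device.

```python
def read_sbc_bit(threshold_voltage: float, reference_voltage: float = 0.0) -> int:
    """Return the bit stored in a single-bit cell.

    A threshold voltage below the reference is taken as the erased
    state ("1"); a threshold voltage at or above it is taken as the
    programmed state ("0"). The default reference of 0.0 is an
    assumed midpoint between the two states.
    """
    return 1 if threshold_voltage < reference_voltage else 0


# Hypothetical threshold voltages for an erased and a programmed cell.
print(read_sbc_bit(-2.0))  # erased cell
print(read_sbc_bit(1.5))   # programmed cell
```

Note that the exact threshold voltage is never measured; only its position relative to the reference matters.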
In recent years a new kind of flash device has appeared on the market, using a technique conventionally called “Multi Level Cells” or MLC for short. (This nomenclature is misleading, because the previous type of flash cells also has more than one level: they have two levels, as described above. Therefore, the two kinds of flash cells are referred to herein as “Single Bit Cells” (SBC) and “Multi-Bit Cells” (MBC).) The improvement brought by the MBC flash is the storing of two bits in each cell. (In principle MBC also includes the storage of more than two bits per cell, but such cells are not yet on the market. In order to simplify the explanations, the two-bit case is emphasized herein. It should however be understood that the present invention is equally applicable to flash memory devices that support any number of bits per cell.) In order for a single cell to store two bits of information, the cell must be able to be in one of four different states. As the cell's “state” is represented by its threshold voltage, it is clear that an MBC cell should support four different valid ranges for its threshold voltage. FIG. 1B shows the threshold voltage distribution for a typical MBC cell. As expected, FIG. 1B has four peaks, each corresponding to one state. As for the SBC case, each state is actually a range and not a single number. When reading the cell's contents, all that must be guaranteed is that the range that the cell's threshold voltage is in is correctly identified. For a prior art example of an MBC flash device see U.S. Pat. No. 5,434,825 to Harari, which is incorporated by reference for all purposes as if fully set forth herein.
When encoding two bits in an MBC cell by the four states, it is common to have the left-most state in FIG. 1B (typically having a negative threshold voltage) represent the case of both bits having a value of “1”. (In the discussion below the following notation is used: the two bits of a cell are called the “lower bit” and the “upper bit”. An explicit value of the bits is written in the form [“upper bit” “lower bit”], with the lower bit value on the right. So the case of the lower bit being “0” and the upper bit being “1” is written as “10”. One must understand that the selection of this terminology and notation is arbitrary, and other names and encodings are possible.) Using this notation, the left-most state represents the case of “11”. The other three states are typically assigned in the following order from left to right: “10”, “00”, “01”. One can see an example of an implementation of an MBC NAND flash device using such encoding in U.S. Pat. No. 6,522,580 to Chen, which patent is incorporated by reference for all purposes as if fully set forth herein. See in particular FIG. 8 of the Chen patent. It should be noted though that there is nothing limiting about this assignment of the states, and that any other ordering can be used. When reading an MBC cell's content, the range that the cell's threshold voltage is in must be identified correctly; unlike the SBC case, however, this cannot always be achieved by comparing to a single reference voltage, and several comparisons may be necessary. For example, in the case illustrated in FIG. 1B, one way to read the lower bit is first to compare the cell's threshold voltage to a reference comparison voltage V1 and then, depending on the outcome of the comparison, to compare the cell's threshold voltage to either a zero reference comparison voltage or a reference comparison voltage V2. Another way to read the lower bit is to compare the cell's threshold voltage unconditionally to both the zero reference voltage and V2. In either case, two comparisons are needed.
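The two ways of reading the lower bit can be sketched as follows, assuming the left-to-right state order “11”, “10”, “00”, “01” and assuming that the zero reference, V1 and V2 separate adjacent states in that order (0 < V1 < V2). The numeric values of V1 and V2 and the function names are hypothetical. The upper-bit read with a single comparison is not stated in the text; it is an inference from the same encoding and is included only for completeness.

```python
V1 = 1.0  # hypothetical value of reference comparison voltage V1
V2 = 2.0  # hypothetical value of reference comparison voltage V2


def read_lower_bit_conditional(vt: float) -> int:
    """Compare to V1 first, then to either zero or V2 depending on the result."""
    if vt < V1:
        # The state is "11" or "10"; the zero reference separates them.
        return 1 if vt < 0.0 else 0
    # The state is "00" or "01"; V2 separates them.
    return 0 if vt < V2 else 1


def read_lower_bit_unconditional(vt: float) -> int:
    """Compare to both zero and V2 regardless of either outcome."""
    above_zero = vt >= 0.0
    above_v2 = vt >= V2
    # The lower bit is 0 exactly when the threshold lies between zero and V2.
    return 0 if (above_zero and not above_v2) else 1


def read_upper_bit(vt: float) -> int:
    """With this encoding the upper bit depends only on the comparison to V1."""
    return 1 if vt < V1 else 0


# One hypothetical threshold voltage inside each of the four ranges.
for vt in (-1.0, 0.5, 1.5, 2.5):
    print(vt, f"{read_upper_bit(vt)}{read_lower_bit_conditional(vt)}")
```

Both lower-bit functions use exactly two comparisons per read and return the same result for any threshold voltage.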
MBC devices provide a great cost advantage: using a similarly sized cell, one stores two bits rather than one. However, there may also be some drawbacks to using MBC flash: the average read and write times of MBC memories are longer than those of SBC memories, resulting in lower performance. Also, the reliability of MBC is lower than that of SBC. This can easily be understood: the differences between the threshold voltage ranges in MBC are much smaller than in SBC. Thus, a disturbance in the threshold voltage (e.g. leaking of the stored charge causing a threshold voltage drift, interference from operations on neighboring cells, etc.) that may have gone unnoticed in SBC because of the large gap between the two states might cause an MBC cell to move from one state to another, resulting in an erroneous bit. The end result is a lower quality specification of MBC cells in terms of data retention time or the endurance of the device to many write/erase cycles. Thus there may be advantages to using both MBC cells and SBC cells, depending on the application's requirements.
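The reliability argument can be illustrated with a small numeric sketch. All voltages below (the reference values, the programmed threshold voltage and the amount of drift) are hypothetical values chosen for the example; only the qualitative behavior is taken from the discussion above.

```python
from bisect import bisect_right


def read_state(threshold_voltage, references):
    """Return the index (0 = left-most) of the range the threshold falls in."""
    return bisect_right(references, threshold_voltage)


SBC_REFERENCES = [0.0]            # one border between two states (assumed)
MBC_REFERENCES = [0.0, 1.0, 2.0]  # three borders between four states (assumed)

programmed_vt = 2.5  # hypothetical threshold voltage right after programming
drift = -0.8         # hypothetical drift caused by charge leakage

sbc_before = read_state(programmed_vt, SBC_REFERENCES)
sbc_after = read_state(programmed_vt + drift, SBC_REFERENCES)
mbc_before = read_state(programmed_vt, MBC_REFERENCES)
mbc_after = read_state(programmed_vt + drift, MBC_REFERENCES)

print("SBC state before/after drift:", sbc_before, sbc_after)
print("MBC state before/after drift:", mbc_before, mbc_after)
```

With these assumed numbers the same drift leaves the SBC cell in its original state but moves the MBC cell into the neighboring state.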
While the above explanations deal with floating-gate flash memory cells, there are other types of flash memory technologies. For example, in the NROM flash memory technology there is no conductive floating gate but an insulating layer trapping the electric charge. The present invention is equally applicable to all flash memory types, even though the explanations are given in the context of floating-gate technology.
Error Correction When Reading Data from Flash Cells
As explained above, flash cells, and especially MBC flash cells, may be read erroneously if their threshold voltages have drifted away from their initial values. If the amount of threshold voltage drift is large enough, the reading process may find a cell to be on the incorrect side of the reading reference voltage that is used as a border line between two states of the cell. Even though it is common to employ Error Correction Codes (ECC) for correcting errors in data read from flash memory, the correction capability is typically limited to some fixed number of errors within the page of data being read, and eventually the accumulated number of errors might exceed the correction capability of the ECC mechanism.
U.S. Pat. No. 5,657,332 by Auclair et al. entitled “SOFT ERRORS HANDLING IN EEPROM DEVICES” (herein “Auclair”) deals with this problem of flash memory errors caused by threshold voltage drifting. That patent is incorporated by reference for all purposes as if fully set forth herein. Auclair proposes two solutions to the errors problem. The first one attempts to eliminate the generation of errors by detecting cells getting close to crossing the border line, and “fixing” them by rewriting their contents back to memory, thus “resetting” the threshold voltages to their correct initial values. The second solution of Auclair accepts the existence of drifting errors as a given fact and attempts to improve the robustness of the memory system after errors are already there. This second solution is discussed in Auclair in column 13 lines 14-27.
The method of Auclair for reading data from flash memory first attempts a regular reading using the default value of the reading reference voltage (or multiple reading reference voltages in the case of MBC flash memory). If this first reading attempt results in so many errors that the ECC mechanism fails to correct them, Auclair employs a two-stage recovery plan:
A. The reading reference voltages are changed from their default values to another set of predetermined values, and reading is attempted using the new set of predetermined reference values. Typically the new values will be somewhat lower than the default values. It is reasonable to expect the threshold voltages of the cells to drift to lower values with time (that is, to move left in FIGS. 1A and 1B), as the drifting is the result of charge leakage out of the floating gate. Therefore moving the “borders” of comparison left has a good chance of separating the drifted states from each other. If there are still errors in the results of the reading with corrected reference values, they are processed by the ECC mechanism. If there are still too many errors to correct, the process repeats: another set of predetermined reading reference values is chosen and another reading and correction is done. Hopefully, this repeated process ends with data that is successfully corrected and can be assumed to have no errors. Once this point is reached, the second stage begins.
B. The data obtained from the first stage is written back to the cells, so that the next time it is read using the default reading reference values, it will not produce as many errors as it did in the current reading.
FIG. 2 provides a flow chart describing the handling of a read request by a flash device having a controller and a flash memory (i.e. including flash memory cells) in accordance with the prior art technique disclosed in Auclair. After receiving 110 a read request, the flash controller reads 112A data bits from the flash memory 212 using default reference voltages. An attempt 114 is made to effect a correction of the read data bits using the ECC. If the error correction is successful 116, the device may respond 118 to the read request by sending the corrected read data (for example, by sending the data to the host device). After responding 118 to the read request, the device is ready to handle another read request.
If the error correction is not successful 116, the device re-reads 112B data bits from the flash memory using a set of reference voltages that includes at least one pre-determined modified 119 reference voltage. After re-reading 112B the data bits with one or more “new” pre-determined reference voltages (i.e. using at least one pre-determined modified 119 reference voltage), another error correction is attempted 114. In the event of another error correction 116 failure, this process of using different pre-determined modified 119 reference voltages and re-reading 112B data bits is repeated until the ECC can successfully perform 116 an error correction of the data bits.
At that point, the memory cells are “rejuvenated” in order to reduce the likelihood of error correction failure when the cells are subsequently read (with their “proper” default reference voltages). This is done by taking advantage of the fact that, after successful error correction, the correct data is available and may be re-written 124 into the memory cells. Thus, according to the teaching of Auclair, it is assumed that after the data re-write, the next time the data is read 112A using the default reading reference values (i.e. after another read request 110), the flash memory cells will be more likely to provide data with fewer errors (i.e. because the threshold voltage “drift” has been corrected), providing read data that may be corrected with the ECC.
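The flow of FIG. 2 can be sketched as follows. This is an illustrative model only, not code from Auclair: every name is hypothetical, the page is a toy SBC page with assumed voltages, and the `ecc_correct` stand-in "corrects" by comparing against the known-good data, which a real ECC cannot do. The text describes the re-read loop as repeating until correction succeeds; the sketch instead raises an error when a finite list of predetermined values is exhausted, which is an assumption.

```python
from typing import Callable, List, Optional, Sequence

Bits = List[int]


def read_with_recovery(
    read_page: Callable[[float], Bits],
    ecc_correct: Callable[[Bits], Optional[Bits]],
    write_page: Callable[[Bits], None],
    default_ref: float,
    fallback_refs: Sequence[float],
) -> Bits:
    corrected = ecc_correct(read_page(default_ref))  # steps 112A and 114
    if corrected is not None:                        # step 116
        return corrected                             # step 118
    for ref in fallback_refs:                        # step 119
        corrected = ecc_correct(read_page(ref))      # steps 112B and 114
        if corrected is not None:                    # step 116
            write_page(corrected)                    # step 124 (re-write)
            return corrected                         # step 118
    raise RuntimeError("all predetermined reference values exhausted")


# Toy SBC page: "1" cells sit at -2.0; "0" cells were programmed to 2.0
# and have since drifted down by 2.3 (all voltages are assumed values).
stored = [1, 0, 0, 1, 0, 0]
NOMINAL = {1: -2.0, 0: 2.0}
thresholds = [NOMINAL[b] + (-2.3 if b == 0 else 0.0) for b in stored]
writes = []


def read_page(ref: float) -> Bits:
    return [1 if vt < ref else 0 for vt in thresholds]


def ecc_correct(bits: Bits) -> Optional[Bits]:
    # Stand-in for a real ECC that corrects at most one bit error per page.
    errors = sum(a != b for a, b in zip(bits, stored))
    return list(stored) if errors <= 1 else None


def write_page(bits: Bits) -> None:
    thresholds[:] = [NOMINAL[b] for b in bits]  # restore nominal thresholds
    writes.append(list(bits))


result = read_with_recovery(read_page, ecc_correct, write_page, 0.0, [-0.5, -1.0])
print(result, "page writes:", len(writes))
```

In this toy run the default read fails, the first lowered reference value succeeds, and the page is written once, after which a read with the default reference returns the stored data again.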
It is noted that the aforementioned recovery method of Auclair suffers from a big disadvantage. Every time the recovery process is employed the data is written again. In flash memory the writing operation is much slower than the reading operation. For example in SBC NAND flash memory the writing of a page of data takes approximately 200 microseconds, while the reading of a page of data takes approximately 15 microseconds. The situation is even worse in MBC NAND flash memory, where the writing of a page may take 800 microseconds while the reading of a page may take 30 microseconds. This fact means that employing the method of Auclair for recovering a page of data may be a very slow operation. Typically the software application initiating the read request and waiting for the data expects the data to be available within tens of microseconds, while it might actually have to wait an order of magnitude longer. For real-time software applications, this might be unacceptable. Even if the writing stage of Auclair is delayed to a later time, so that the software application receives the data as soon as it is available and without waiting for the recovery process to complete, there is still a degradation in the throughput of the storage system due to the extra writing operation.
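The cost of the extra write can be made concrete with the approximate page timings quoted above. The sketch below is a rough estimate under stated assumptions: it counts only page read and page write times, assumes a given number of re-reads (one by default), and ignores ECC computation and data transfer time. The function and variable names are hypothetical.

```python
# Approximate page timings in microseconds, as quoted in the text.
TIMINGS_US = {
    "SBC": {"read": 15, "write": 200},
    "MBC": {"read": 30, "write": 800},
}


def recovery_latency_us(kind: str, rereads: int = 1) -> int:
    """Initial read, plus `rereads` re-reads, plus one page re-write."""
    t = TIMINGS_US[kind]
    return t["read"] * (1 + rereads) + t["write"]


for kind in ("SBC", "MBC"):
    normal = TIMINGS_US[kind]["read"]
    recovery = recovery_latency_us(kind)
    print(f"{kind}: normal read {normal} us, recovery {recovery} us, "
          f"ratio {recovery / normal:.1f}x")
```

With a single re-read this gives 230 microseconds for SBC and 860 microseconds for MBC, against normal reads of 15 and 30 microseconds, so the write dominates the recovery time in both cases.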
There is thus a need for methods that recover data from flash memory in the presence of errors, while achieving the recovery in a relatively short time.