The present invention relates to flash memories and, more particularly, to flash memories that are robust against partial erasure in the event of power loss.
Flash memory management systems (such as those disclosed in U.S. Pat. Nos. 5,404,485 and 5,937,425, both of which are incorporated by reference for all purposes as if fully set forth herein) must keep their integrity even after unexpected power loss which might occur at any point in time. The difficulty with achieving this is that such a power loss might occur in the middle of any operation performed by the flash device or in the middle of one of the software routines that manage the data structures essential for maintaining the coherent interpretation of the flash data contents.
One of the most important cases which must be considered in this respect is a power loss occurring while the flash device is carrying out an erase operation. Flash devices are memory devices in which data cannot be written unless the address to write into is in an “erased” state. Additionally, erasing typically cannot be done for individual addresses, but must be done for relatively large groups of addresses (typical sizes range from a few kilobytes to a few hundred kilobytes). The chunk of flash media that is erased in one operation is called herein a “unit”. It should be noted that the size of a unit may be a feature of the flash device hardware (being the smallest chunk of media addressed in an erase command to the hardware), or a feature of the flash management software, which may combine a few hardware-based units into one larger logical unit that is always handled and erased as a single entity.
Such an erase operation is quite a long process (between a few milliseconds and a few seconds, depending on the flash type), and therefore the possibility of a power loss occurring while an erase operation is in progress cannot be ignored. The danger in such an interrupted erase operation is that when the flash device loses its power source, it might be left in a partially erased state: some of the bits in the unit being erased might have already been brought to their target erased state, while others might still be in a non-erased state. Unless such a condition is detected and handled properly, the unit might be considered fully erased the next time it is needed for use, but the next programming operation will not produce the desired results. Moreover, because of the way some flash devices perform their internal programming verification, such programming errors are not detected by the verification mechanism. This is so because the programming verification usually (in NAND flash) detects only failures in bringing an erased bit to a non-erased state, whereas the above effect results in a bit being in a non-erased state instead of in an erased state, which goes unnoticed.
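The one-directional nature of this verification can be illustrated with a minimal sketch. The sketch assumes that an erased bit reads as 1 and a programmed bit as 0 (a common convention that is not stated above), and the functions `program` and `nand_style_verify` are hypothetical models for illustration only, not the behavior of any particular device.

```python
def program(cell: int, target: int) -> int:
    """Model of programming one byte: bits can only move from 1 (erased) to 0."""
    return cell & target


def nand_style_verify(cell_after: int, target: int) -> bool:
    """Assumed verification model: check only that every bit meant to be
    programmed (0 in target) actually reads as 0. Bits meant to stay erased
    are not checked."""
    bits_to_program = ~target & 0xFF
    return (cell_after & bits_to_program) == 0


# A byte left partially erased by an interrupted erase: bit 3 is still 0.
partially_erased = 0b11110111
target = 0b10101010          # bit 3 is supposed to remain erased (1)

result = program(partially_erased, target)
verify_passed = nand_style_verify(result, target)
data_correct = (result == target)
```

In this model `verify_passed` is true while `data_correct` is false: the stale non-erased bit corrupts the stored value, yet the check has nothing to report because no bit failed to reach the non-erased state.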
Additionally, even if all bits in the erased unit currently read as erased (that is, the unit looks fully erased), the power loss might have caused the erasure to be marginal and less reliable, so that in the long run the data in this unit will have degraded retention capability, resulting in accumulated errors. Therefore it is highly preferable not to rely on a unit whose erasure was interrupted by a power loss, but instead to erase the unit again before actually using it. Note that this last reason explains why even the trivial (but highly inefficient) method of always reading the full unit contents before using the unit, to verify that the unit is fully erased, is not a good enough solution.
The above problem is well known in the prior art and there are software solutions to it. One simple and common solution is to use an “erase mark” for detecting the interrupted erasure case. According to this solution, after performing any erase command, the software always writes a special signature (the “erase mark”) into a pre-defined location in the erased unit. Also, before performing any erase operation, the erase mark of the unit is overwritten in order to destroy it. Additionally, whenever a free unit is allocated for use, the first step is to check for the existence of the unit's erase mark. If the unit completed erasing normally, then the erase mark is there. But if the last erase operation on this unit was interrupted by a power loss, no erase mark exists. Therefore the software can determine whether the newly allocated unit was reliably erased and, if not, erase it again. This solution has been in use for quite a few years, for example in the TrueFFS family of flash disk drivers offered by M-Systems Flash Disk Pioneers Ltd. of Kfar Saba, Israel.
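The erase mark scheme described above can be sketched as a small simulation. This is only an illustrative model under stated assumptions, not the TrueFFS implementation: the signature value `ERASE_MARK`, its location `MARK_OFFSET`, the `Unit` class, and the modeling of an interrupted erase as "only the first N bytes were erased" are all hypothetical, as is the assumption that erased bytes read as 0xFF.

```python
ERASED = 0xFF                      # assumption: erased bytes read as 0xFF
ERASE_MARK = bytes([0x3C, 0x69])   # hypothetical signature value
MARK_OFFSET = 0                    # hypothetical pre-defined location


class Unit:
    """Toy model of one erase unit."""

    def __init__(self, size: int = 64):
        self.data = bytearray(size)          # starts non-erased (all zeros)

    def program(self, offset: int, payload: bytes) -> None:
        # Programming can only clear bits (1 -> 0), never set them.
        for i, b in enumerate(payload):
            self.data[offset + i] &= b

    def erase(self, interrupted_after=None) -> None:
        # A power loss is modeled by erasing only the first N bytes.
        n = len(self.data) if interrupted_after is None else interrupted_after
        for i in range(n):
            self.data[i] = ERASED


def has_erase_mark(unit: Unit) -> bool:
    end = MARK_OFFSET + len(ERASE_MARK)
    return bytes(unit.data[MARK_OFFSET:end]) == ERASE_MARK


def safe_erase(unit: Unit, power_fails_after=None) -> None:
    unit.program(MARK_OFFSET, bytes(len(ERASE_MARK)))   # destroy the old mark
    unit.erase(interrupted_after=power_fails_after)
    if power_fails_after is not None:
        return                                          # power lost: no mark written
    unit.program(MARK_OFFSET, ERASE_MARK)               # erase completed normally


def allocate(unit: Unit) -> Unit:
    # First step on allocation: check the mark, and erase again if it is missing.
    if not has_erase_mark(unit):
        safe_erase(unit)
    return unit


unit = Unit()
safe_erase(unit)                          # normal erase: mark present
safe_erase(unit, power_fails_after=10)    # interrupted erase: mark absent
interrupted_detected = not has_erase_mark(unit)
allocate(unit)                            # re-erased before use
```

Note that in this sketch the mark is written into a location that has already been programmed once when it is later destroyed, which is exactly the kind of repeated write to the same page that the scheme depends on.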
This solution requires the flash management system to make more than one write operation into some of the flash pages before having to erase them for further writing (as understood herein, a flash page is the smallest chunk of data that can be written in one operation into the physical media, with one or more pages in a unit). This is so because the erase mark is first written, and then at a later stage user data is written into that same page. Additional write operations might occur for destroying the erase mark, and for other steps taken by the flash management system for supporting its control algorithms.
Most flash memory devices in use today support such a capability of multiple writing (known in the technical flash literature as Partial Page Programming, or PPP for short). Typical PPP values currently range from 3 to 10. Recently, however, a few major flash memory device vendors announced that some of their forthcoming flash memory devices will no longer support a PPP capability greater than one, which means that the same page may not be written twice without first being erased. This restriction makes the prior art methods for detecting potentially partially-erased flash units unusable.
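The conflict between the erase mark scheme and a PPP limit of one can be illustrated with a minimal sketch. The `Page` class, its write counter, and the exception are hypothetical modeling devices; real devices do not necessarily enforce the limit by rejecting the write, and exceeding it may instead simply yield unreliable data.

```python
class PageWriteLimitExceeded(Exception):
    """Raised by the model when a page is written more times than PPP allows."""


class Page:
    """Toy model of a flash page that counts writes since the last erase."""

    def __init__(self, ppp_limit: int):
        self.ppp_limit = ppp_limit
        self.writes_since_erase = 0

    def erase(self) -> None:
        self.writes_since_erase = 0

    def write(self) -> None:
        if self.writes_since_erase >= self.ppp_limit:
            raise PageWriteLimitExceeded(
                f"page already written {self.writes_since_erase} time(s)"
            )
        self.writes_since_erase += 1


def erase_mark_then_user_data(page: Page) -> bool:
    """Return True if the page tolerates the two writes the scheme needs."""
    page.erase()
    try:
        page.write()   # first write: the erase mark
        page.write()   # second write, later: user data in the same page
    except PageWriteLimitExceeded:
        return False
    return True


works_with_ppp_3 = erase_mark_then_user_data(Page(ppp_limit=3))
works_with_ppp_1 = erase_mark_then_user_data(Page(ppp_limit=1))
```

In this model the sequence succeeds for a PPP value of 3 and fails for a PPP value of 1, before even counting the additional write needed to destroy the mark ahead of the next erase.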
There is thus a widely recognized need for, and it would be highly advantageous to have, a method of detecting incomplete erasure of a PPP=1 flash memory.