The present invention relates to a method of managing a flash memory and, more particularly, to a method, of managing a multi-level cell flash memory, that is resistant to data corruption when power is interrupted unexpectedly.
Flash memory is a form of EEPROM (electrically erasable programmable read-only memory) non-volatile memory. FIG. 1A is a high-level schematic block diagram of a generic flash-based data storage device 10 that is used by a host device (not shown) for storing data in one or more NAND flash media 12. The operation of device 10 is controlled by a microprocessor-based controller 14 with the help of a random access memory (RAM) 16 and an auxiliary non-volatile memory 18. For this purpose, flash device 10 and the host system communicate via a communication port 20 in flash device 10. Typically, for backwards compatibility with host devices whose operating systems are oriented towards block memory devices such as magnetic hard disks, flash device 10 emulates a block memory device, using firmware stored in auxiliary non-volatile memory 18 that implements flash management methods such as those taught by Ban in U.S. Pat. No. 5,404,485 and U.S. Pat. No. 5,937,425, both of which patents are incorporated by reference for all purposes as if fully set forth herein. The components of device 10 are housed together in a common housing 15.
Other devices that use NAND flash media to store data are known. FIG. 1B shows a personal computer 10′ in which NAND flash media 12 are used in addition to, or as a substitute for, a magnetic hard disk for long-term non-volatile data storage. Controller 14 now represents the central processing unit of personal computer 10′. Auxiliary non-volatile memory 18 now represents all of the other non-volatile memories of personal computer 10′, including a BIOS in which boot code is stored and a magnetic hard disk for storing the operating system, including the flash management system, of personal computer 10′ (unless NAND flash media 12 are a substitute for a magnetic hard disk, in which case the operating system is stored in NAND flash media 12). NAND flash media 12, controller 14, RAM 16, auxiliary non-volatile memory 18 and other components (not shown) of personal computer 10′ communicate with each other via a bus 19. In some configurations of personal computer 10′, NAND flash media 12 are on a removable card. In other configurations of personal computer 10′, the illustrated components are integrated in a single unitary physical device, so that NAND flash media 12 are not a physically separate entity.
The operations that controller 14 performs on NAND flash media 12 include read operations, write operations and erase operations. NAND Flash media 12 typically are written in units called “pages”, each of which typically includes between 512 bytes and 2048 bytes, and typically are erased in units called “blocks”, each of which typically includes between 16 and 64 pages. Note that the use of the word “block” to refer to the erasable units of NAND flash media 12 should not be confused with the use of the word “block” in the term “block memory device”. The “block” nature of a block memory device refers to the fact that the device driver exports an interface that exchanges data only in units that are integral multiples of a fixed-size unit that typically is called a “sector”.
To facilitate the management of NAND flash media 12, controller 14 assigns each page a status of “unwritten” or “written”. A page whose status is “unwritten” is a page that has not been written since the last time it was erased, and so is available for writing. A page whose status is “written” is a page to which data have been written and not yet erased. In some embodiments of device 10, controller 14 also assigns some pages a status of “deleted”. A page whose status is “deleted” is a page that contains invalid (typically superseded or out of date) data. In embodiments of device 10 that support “deleted” pages, the “written” status is reserved for pages that contain valid data. Herein, a page whose status is “unwritten” is called an “unwritten page”, a page whose status is “written” is called a “written page” and a page whose status is “deleted” is called a “deleted page”.
Because device 10 is used for non-volatile data storage, it is vital that device 10 retain the data written thereto under all circumstances. A major risk to the integrity of data stored in device 10 is a sudden power failure in which the power source to device 10 is interrupted with no prior notice while device 10 is in the middle of an operation. Often such a power failure causes the interrupted operation to have erratic or unpredictable results.
If the power failure occurs while device 10 is in the middle of an operation that changes the contents of NAND flash media 12, for example in the middle of writing a page of data or in the middle of erasing a block, the contents of the interrupted page or block are unpredictable after device 10 has been powered up again and indeed may be random. This is because some of the affected bits may have gotten to the state assigned to them by the operation by the time power was interrupted, while other bits were lagging behind and not yet at their target values. Furthermore, some bits might be caught in intermediate states, and thus be in an unreliable mode in which reading these bits will return different results in different read operations.
This problem is handled well by many prior art flash management software systems, for example the TrueFFS™ flash management system used by M-Systems Flash Disk Pioneers Ltd. of Kfar Saba, Israel. The reason that these prior art systems can defend against this problem is that the data corruption is localized to the page or block being modified when the power failure occurs. All the other pages in NAND flash media 12 keep their contents and do not become corrupted. Therefore, in the case of an interrupted write operation, prior art flash management systems can assume the validity of all other pages and concentrate on the last page written. There are several approaches that can be used.
One approach is to store a pointer, to the page to be written or to the block to be erased, in a predetermined location before the operation, so that when device 10 powers up again, controller 14 can look up this pointer and immediately know which page or block was the last one targeted. This method usually uses one or more validity flags that signal to controller 14 whether the operation completed successfully. See for example U.S. Pat. No. 6,977,847, which is incorporated by reference for all purposes as if fully set forth herein. That patent teaches an example of such a method for protecting against power loss during erasing.
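The pointer-and-flag approach can be sketched as follows. This is a minimal illustration only, not the method of U.S. Pat. No. 6,977,847: the class OperationLog and the function suspect_after_power_up are hypothetical names, and the "predetermined location" is modeled as an ordinary in-memory object rather than as non-volatile storage.

```python
from typing import Optional


class OperationLog:
    """Hypothetical stand-in for the predetermined non-volatile location."""

    def __init__(self) -> None:
        self.target: Optional[int] = None  # page or block last targeted
        self.completed: bool = True        # validity flag

    def begin(self, target: int) -> None:
        # Record the target BEFORE touching the flash media.
        self.target = target
        self.completed = False

    def finish(self) -> None:
        # Set the validity flag only after the operation has completed.
        self.completed = True


def suspect_after_power_up(log: OperationLog) -> Optional[int]:
    """Return the page/block that must be treated as suspect, if any."""
    return None if log.completed else log.target


log = OperationLog()
log.begin(17)                        # about to write page 17
# ... power fails here, so finish() is never called ...
print(suspect_after_power_up(log))   # 17
```

In a real device both the pointer and the flag would themselves have to be written in a way that survives a power loss, which this sketch does not attempt to model.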
Another approach is to limit the locations where data may be written at any given time to only a subset of the pages. Controller 14 then can consider all such locations as potentially corrupt, and can avoid using the data stored therein upon powering up. Alternatively, controller 14 can subject the data to a “validity test” before trusting them as not corrupted. An example of a flash management method to which this approach can be applied is taught in U.S. Pat. No. 6,678,785, which is incorporated by reference for all purposes as if fully set forth herein. According to U.S. Pat. No. 6,678,785, the writing algorithm is limited to writing new pages in sequential order within each block. Therefore, on power up it is known that the last page written in any given block was the highest numbered written page in that block.
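Under the sequential-writing restriction, locating the suspect page on power up reduces to a scan for the highest numbered written page of each block. A minimal sketch follows, assuming that the page statuses of a block are available as a list of strings; the function name last_written_page and the status strings are illustrative assumptions.

```python
from typing import List, Optional

UNWRITTEN, WRITTEN = "unwritten", "written"


def last_written_page(block_status: List[str]) -> Optional[int]:
    """Return the highest numbered written page of a block.

    With sequential writing this is the only page of the block that an
    interrupted write could have left corrupted. Returns None if the
    block contains no written page.
    """
    for page in range(len(block_status) - 1, -1, -1):
        if block_status[page] == WRITTEN:
            return page
    return None


block = [WRITTEN] * 5 + [UNWRITTEN] * 11
print(last_written_page(block))  # 4
```

The controller can then either discard the data in that page or subject the data to a validity test before trusting them.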
Other systems are not amenable to such shortcuts, and a brute force method of identifying suspect pages might be used. Nevertheless, the handling of the power-loss data corruption problem is made relatively easy by the knowledge that only the data in the last written page might have been corrupted and that the data stored in all the other pages are reliable.
It should be pointed out that the above discussion applies to the validity of pages as stand-alone entities. It is another question altogether whether the system as a whole is valid even if no page write was interrupted. Such problems can occur, for example, in file systems in which a user-level operation consists of several page-level operations. For example, the creation of a new file involves writing a directory entry, writing one or more sector allocation tables and only then writing the actual file data. If only some of these write operations are completed by the time power fails, while the remaining write operations have yet to begin, then no page is corrupted but the file system as a whole is corrupted. Methods for protecting against such problems are known (see for example co-pending U.S. patent application Ser. No. 10/397,378) but are beyond the scope of the present invention.
Recently, NAND flash media 12 have come into use for which the above assumptions about the locality of data corruption upon power loss are not valid. Examples of such NAND flash media 12 include the Multi-Level Cell (MLC) NAND flash devices of Toshiba (e.g. the TC58DVG04B1FT00). In such devices, each cell stores two bits rather than one bit. The internal arrangement of that device is such that a physical page resides within a group of 528×8=4224 cells. But while in other devices such a group of cells stores one page of 528 bytes, in the Toshiba MLC NAND flash devices such a group of cells stores two such pages of data. Such a group of cells, that stores two or more pages of data, is called a “superpage” herein.
FIG. 2 is a schematic illustration of a block 30 of one such MLC NAND flash device. Block 30 includes 64 pages 32, with respective logical addresses 0 through 63, in 32 superpages 34. The logical addresses of pages 32 are shown in FIG. 2 in a column on the left side of block 30.
Now consider the following sequence of events:
1. One of the pages 32 of a two-page superpage 34 is written successfully, with the other page not being written.
2. A write to the other page 32 of the two-page superpage 34 is interrupted by a power loss.
Because the two pages 32 of this superpage 34 share the same physical cells, the power loss could corrupt both pages 32. To understand how both pages 32 could be corrupted, it is necessary to consider how bits are encoded within an MLC flash cell. One method of encoding bits in MLC flash cells is taught by Harari in U.S. Pat. No. 5,095,344 and in U.S. Pat. No. 5,043,940. According to this method, bits are encoded in an MLC flash cell by injecting different amounts of electrical charge into the floating gate of a flash cell, thereby producing different levels of a threshold voltage VT1 of the cell. The following table shows the values of the two bits stored in the cell as a function of threshold voltage:
VT1        Value of bit 1    Value of bit 2
−3.0 V     1                 1
−0.5 V     1                 0
+2.0 V     0                 1
+4.5 V     0                 0
In practice, the four possible bit combinations of a two-bit flash cell are stored as four different threshold voltage ranges. In the above example, the threshold voltage ranges are +3.25V to +5.75V for (0,0), +0.75V to +3.25V for (0,1), −1.75V to +0.75V for (1,0) and −4.25V to −1.75V for (1,1). Because changing either one of the two bits involves changing the same physical attribute (i.e., the threshold voltage) of the cell, it is clear that the process of changing one bit shifts the other bit from its previously stable state. If the change does not complete correctly, it might result in a wrong interpretation for the value of either or both bits.
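The effect of an incomplete change can be illustrated with a short sketch that decodes a threshold voltage using the four ranges given above. The function name decode and the particular interruption scenario are assumptions made for illustration; the actual programming sequence of any given device may differ.

```python
from typing import Tuple

# Threshold-voltage ranges from the text: (low, high, (bit 1, bit 2)).
RANGES = [
    (-4.25, -1.75, (1, 1)),
    (-1.75, 0.75, (1, 0)),
    (0.75, 3.25, (0, 1)),
    (3.25, 5.75, (0, 0)),
]


def decode(vt: float) -> Tuple[int, int]:
    """Return (bit 1, bit 2) for a cell whose threshold voltage is vt volts."""
    for low, high, bits in RANGES:
        if low <= vt < high:
            return bits
    raise ValueError(f"threshold voltage {vt} V is outside all ranges")


# Illustrative scenario: the cell holds (1, 0) at -0.5 V, and bit 1 is being
# changed to 0, so the target state is (0, 0) at +4.5 V.
print(decode(-0.5))  # (1, 0)  state before the write
print(decode(4.5))   # (0, 0)  state if the write completes
print(decode(2.0))   # (0, 1)  state if power fails part of the way there
```

In the interrupted case the cell reads as (0,1): bit 2, which the write was never meant to change, is now read as 1 instead of 0.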
The true difficulty in defending against this problem arises because the two pages 32 of a superpage 34 might be written at two different, widely separated times. Conceivably, a first page 32 of a superpage 34, that was written at a certain time, could be corrupted many years later by an incomplete write to the second page 32 of that superpage 34. Moreover, most file systems that sit on top of flash management systems may allocate pages 32 to files either contiguously or noncontiguously, so that pages 32 of the same superpage 34 could belong to totally unrelated files. A power loss during the update of one file could corrupt a totally unrelated file that would not be suspected of being at risk. Obviously, these conditions are beyond the capability of prior art flash management systems to deal with.
U.S. Pat. No. 6,988,175, which is incorporated by reference for all purposes as if fully set forth herein, solves this problem of power interruptions by adopting a policy for storing incoming data only into pages whose writing does not put other unrelated previously written data in other pages at risk. The methods of U.S. Pat. No. 6,988,175 are based on defining “risk zones” of pages whose data could be corrupted by interrupted writes. When one or more pages are selected for writing new data, the risk zone(s) of the page(s) selected for that write operation is/are checked to see if any of the other pages in that/those risk zone(s) might be storing valid data, i.e., if the status of any of the other pages in that/those risk zone(s) is “written”. If any of the other pages in that/those risk zone(s) might in fact be storing valid data, then the selected page(s) is/are not written. Instead, the flash management system seeks a different page or pages for the write operation.
The risk zone of a page is defined in U.S. Pat. No. 6,988,175 as the set of other pages whose data are placed at risk of corruption when the page is written. For example, in FIG. 2, the risk zone of each page 32 is the other page 32 of that page 32's superpage 34. When one or more unwritten pages are selected for writing, the selected page or pages are written only if there are no written pages in any of their risk zones.
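The risk-zone check for the layout of FIG. 2 can be sketched as follows. This is an illustration under stated assumptions rather than the implementation of U.S. Pat. No. 6,988,175: the function names risk_zone and may_write are hypothetical, page statuses are modeled as a list of strings, and pages 2k and 2k+1 are assumed to share a superpage.

```python
from typing import Iterable, List, Set

PAGES_PER_SUPERPAGE = 2  # assumption: the two-page superpages of FIG. 2


def risk_zone(page: int) -> Set[int]:
    """Return the other pages of the superpage that contains `page`."""
    first = page - page % PAGES_PER_SUPERPAGE
    return set(range(first, first + PAGES_PER_SUPERPAGE)) - {page}


def may_write(selected: Iterable[int], status: List[str]) -> bool:
    """True only if every selected page is unwritten and no page in any
    of their risk zones is written."""
    selected = list(selected)
    if any(status[p] != "unwritten" for p in selected):
        return False
    return all(status[q] != "written" for p in selected for q in risk_zone(p))


status = ["unwritten"] * 64
status[0] = "written"
print(may_write([1], status))     # False: page 0 would be put at risk
print(may_write([2], status))     # True
print(may_write([2, 3], status))  # True: both pages belong to the same write
```

When may_write returns False, the flash management system seeks a different page or pages for the write operation.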
If the data to be written span more than one page, the targeted pages may be written either sequentially or in a random order. “Sequential” writing means that the pages of a block are written only in increasing logical address order, as in U.S. Pat. No. 6,678,785. “Random” writing means that the pages of a block may be written in any logical address order. The methods of both U.S. Pat. No. 6,988,175 and the present invention are equally applicable to both cases.
While the methods of U.S. Pat. No. 6,988,175 provide a solution to the problem of data corruption as a result of power interruption, they have two main disadvantages. The first disadvantage is that by avoiding writing into pages that are within the risk zones of previously written pages, we must skip those pages and leave them unused. This creates “holes” within the physical address space of the flash memory, where there are unused pages surrounded by written pages. For example, in the case of FIG. 2 (and assuming sequential writing), after the user first has written page 0, the next data to be written into that block are directed into page 2, leaving page 1 unused. Therefore page 1 becomes a “hole” between pages 0 and 2. The creation of holes during the writing of data into the flash memory wastes valuable space and complicates the flash management software, which must be ready to encounter these holes on reading and avoid interpreting them as containing valid data.
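The creation of holes can be reproduced with a small simulation. The sketch below assumes the layout of FIG. 2 with an even number of pages per block, sequential writing, and a series of unrelated write operations of one page each; the function name single_page_writes is hypothetical.

```python
from typing import List, Tuple


def single_page_writes(num_pages: int, num_writes: int) -> Tuple[List[int], List[int]]:
    """Simulate unrelated one-page writes, in sequential order, to one block.

    Assumes pages 2k and 2k+1 share a superpage, so that the partner of
    page p is p ^ 1. Returns (pages written, holes between written pages).
    """
    status = ["unwritten"] * num_pages
    written: List[int] = []
    candidate = 0
    for _ in range(num_writes):
        # Skip any page whose superpage partner already holds data.
        while candidate < num_pages and status[candidate ^ 1] == "written":
            candidate += 1
        if candidate >= num_pages:
            break  # block exhausted
        status[candidate] = "written"
        written.append(candidate)
        candidate += 1
    last = written[-1] if written else 0
    holes = [p for p in range(last) if status[p] == "unwritten"]
    return written, holes


written, holes = single_page_writes(num_pages=64, num_writes=4)
print(written)  # [0, 2, 4, 6]
print(holes)    # [1, 3, 5]
```

Under these assumptions only every other page of the block ever receives data, so a block of 64 pages accepts at most 32 such writes.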
A second disadvantage of the methods of U.S. Pat. No. 6,988,175 is their relative inefficiency in handling flash devices in which the arrangement of the risk zones is not as symmetric as in FIG. 2. In FIG. 2 the risk zone of page number 20 is page number 21, and the risk zone of page number 21 is page number 20. Thus, we can view the pages as if they are divided into disjoint groups, where members of a group may put each other at risk, but never put at risk pages outside their group. Pages 20 and 21 constitute one such group, and neither of them puts at risk any page outside their group. There are, however, flash memory devices where this is not the case: there are no “boundaries” that stop the “propagation of risk”, like the boundary we have in FIG. 2 between pages 21 and 22. In those devices, every page puts at risk at least the page following it, and some pages even place at risk additional pages having higher addresses. Such complex risk zone structures may be created when a multi-level cell flash device implements techniques for reducing or eliminating interference between adjacent word-lines of its array of flash cells, where such techniques affect the writing order of the pages. An example of such techniques is disclosed by Chen et al. in U.S. Pat. No. 6,522,580 entitled “Operating Techniques For Reducing Effects Of Coupling Between Storage Elements Of a Non-Volatile Memory Operated in Multiple Data States”, which patent is incorporated by reference for all purposes as if fully set forth herein. Applying the methods of U.S. Pat. No. 6,988,175 to such devices results in a highly inefficient utilization of the storage space: regardless of which page within the block is the target of our write operation, we must always skip at least one page and create a hole (except when starting to write at the very first page of a block).
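The difference between the two kinds of arrangement can be illustrated by grouping together all pages that are connected through any chain of risk relations. In the sketch below, the function name risk_groups is hypothetical, and the "chained" arrangement, in which every page puts at risk exactly the page following it, is a deliberately simplified model of the asymmetric devices described above rather than the layout of any particular device.

```python
from typing import Dict, List, Set


def risk_groups(risk_zone: Dict[int, Set[int]]) -> List[List[int]]:
    """Partition pages into groups connected by risk relations (union-find)."""
    parent = {p: p for p in risk_zone}

    def find(p: int) -> int:
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for page, zone in risk_zone.items():
        for other in zone:
            parent[find(page)] = find(other)

    groups: Dict[int, List[int]] = {}
    for p in risk_zone:
        groups.setdefault(find(p), []).append(p)
    return sorted(sorted(g) for g in groups.values())


NUM_PAGES = 64
# FIG. 2: pages 2k and 2k+1 put each other at risk.
paired = {p: {p ^ 1} for p in range(NUM_PAGES)}
# Simplified asymmetric model: every page puts at risk the page following it.
chained = {p: ({p + 1} if p + 1 < NUM_PAGES else set()) for p in range(NUM_PAGES)}

print(len(risk_groups(paired)))   # 32 disjoint groups of two pages each
print(len(risk_groups(chained)))  # 1 group spanning the whole block
```

In the paired arrangement the groups are bounded, whereas in the chained model no boundary stops the propagation of risk within the block.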
There is thus a need for, and it would be highly advantageous to have, an improved flash management system, capable of dealing with power interruptions to NAND flash media 12 that are based on multi-level cells, that is efficient for a broad class of flash devices.