Flash memory devices have been known for many years. Among flash memory devices, NAND-type memories differ from other types (e.g. NOR-type) in, among other characteristics, the fact that a certain number of the information bits written to the memory may be read back from the memory in a “flipped” state (i.e. different from the state in which the original bits were written to the memory).
In order to overcome this phenomenon and to make NAND-type memories usable by real applications, it is common to use an error-correction code (ECC) in conjunction with these memories. A general overview of the use of ECC in flash memories is given below, and includes the following steps:
(1) Before writing data to the memory, an ECC algorithm is applied to the data in order to compute additional (i.e. redundant) bits, which are later used for error detection and correction. These redundant bits are often called parity bits or parity. A combination of the data input into an ECC module and the parity output by that module is called a codeword. Each different value of input data to an ECC module results in a different codeword.
(2) The entire codeword (i.e. the original data and the parity) is recorded to the flash memory. It should be noted that the actual size of a NAND-type flash memory is larger than the size of the original data, because the memory is designed to accommodate the parity as well.
(3) When the data is being retrieved from the memory, the entire codeword is read again, and an ECC algorithm is applied to the data and the parity in order to detect and correct possible “bit flips” (i.e. errors).
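Steps (1) to (3) above can be sketched in a few lines. The text does not name a particular code, so the example below assumes a Hamming(7,4) code purely for illustration; the function names `encode` and `decode` and the list used as a stand-in for the flash memory are hypothetical.

```python
def encode(nibble):
    """Step (1): compute three parity bits for four data bits.

    Returns the 7-bit codeword as a list, laid out in positions 1..7 as
    p1 p2 d0 p3 d1 d2 d3 (the classic Hamming(7,4) arrangement).
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]
    p2 = d[0] ^ d[2] ^ d[3]
    p3 = d[1] ^ d[2] ^ d[3]
    return [p1, p2, d[0], p3, d[1], d[2], d[3]]


def decode(codeword):
    """Step (3): detect and correct at most one flipped bit.

    Returns (nibble, position), where position is the 1-based index of the
    corrected bit, or 0 if no error was detected.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 | (s2 << 1) | (s3 << 2)
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the bit the syndrome points at
    data_bits = [c[2], c[4], c[5], c[6]]
    nibble = sum(bit << i for i, bit in enumerate(data_bits))
    return nibble, syndrome


# Step (2): the whole codeword, data and parity, is what gets stored.
flash = encode(0b1011)
flash[4] ^= 1                 # simulate a single "bit flip" in the memory
data, corrected_at = decode(flash)
```

A practical NAND ECC operates on far larger codewords and corrects more than one error, but the write and read flow has the same shape.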
It should be noted that ECC may be implemented in hardware, in software, or in a combination of the two. Furthermore, ECC may be implemented within a memory device, a memory device controller, a host computer, or may be “distributed” among these components of a system.
Another well-known feature of flash memories is that data may only be programmed to the memory after the memory has been erased (i.e. data in the memory may not be overwritten in place, but must be erased and then written again). The erase operation is performed on relatively large portions of the memory (called erase blocks), and results in setting all the bits of the erased portion to a logic value of one. This means that following an erase operation of a block of a NAND-type memory device, all the pages of that block will contain 0xFF (i.e. hexadecimal FF) data in all their bytes.
If data is subsequently programmed to an erased page, the cells corresponding to data bits which have “zero-logic” (i.e. logic values of zero) will be programmed, while the cells corresponding to data bits which have “one-logic” (i.e. logic values of one) will remain in an “erased” state.
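The erase and program behavior described above can be modeled as follows. This is a minimal sketch: the page size and the names `erase` and `program` are assumptions made for illustration, and a page is represented as a Python `bytes` object.

```python
PAGE_SIZE = 4  # hypothetical, far smaller than a real NAND page


def erase(page_size=PAGE_SIZE):
    """An erased page holds 0xFF in every byte (all bits at logic one)."""
    return b"\xFF" * page_size


def program(page, data):
    """Programming can only move bits from one to zero.

    A bit that is already zero stays zero, and a one-bit in the data leaves
    the cell untouched, so the result is the bitwise AND of old and new.
    """
    return bytes(old & new for old, new in zip(page, data))


page = erase()
page = program(page, bytes([0x12, 0x34, 0x56, 0x78]))   # stores the data as-is
page = program(page, bytes([0xF0, 0xFF, 0xFF, 0xFF]))   # can only clear more bits
```

The second `program` call shows why overwriting is not possible without an erase: the stored byte becomes `0x12 & 0xF0`, not `0xF0`.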
A vast majority of ECC schemes used with NAND-type flash memory devices have “linear” behavior, which means that for a data word consisting of “all-zero” data bits, all the parity bits have zero-logic as well (i.e. a codeword of all-zero logic, where all the bits have zero-logic, is a legal codeword). However, many of these codes are not “symmetrical” (i.e. the “0xFF” codeword, which is a codeword with both “all-one” data bits and “all-one” parity bits, is not a legal codeword).
As a simple example of the situation mentioned above, one may consider a simple even parity added to a byte of data. While an all-zero codeword (i.e. 0x00 plus zero-parity) is legal, an all-one codeword (i.e. 0xFF plus one-parity) is illegal. This situation may create a logic problem for system implementation as follows. If the system attempts to read a page which happens to be erased, and to apply ECC to the page, the ECC will “detect” that the codeword is wrong and will try to correct the codeword. If the ECC succeeds in correcting the all-one data, incorrect information would be presented to the application.
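The even-parity example can be checked directly. In this sketch a codeword is taken to be one data byte plus one parity bit; the helper names are hypothetical.

```python
def even_parity_bit(byte):
    """Parity bit that makes the total number of one-bits even."""
    return bin(byte).count("1") & 1


def is_legal_codeword(byte, parity):
    """A (byte, parity) pair is legal if the stored parity matches."""
    return even_parity_bit(byte) == parity


all_zero_ok = is_legal_codeword(0x00, 0)   # all-zero codeword
all_one_ok = is_legal_codeword(0xFF, 1)    # what an erased location reads as
```

Here `all_zero_ok` is true while `all_one_ok` is false: 0xFF already contains an even number of one-bits, so its legal parity bit is zero, whereas an erased location returns a parity bit of one.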
One may wonder why the system would read erased pages. The reason is that when the system “wakes up” after a power interruption, the system has no a priori knowledge of the location of the data in the flash memory. Therefore, the system has to perform a search of the flash memory medium in order to locate the written data and to reconstruct its RAM-resident databases, which will then allow the system to access data on the flash memory in a quick and efficient way.
During such a search as mentioned above, erased pages may be read in the process. When these pages are read, they should be identified as having been erased in order to enable correct construction of the RAM tables.
It is clear from the above discussion that it would be beneficial to system performance if erased pages could be handled correctly by ECC. By “handled correctly”, it is meant that the ECC will not consider an erased page to have errors. Moreover, it would be beneficial if, even in the event that some erased bits of the erased page are accidentally flipped to a “programmed” state, which may occur in practical flash memory devices due to various “parasitic” phenomena, the ECC were to correct the affected bits and provide the system with “erased” (i.e. all 0xFF) data.
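The desired behavior can be stated as a small function. The sketch below is not the method of this document; it uses a plain zero-bit count with an assumed threshold, which is only one conceivable way to obtain that behavior. The name `read_as_erased` and the parameter `max_flips` are hypothetical.

```python
def read_as_erased(page, max_flips=2):
    """Return all-0xFF data if the page looks erased, otherwise None.

    A page is treated as erased when no more than max_flips of its bits read
    as zero, i.e. when only a few erased bits were accidentally flipped to
    the programmed state.
    """
    zero_bits = sum(8 - bin(byte).count("1") for byte in page)
    if zero_bits <= max_flips:
        return b"\xFF" * len(page)
    return None


clean = read_as_erased(bytes([0xFF, 0xFF, 0xFF, 0xFF]))
one_flip = read_as_erased(bytes([0xFF, 0xFB, 0xFF, 0xFF]))   # one stray zero bit
written = read_as_erased(bytes([0x00, 0xFF, 0xFF, 0xFF]))    # clearly programmed
```

In this model both `clean` and `one_flip` come back as all-0xFF data, while `written` is rejected and would be passed to the normal ECC path.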
In some flash memory devices, the erasure procedure actually consists of two stages: (1) all the cells in a block are programmed to the high voltage-level (i.e. zero state), and (2) only after this step has occurred, an erase voltage is applied to the block. This procedure removes the charge from the cells, and converts the cells to the erased state.
The reason for such a two-stage process is to attempt to make all the cells in a block go through the same history of programming and erasing, which ensures that all cells in a block have relatively the same wear effects. In addition, this two-stage erasure procedure helps to make the voltage distributions of the cells narrower, which results in more reliable programming.
If the device power is interrupted following the first stage of such an erasure operation (i.e. following programming of all the cells to a zero state), the pages of the block will remain programmed, and will be read upon power restoration as all-zero data. In this case, the ECC will report the data of 0x00 for the entire page as correct, because the all-zero codeword is legal. This may result in the flash-memory management algorithm, which attempts to reconstruct the flash memory database, being misled.
Although the probability of the occurrence of such an event is not high (because power interruption would have to occur immediately following completion of the first stage of the erasure operation, but prior to initiation of the second stage), it would be beneficial for the system to have an “operation error” indication for this scenario.
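The interrupted-erase scenario can be simulated with a toy model. The sketch assumes pages of four data bytes followed by one parity byte equal to the XOR of the data bytes, a simple linear check chosen only for illustration; the names `two_stage_erase` and `check_passes` are hypothetical.

```python
DATA_BYTES = 4
PAGE_BYTES = DATA_BYTES + 1  # data plus one XOR parity byte


def two_stage_erase(block, power_lost_after_stage_one=False):
    """Erase a block (a list of pages) in two stages."""
    # Stage 1: program every cell to the zero state.
    block = [bytes(len(page)) for page in block]
    if power_lost_after_stage_one:
        return block  # power was interrupted before the erase voltage
    # Stage 2: apply the erase voltage, leaving every byte at 0xFF.
    return [b"\xFF" * len(page) for page in block]


def check_passes(page):
    """True if the stored parity byte equals the XOR of the data bytes."""
    xor = 0
    for byte in page[:DATA_BYTES]:
        xor ^= byte
    return xor == page[DATA_BYTES]


block = [bytes([1, 2, 3, 4, 4]), bytes([9, 9, 9, 9, 0])]
interrupted = two_stage_erase(block, power_lost_after_stage_one=True)
completed = two_stage_erase(block)
```

Every page of `interrupted` passes the check although it holds no valid data, while every page of `completed` fails it, which mirrors the two problems described above.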