The present invention relates generally to computer memory, and more specifically to bad block management in not-and (NAND) flash memory.
Phase-change memories (PCMs) and flash memories are examples of non-volatile memories with limited endurance (also referred to as a “limited life”). Such memories have limited endurance in the sense that after undergoing a number of writing cycles (RESET cycles for PCM, program/erase cycles for flash memory), the memory cells wear out and may no longer be able to reliably store information. In addition, flash memory may be affected by errors in surrounding pages that are introduced while writing data to a page. These types of errors are referred to as disturbance errors.
Contemporary NAND flash memory devices do not support page-level erases. The absence of page erases implies that once a page is written, it cannot be rewritten until the entire block (e.g., made up of sixty-four pages) is erased. If a logical address corresponding to a page requires refreshing, this is accomplished by marking the page as invalid and mapping the logical address to a different physical page. Disturbance errors, however, may cause the bits of the erased pages to appear to be written (e.g., changed from ‘1’ to ‘0’ for single-level cell flash). Because individual pages cannot be erased, disturbance errors in blank pages may cause faulty values in data that is subsequently written to those pages.
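The out-of-place update described above (mark the old page invalid, map the logical address to a different physical page, and erase only whole blocks) may be illustrated by the following minimal Python sketch. The class name TinyFlash, its fields, and the first-free-page allocation policy are hypothetical and chosen only for illustration; the sixty-four-page block size is the example value given above. The sketch does not model disturbance errors.

```python
PAGES_PER_BLOCK = 64  # example block size from the text


class TinyFlash:
    """Toy model (hypothetical): page-granular writes, block-granular erases."""

    def __init__(self, num_blocks):
        total = num_blocks * PAGES_PER_BLOCK
        self.state = ["erased"] * total  # per physical page: erased / valid / invalid
        self.data = [None] * total
        self.l2p = {}                    # logical address -> physical page

    def write(self, logical, payload):
        # A written page cannot be rewritten in place, so take the first erased page.
        new_page = self.state.index("erased")  # raises ValueError if none is left
        old_page = self.l2p.get(logical)
        if old_page is not None:
            self.state[old_page] = "invalid"   # stale copy waits for a block erase
        self.state[new_page] = "valid"
        self.data[new_page] = payload
        self.l2p[logical] = new_page
        return new_page

    def erase_block(self, block):
        # Erase acts on a whole block; refuse if live data would be lost.
        start = block * PAGES_PER_BLOCK
        pages = range(start, start + PAGES_PER_BLOCK)
        if any(self.state[p] == "valid" for p in pages):
            raise RuntimeError("block still holds valid pages")
        for p in pages:
            self.state[p] = "erased"
            self.data[p] = None
```

In this sketch, writing the same logical address twice leaves the first physical page marked invalid, and that page becomes reusable only after every valid page in its block has been relocated and the block has been erased.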
In addition, disturbance errors in memory may affect previously written pages by flipping bits from the programmed value to a new value. Typically these errors are undetectable and only manifest themselves once the data is read from memory.
Cells in NAND flash (also referred to as memory) suffer from the problem of wear, wherein the cell tunnel oxide becomes increasingly defective with program/erase cycles and the associated charge flow through the oxide is altered. As a result of this degradation of the cell tunnel oxide, some cells may become unable to hold a charge, or may be able to hold a charge only for a short retention time. When enough defective or low-retention cells exist in a block of memory, the error rate in the block may become so high that data written to the block cannot be stored with the required level of reliability. Due to physical non-uniformities in the memory, different blocks may go “bad” at different times. Certain blocks of memory may even be bad when the device is new.
Conventional solutions to the problem of managing bad blocks involve the detection of bad blocks using various statistics such as block bit error rate (BER), followed by decommissioning blocks which do not meet reliability requirements, and in some cases, replacing them with spare blocks.
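A conventional scheme of this kind may be sketched in Python as follows. This is only an illustrative sketch: the function names, the per-block statistics format, and the BER limit are assumptions introduced here and are not taken from any particular device or controller.

```python
def block_ber(bit_errors, bits_read):
    """Observed bit error rate of a block; 0.0 if nothing has been read yet."""
    return bit_errors / bits_read if bits_read else 0.0


def retire_bad_blocks(stats, spares, ber_limit):
    """Decommission blocks whose BER exceeds ber_limit (hypothetical policy).

    stats:  dict mapping block id -> (bit_errors, bits_read)
    spares: list of spare block ids available as replacements
    Returns (retired, remap), where remap maps a retired block to its spare.
    """
    spares = list(spares)  # copy so the caller's list is not modified
    retired = []
    remap = {}
    for block, (errors, bits) in sorted(stats.items()):
        if block_ber(errors, bits) > ber_limit:
            retired.append(block)
            if spares:     # replace only while spare blocks remain
                remap[block] = spares.pop(0)
    return retired, remap
```

For example, with an assumed limit of 1e-4 and a single spare block, a block showing 500 errors in one million bits read would be retired and replaced, while a later failing block would be retired without replacement.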