1. Field of the Invention
Embodiments of the invention generally relate to electronics, and in particular, to memory controllers, such as to solid-state drive memory controllers.
2. Description of the Related Art
Flash memory is a form of non-volatile memory. A memory cell in flash memory can be a single-level cell (SLC), which encodes one bit of information per cell, or a multi-level cell (MLC), which encodes two or more bits of information per memory cell. Typically, a flash memory implementation using MLC is much cheaper than a flash memory implementation with SLC. Further, a flash memory device is arranged into pages and blocks. Data can be written to and read from flash memory in pages. A group of pages known as a block corresponds to the smallest erasable unit of flash memory.
Over time, programming and erasing flash memory causes a variety of defects that degrade the performance of the flash memory cell. In particular, MLC memory cells have much lower program/erase cycle lifetimes than SLC memory cells, which can be a problem in an enterprise application. This degradation, along with other noise effects, cause the signal-to-noise ratio of the memory cell to change over time. After the signal-to-noise ratio has fallen to a certain level, the flash memory device is typically no longer reliable. Manufacturers typically specify a number of program/erase cycles over which the properties of their flash devices are guaranteed.
As flash memory technologies become denser with decreasing process technology, the amount of charge stored on a floating gate of a memory cell tends to fall, crosstalk between cells tends to rise, insulation material between memory cells become thinner, and so on. Taken together, these effects tend to cause the signal-to-noise ratio of flash memory to decrease with each passing generation.
Flash memory devices require the use of a form of Error Correction Coding (ECC) to detect and correct the errors that inevitably occur. ECC, and in particular the ubiquitous Reed-Solomon or the Bose, Chaudhuri, and Hocquenghem (BCH) hard-decision codes, are widely used in electronics as a way of mitigating low signal-to-noise ratio in communications and storage media. With ECC, redundant information is stored or transmitted alongside the regular information bearing data, to permit an ECC decoder to deduce the originally transmitted or stored information even in the presence of errors.
A conventional approach to error management for MLC flash memories has been for the flash memory chip manufacturer to specify a particular strength of ECC code, typically a one-dimensional BCH code, capable of correcting a certain number of bits per certain size of sector, for example 24 bits per 1024 bytes. Examples of allocations of flash bytes in a typical flash page using vendor-specified BCH are shown in FIGS. 1A and 1B. FIG. 1A illustrates a standard layout for a page of flash memory. A region is provided for user data 101 or information data in the main area of the flash page, and a spare area 102 is provided to provide room for ECC parity and other data. In practice, the user data 103, metadata 104, such as data integrity feature (DIF) and Journaling Engine management data, and the manufacturer's recommended amount of ECC parity data 105 may be stored interleaved within the page as illustrated in FIG. 1B.
So long as the specified error correction is provided by the flash controller, the flash memory chip manufacturer guarantees a certain number of Program/Erase (P/E) cycles over which the flash memory chips will store and retain data, with no more errors than the ECC can correct, with a probability of uncorrectable errors occurring less than some acceptable risk for the end user. For example, consumer grade flash-based drives may tolerate a relatively high uncorrectable error rate. However, in an enterprise storage environment, a relatively low uncorrectable error rate is applicable, for example, 1×10−16 (1E-16).
However, conventional approaches of applying ECC to flash memory can be inefficient at achieving relatively low uncorrectable error rates over the total service life and traffic loads that can be required in an enterprise storage environment, such as in a server.