Data bits in NAND memories are stored as charge injected into the floating gates of MOSFET cells. The NAND architecture tolerates a certain percentage of bad blocks, whether they result from imperfections in the manufacturing process or develop during use. A block is marked invalid when a memory location turns bad, which can occur due to a number of factors (e.g., write or program disturb, read disturb, endurance failure, or loss of gate charge). Because memory errors are a fact of life with NAND memories, Error Correction Codes (ECC) are widely used for error detection and correction. Data retention, however, is of primary concern in high-temperature applications, where errors that common ECC algorithms cannot correct may develop. Manufacturer-specified data retention has been quoted as 10 years at 25° C., but the discharge rate is temperature-dependent, since charge loss is a physical phenomenon governed by the Arrhenius equation given by:
$$AF = e^{-\frac{E_a}{k}\left[\frac{1}{T_2} - \frac{1}{T_1}\right]}$$
where AF=Acceleration factor, Ea=Activation energy (0.6 eV for data retention), k=Boltzmann's constant (8.617×10−5 eV/K), T1=Application junction temperature in Kelvin, and T2=Accelerated stress junction temperature in Kelvin. Dividing the retention time quoted at T1 by AF gives the expected retention time at T2. The equation gives the temperature behavior in the table below, clearly showing a steep decrease in reliability of the data stored in the memory as temperature rises.
Temperature in deg C.    Retention time in months
25                       120
30                       82
35                       56
40                       39
45                       28
50                       20
55                       14
60                       10
65                       8
70                       6
This may cause data corruption beyond the capability of an embedded ECC algorithm, leading to catastrophic system failure. Values for the Arrhenius acceleration factor show that this is likely to happen in less than one year if an appliance is exposed to an environment in which the temperature of the memory's silicon die (the junction temperature in the equation) reaches about 60° C.
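As a rough check on the figures above, the Arrhenius relation can be evaluated numerically. The sketch below is illustrative only: the function names are hypothetical, it assumes the quoted 10 years (120 months) at 25° C. as the reference point, converts Celsius to Kelvin with an offset of 273.15, and uses 8.617×10−5 eV/K for Boltzmann's constant.

```python
import math

EA_EV = 0.6                   # activation energy for data retention, in eV (from the text)
K_EV_PER_K = 8.617e-5         # Boltzmann's constant in eV/K
T_REF_C = 25.0                # assumed reference temperature for the quoted retention
RETENTION_REF_MONTHS = 120.0  # 10 years at 25 deg C (from the text)


def acceleration_factor(t_app_c, t_stress_c, ea=EA_EV, k=K_EV_PER_K):
    """AF = exp(-(Ea/k) * (1/T2 - 1/T1)), with T1 and T2 in Kelvin."""
    t1 = t_app_c + 273.15     # application junction temperature
    t2 = t_stress_c + 273.15  # accelerated stress junction temperature
    return math.exp(-(ea / k) * (1.0 / t2 - 1.0 / t1))


def retention_months(temp_c):
    """Expected retention at temp_c, scaled from the 25 deg C reference."""
    return RETENTION_REF_MONTHS / acceleration_factor(T_REF_C, temp_c)


# Print the retention estimate for 25 to 70 deg C in 5-degree steps.
for temp_c in range(25, 75, 5):
    print(f"{temp_c:3d} deg C  {retention_months(temp_c):6.1f} months")
```

Rounded to whole months, these estimates agree with the tabulated values, including roughly 10 months at a junction temperature of 60° C., which is consistent with the sub-one-year figure stated above.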
Existing solutions rely on error correcting code (ECC) algorithms that can detect and correct a small number of bit-flips per page. Correcting a higher number of bit-flips requires more complex ECC algorithms, which are usually implemented in software. Because an ECC calculation is necessary on every NAND access, software ECC algorithms reduce the overall availability of the CPU. Moreover, they do not prevent data retention loss if they correct the errors during reads without rewriting the page. Thus, data retention loss will eventually develop in the long run as a result of the physical characteristics of the NAND device and the operating temperature.
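The limitation described above can be illustrated with a toy model. This is a sketch under stated assumptions, not a description of any real ECC implementation: the function name and all parameters are hypothetical, charge loss is modeled as a fixed number of new bit-flips per period, and the ECC is abstracted as being able to correct up to t flipped bits per page.

```python
def first_uncorrectable_period(periods, flips_per_period, t, rewrite_on_read):
    """Return the first period in which a read is uncorrectable, or None.

    Hypothetical model: the page is read once per period, and the ECC
    can correct at most t bit errors in the stored page.
    """
    stored_errors = 0  # bit errors currently present in the stored page
    for period in range(1, periods + 1):
        stored_errors += flips_per_period  # charge loss adds new bit-flips
        if stored_errors > t:              # beyond the ECC correction capability
            return period
        # The read returns corrected data in both cases, but only
        # rewriting the page clears the errors held in the cells.
        if rewrite_on_read:
            stored_errors = 0
    return None


# One new bit-flip per period, ECC correcting up to 4 bits per page.
print(first_uncorrectable_period(12, 1, 4, rewrite_on_read=False))
print(first_uncorrectable_period(12, 1, 4, rewrite_on_read=True))
```

With these made-up numbers, correcting on read alone lets the stored errors accumulate until they exceed the correction capability in the fifth period, whereas rewriting the corrected page on each read keeps the stored error count within what the ECC can handle for the whole run.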