A solid state drive (SSD) is a data storage device that utilizes solid-state memory to retain data in nonvolatile memory chips. NAND-based flash memories are widely used as the solid-state memory storage in SSDs due to their compactness, low power consumption, low cost, high data throughput and reliability. SSDs commonly employ several NAND-based flash memory chips and a flash controller to manage the flash memory and to transfer data between the flash memory and a host computer.
While NAND-based flash memories are reliable, they are not inherently error-free and often rely on error correction coding (ECC) to correct raw bit errors in the stored data. Various mechanisms may lead to bit errors in flash memories, including noise at the power rails, voltage threshold disturbances during the reading and/or writing of neighboring cells, and retention loss due to leakage within the cells and tunneling. ECC is commonly employed in flash memories to recover stored data that is affected by such error mechanisms. In operation, ECC supplements the user data with parity bits which store enough extra information for the data to be reconstructed if one or more of the data bits are corrupted. Generally, the number of data bit errors detectable and correctable in the data increases with an increasing number of parity bits in the ECC. In many memory devices, data is stored in a memory location of the memory device along with the ECC for the data. In this way, the data and the ECC may be written to the memory location in a single write memory operation and read from the memory location in a single read memory operation. ECC is typically implemented in the flash memory controller.
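By way of illustration only, the role of parity bits may be sketched with a Hamming(7,4) code, which adds three parity bits to four data bits and corrects any single bit error. This code is chosen here for brevity and is an assumption of the example; it is far weaker than the codes used in flash controllers, and the function names are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = data
    # Each parity bit covers a different subset of the data bits.
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    # Codeword layout by position 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit.

    Returns (data_bits, error_position), where error_position is 0
    when no error was detected and 1..7 otherwise.
    """
    c = list(codeword)
    # Recompute each parity check over the positions it covers.
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]  # positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]  # positions 2, 3, 6, 7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]  # positions 4, 5, 6, 7
    # The syndrome, read as a binary number, is the failing position.
    position = s1 + 2 * s2 + 4 * s3
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], position
```

Flipping any one of the seven stored bits and calling hamming74_decode recovers the original four data bits. Two or more flipped bits exceed the capability of this small code, which is why stronger codes with more parity bits are needed as the raw error rate rises.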
NAND flash memories are based on floating gate storage. In floating gate storage technologies, two logic states are achieved by altering the number of electrons within the floating gate. The difference between the two logic states (1 and 0) is on the order of a few electrons and is decreasing as the floating gate storage technology advances. The decreasing number of electrons responsible for the difference between the two logic states results in an increased probability of errors in the flash memory cell, which in turn requires more error correction. The fraction of data bits that are corrupted, and therefore contain incorrect data, before applying the ECC is referred to as the raw bit error rate (RBER). As a result of the advances in the floating gate storage technology, the RBER for a flash page of memory cells is increasing and, at technologies with feature sizes in the 1× range (below 20 nm), is nearing the Shannon limit of the communication channel. The increased probability of errors in the stored data results in an increase in the error correction coding necessary to correct the bit errors in the flash memory.
The error rate observed after application of the ECC is referred to as the uncorrectable bit error rate (UBER). The acceptable UBER is often dependent upon the application in which the SSD is employed. In the case of price-sensitive consumer applications, which experience a relatively low number of memory accesses during the SSD product lifetime, the SSD may tolerate a higher UBER than in a high-end application experiencing a relatively high number of memory accesses, such as an enterprise application.
One type of error correction coding often employed in a flash storage controller is Bose-Chaudhuri-Hocquenghem (BCH) error correction. Typically, a target UBER for an SSD ranges between 10⁻¹⁵ and 10⁻¹⁶, and the BCH error correction capability is chosen based upon this target UBER. However, due to the increased RBER of the NAND-based flash memory technology, the BCH error correction currently employed in the art for the recovery of data errors in a NAND-based flash memory is impractical for meeting the target UBER.
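The relationship between RBER, correction capability and UBER may be illustrated with a rough estimate. The sketch below assumes independent bit errors and a decoder that fails whenever more than t bits in an n-bit codeword are corrupted, and it approximates UBER as the codeword failure probability divided by the number of user data bits. These modeling choices, the function names and any parameter values used with it are assumptions of the example, not properties of a particular controller.

```python
import math


def log_binom_pmf(n, k, p):
    """Natural log of P(exactly k of n bits are in error), for 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def codeword_failure_prob(n, t, p):
    """P(more than t of n bits are in error) under independent errors."""
    total = 0.0
    for k in range(t + 1, n + 1):
        term = math.exp(log_binom_pmf(n, k, p))
        total += term
        # Past the mean the terms shrink monotonically, so stop once
        # further terms can no longer change the sum.
        if k > n * p and term < total * 1e-18:
            break
    return total


def estimate_uber(n, k_data, t, rber):
    """Approximate UBER for a t-error-correcting code of length n
    protecting k_data user bits."""
    return codeword_failure_prob(n, t, rber) / k_data
```

For instance, a hypothetical codeword of 8752 bits protecting 8192 user bits could be evaluated with estimate_uber(8752, 8192, 40, 1e-3) and again with a larger RBER. The estimate rises steeply as the RBER grows, so holding a fixed target requires a larger t and therefore more parity.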
Another type of error correction coding that may be employed in a flash storage controller is low-density parity-check (LDPC) error correction coding. An LDPC code is a linear error correcting code having a parity check matrix with a small number of nonzero elements in each row and column. LDPC codes are capacity-approaching codes that allow the noise threshold to be set very close to the Shannon limit for a symmetric, memoryless channel. The noise threshold defines an upper bound for the channel noise, up to which the probability of lost information can be made as small as desired. LDPC error correction is superior to BCH error correction, with LDPC codes being capable of producing a UBER that is very near the Shannon limit with a lower code rate than is required using BCH error correction. However, LDPC codes may exhibit an error floor that limits the performance of the LDPC error correction. While it is known that the UBER steadily decreases as the signal-to-noise ratio condition of the channel improves, for LDPC codes there exists a point after which the rate of decrease in the UBER flattens. This region is commonly referred to as the error floor region for LDPC error correction. To guarantee a target UBER of between 10⁻¹⁵ and 10⁻¹⁶ with LDPC error correction, it is necessary to know the value of the error floor. The error floor for LDPC cannot be mathematically determined, and simulation is necessary to identify the value of the error floor. However, with modern technology, it is not possible to simulate down to a UBER of 10⁻¹⁶ to identify the value of the error floor, and as such, a target UBER of 10⁻¹⁶ cannot be guaranteed with LDPC error correction.
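The impracticality of simulating to such a low error rate follows from simple arithmetic, sketched below. The sketch assumes that a statistically meaningful estimate requires on the order of 100 observed bit errors and that the simulated decoder sustains 1 Gbit/s. Both figures and the function name are assumptions chosen for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def simulation_cost(target_uber, min_error_events=100,
                    decoder_throughput_bps=1e9):
    """Return (bits_to_simulate, years) needed to observe roughly
    min_error_events bit errors at the given UBER."""
    # On average one error appears every 1 / target_uber bits.
    bits_needed = min_error_events / target_uber
    seconds = bits_needed / decoder_throughput_bps
    return bits_needed, seconds / SECONDS_PER_YEAR
```

Under these assumptions, simulation_cost(1e-16) calls for about 10¹⁸ simulated bits, or roughly three decades of continuous decoding. Faster or parallel simulation shortens this time, but the bit count itself scales inversely with the target UBER.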
Accordingly, what is needed in the art is an improved flash controller that is capable of meeting the target UBER for a nonvolatile memory storage system in the presence of an error correction code error floor.