Data integrity is an important feature of any data storage device and of any data transmission. Strong error-correction codes (ECCs) are recommended for various types of data storage devices, including NAND flash memory devices. ECCs are also frequently used during data transmission.
An error-correcting code (ECC) is a code that adds redundant data, or parity data, to a message so that a receiver can recover the message even when a number of errors are introduced during transmission or storage. In general, an ECC can correct errors up to the capability of the code being used. A low-density parity-check (LDPC) code is one example of an ECC.
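As a minimal illustration of how parity data lets a receiver recover a message, the following sketch uses a Hamming(7,4) code, chosen here only because it is small enough to read at a glance. It is a stand-in for illustration and is not an LDPC code. The function names `hamming74_encode` and `hamming74_decode` are hypothetical. The code corrects any single bit error per codeword, which is the limit of its capability.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword by adding 3 parity bits."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    # Codeword layout, positions 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    # Each syndrome bit re-checks one parity group.
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # Read as a binary number, the syndrome is the 1-based position
    # of the flipped bit, or 0 when every parity check passes.
    error_position = s1 + 2 * s2 + 4 * s3
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


message = [1, 0, 1, 1]
received = hamming74_encode(message)
received[4] ^= 1  # one bit error introduced in transmission or storage
recovered = hamming74_decode(received)
```

Here `recovered` equals `message` despite the flipped bit. A second flipped bit in the same codeword would exceed the capability of this code, and the decoder would return incorrect data.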
In a data storage device, such as a NAND flash memory device, data can be written to and read from wordlines of the data storage device. Wordline failures can be common in NAND flash memory, and existing systems use various techniques to handle them. Typically, the data bits are decoded with a decoder, such as an LDPC decoder. If the decoding fails, chip-kill is used. Chip-kill refers to an ECC computer memory technology that protects against memory failures.
In existing systems, chip-kill involves an XOR over all data in a superblock. However, if two wordlines fail, recovering the data from hard information obtained from the channel becomes challenging. In this scenario, soft information can be obtained from the channel, and bits at locations in the failed wordlines can be flipped where the soft information gives a strong indication of the bit value. Errors at locations with weaker soft information can then be corrected by the LDPC decoder. However, when two wordlines fail due to a physical defect and no channel information can be obtained, the failure cannot be corrected using the existing chip-kill schemes. The decoding simply fails, and data written to the data storage device may not be recoverable.
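The limitation described above can be seen in a small sketch of XOR-based recovery. The sketch assumes a simplified model in which a superblock is a list of equal-length wordlines plus one XOR parity page, and a failed wordline with no channel information is marked `None`. The helper names `xor_pages` and `recover_single` are hypothetical and do not represent any particular device's implementation.

```python
from functools import reduce


def xor_pages(pages):
    """Return the bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*pages))


def recover_single(wordlines, parity):
    """Rebuild the single failed wordline (marked None) from survivors and parity.

    Raises ValueError unless exactly one wordline is missing, because one
    XOR parity page yields only one equation per byte position.
    """
    missing = [i for i, w in enumerate(wordlines) if w is None]
    if len(missing) != 1:
        raise ValueError(
            f"XOR parity recovers exactly one failed wordline, got {len(missing)}"
        )
    survivors = [w for w in wordlines if w is not None]
    return missing[0], xor_pages(survivors + [parity])


# Three toy wordlines of two bytes each, and their XOR parity.
wordlines = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_pages(wordlines)

# One wordline lost: the XOR of the survivors and the parity restores it.
index, rebuilt = recover_single([wordlines[0], None, wordlines[2]], parity)
```

With one failed wordline, `rebuilt` matches the original data. With two wordlines marked `None`, `recover_single` raises `ValueError`: the parity then gives only the XOR of the two missing wordlines, which is not enough to determine either one without additional information.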