A solid-state drive (SSD) is composed of the storage media, the controller, and other peripheral components. The storage media used in SSDs is NAND flash memory. A flash memory die or plane may fail due to various causes, such as manufacturing defects, voltage pump failure, or a short in the NAND circuitry. Overall, the failure rate of a flash memory die/plane is very low, on the order of parts per million. However, the damaging effect of a die/plane failure is significant for an SSD because the capacity loss, the potential data loss, and/or the functionality loss is large. Assuming one plane contains 1024 blocks and each block has a capacity of 4 MB, a plane failure leads to a 4 GB capacity loss. A dual-plane die failure leads to an 8 GB capacity loss. In a low-end SSD with 100 GB user capacity, such capacity loss nearly renders the drive defective. As memory density increases, the potential loss of capacity due to a die/plane failure also increases. In an enterprise SSD, such capacity loss leads to less over-provisioning, which has a negative impact on performance. Without any dedicated die failure detection or handling algorithms, a die failure can be treated as multiple block failures. The SSD would then have to rely on bad block detection and handling, and may take a long time to detect and handle the bad blocks in the failed die/plane individually.
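The capacity figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch only: the block count, block size, dual-plane layout, and 100 GB user capacity come from the example in the text, while the function names and the convention 1 GB = 1024 MB are assumptions made for illustration.

```python
# Example geometry taken from the text above.
BLOCKS_PER_PLANE = 1024   # blocks in one plane
BLOCK_SIZE_MB = 4         # capacity of one block, in MB
PLANES_PER_DIE = 2        # dual-plane die
USER_CAPACITY_GB = 100    # low-end SSD example

MB_PER_GB = 1024          # assumption: binary convention, 1 GB = 1024 MB


def plane_loss_gb(blocks_per_plane=BLOCKS_PER_PLANE, block_size_mb=BLOCK_SIZE_MB):
    """Capacity lost when a single plane fails, in GB (hypothetical helper)."""
    return blocks_per_plane * block_size_mb / MB_PER_GB


def die_loss_gb(planes_per_die=PLANES_PER_DIE):
    """Capacity lost when a whole die fails, in GB (hypothetical helper)."""
    return planes_per_die * plane_loss_gb()


def loss_fraction(lost_gb, user_capacity_gb=USER_CAPACITY_GB):
    """Lost capacity as a fraction of the drive's user capacity."""
    return lost_gb / user_capacity_gb


print(f"plane failure: {plane_loss_gb():.1f} GB")
print(f"die failure:   {die_loss_gb():.1f} GB")
print(f"die failure as share of user capacity: {loss_fraction(die_loss_gb()):.1%}")
```

Under these assumptions, a single plane failure removes 4 GB and a dual-plane die failure removes 8 GB, which is 8% of a 100 GB drive.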
It would be desirable to implement die failure, plane failure, and/or memory unit failure detection in an SSD controller and/or a drive.