A typical SSD (solid state drive, device, or disk), such as a flash-memory SSD, stores data in blocks. Each block contains some number of pages. An SSD is addressed linearly using logical block addresses (LBAs). A mapping table maps logical or virtual addresses to physical addresses. In effect, the mapping table translates the address specified in a request for data into the physical location of that data on the SSD.
When an existing version of data stored in a block of an SSD is updated, the new (updated) version of the data is written to a different block, and the old (existing) version of the data is left unchanged in the first block. The mapping table is updated when the new version of the data is stored, so that the proper location of the current (most recent) version of the data can be correctly identified. The data in the first block will remain there until it is erased and/or replaced with other new data, which might not occur for some period of time.
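The out-of-place update described above can be sketched as follows. This is a minimal illustration, not a model of any particular SSD: the class name `TinyFTL`, its fields, and the naive "next free block" allocator are assumptions introduced for the example, and garbage collection is not modeled.

```python
class TinyFTL:
    """Toy translation layer: maps a logical address to a physical block."""

    def __init__(self, num_blocks):
        self.blocks = [None] * num_blocks  # physical blocks; None means never written
        self.mapping = {}                  # logical address -> physical block index
        self.next_free = 0                 # hypothetical allocator: next unused block

    def write(self, lba, data):
        # Out-of-place update: the new version always goes to a fresh block.
        phys = self.next_free
        if phys >= len(self.blocks):
            raise RuntimeError("no free blocks (garbage collection not modeled)")
        self.blocks[phys] = data
        self.next_free += 1
        # Redirect the logical address; the old block keeps its stale contents.
        self.mapping[lba] = phys
        return phys

    def read(self, lba):
        return self.blocks[self.mapping[lba]]


ftl = TinyFTL(num_blocks=4)
first = ftl.write(lba=7, data=b"version 1")
second = ftl.write(lba=7, data=b"version 2")
current = ftl.read(7)        # follows the mapping table to the second block
stale = ftl.blocks[first]    # the old version is still physically present
```

After the second write, the mapping table is the only thing that distinguishes the current version from the stale one, which is why a missed update to the table matters.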
Data integrity is maintained in SSDs using a variety of techniques. Error correction code (ECC) protects against read errors caused by hardware errors. A cyclic redundancy check (CRC) is used to verify that the data returned when it is read from an SSD is the same as the data that was written to the SSD.
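As a rough illustration of the CRC check, the sketch below stores a checksum alongside the data and recomputes it on read. It uses CRC-32 from Python's standard `zlib` module purely for convenience; the helper names `store` and `verify` are hypothetical, and an actual SSD would use whatever CRC its controller implements.

```python
import zlib


def store(data: bytes):
    # Keep the checksum computed at write time next to the data.
    return data, zlib.crc32(data)


def verify(data: bytes, stored_crc: int) -> bool:
    # Recompute the checksum at read time and compare.
    return zlib.crc32(data) == stored_crc


payload, crc = store(b"sensor reading 42")
intact_ok = verify(payload, crc)

# Simulate a single flipped bit in the first byte of the stored data.
damaged = bytes([payload[0] ^ 0x01]) + payload[1:]
damaged_ok = verify(damaged, crc)
```

Here `intact_ok` is true and `damaged_ok` is false: the check confirms that the bytes read match the bytes written at that location, and nothing more.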
Other techniques attempt to ensure that the data is retrieved from the correct location and that the current version of the data is retrieved. That is, in the example above, techniques are employed to help ensure that the current version of the data is retrieved from the second block instead of the old version of the data in the first block. These techniques are effective for the most part but may not detect a type of data corruption known as “silent corruption.” With silent corruption, a loss of data integrity may not be detected, and so the data may appear to be valid when actually it is not.
More specifically, rare events referred to as soft errors or single event upsets (SEUs) can prevent the mapping table from being properly updated when a new version of the data is stored. For example, an SEU can be the result of a cosmic event or cosmic ray that interrupts or perturbs the update process.
If the mapping table is not properly updated, it may point by mistake to a previous and now outdated version of a set of data, or it may point to a location that has been erased or that contains different data unrelated to the data previously stored at that location. Consequently, a request for a particular set of data will be mapped to an incorrect location, and the data at that location will be read and returned instead of the data that is actually wanted. Techniques like CRC will not detect that incorrect data is being returned: the CRC confirms only that the returned data matches what was written at that location, and it cannot indicate that the returned data is not the data that is actually wanted. Thus, the user (e.g., host or application) will use the returned data, unaware that the data is not the wanted data. This is the type of data corruption referred to above as silent corruption.
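The failure mode can be reproduced in a few lines. The sketch below is a toy model under stated assumptions: each physical block stores its data together with a CRC-32 computed at write time, and the soft error is simulated simply by skipping the mapping-table update. The names `blocks`, `mapping`, `write_block`, and `read_lba` are hypothetical.

```python
import zlib

blocks = {}   # physical block index -> (data, crc stored at write time)
mapping = {}  # logical address -> physical block index


def write_block(phys, data):
    blocks[phys] = (data, zlib.crc32(data))


def read_lba(lba):
    data, crc = blocks[mapping[lba]]
    if zlib.crc32(data) != crc:
        raise IOError("CRC mismatch")
    return data


# Original version is written and mapped.
write_block(0, b"balance=100")
mapping[7] = 0

# Updated version is written to a different block...
write_block(1, b"balance=250")
# ...but the update "mapping[7] = 1" is assumed lost to a soft error.

returned = read_lba(7)  # CRC passes, yet this is the outdated version
```

The read completes without error and returns `b"balance=100"`, because the stale block is internally consistent. Nothing in the per-block check ties the data to the version the requester expects.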
Silent corruption can be detected, for example, by creating redundant sets (e.g., up to three sets) of the data of interest each time the data is updated. When the data is to be used, two sets of data can be read and compared; if they do not match, then the third set can be used to determine which of the other two sets is current. However, such an approach significantly increases the memory resources needed to store the data. In addition, the extra reads and writes and the comparisons of sets of data increase the burden on processing resources and bandwidth. That burden is especially significant given the large number of transactions and the very large data sets (sometimes known as “big data”) that are becoming more commonplace as a byproduct of advances in data collection and storage.
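The compare-then-arbitrate read can be sketched as below. This is one plausible reading of the redundancy scheme, assuming exactly three copies and treating the copy that agrees with the third as current; the function name `read_with_redundancy` is hypothetical.

```python
def read_with_redundancy(copies):
    """Return the agreed value from three redundant copies.

    Two copies are compared first; the third is consulted only on a mismatch.
    """
    first, second, third = copies
    if first == second:
        return first            # common case: no arbitration needed
    if third == first:
        return first            # second copy is the outlier
    if third == second:
        return second           # first copy is the outlier
    raise ValueError("all three copies disagree")


good = read_with_redundancy([b"v2", b"v2", b"v2"])
repaired = read_with_redundancy([b"v1", b"v2", b"v2"])
```

Even in the common case this costs two reads and a full comparison per access, on top of three writes per update, which is the overhead the paragraph above describes.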