The present invention relates to methods for built-in detection of the deterioration of reliability in digital memory devices in general, and in removable flash-memory devices in particular.
Digital memory devices are often relied upon to store important data. Because such equipment is complex and has a limited life expectancy, digital memory devices can fail and cause the loss of valuable data. Non-volatile storage systems include a memory and a control system, which sometimes reside on the same piece of silicon. One of the tasks performed by the control system is error correction.
An error-correction code (ECC) system detects occasional errors in the data, which arise from the nature of the storage element or from the characteristics of the operating environment, fixes the errors, and delivers the corrected original information upon a user's request. The ECC system has a built-in limit on the number of errors that can be corrected. When that limit is exceeded, the information cannot be repaired, and may be reported as lost or delivered to the user with the errors.
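By way of illustration only, the correction limit described above can be seen in a textbook Hamming(7,4) code, which corrects exactly one bit error per 7-bit codeword. This code is chosen here purely as a small example; it is an assumption and is not stated to be the ECC used in any particular memory device. The function names are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit Hamming codeword."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, error_position); position 0 means no error seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # 1-based index of the suspected bad bit
    if position:
        c[position - 1] ^= 1  # flip the suspected bit
    return [c[2], c[4], c[5], c[6]], position


data = [1, 0, 1, 1]
stored = hamming74_encode(data)

# One raw error: within the correction limit, so the data is recovered.
one_error = list(stored)
one_error[4] ^= 1
recovered_one, _ = hamming74_decode(one_error)

# Two raw errors: beyond the limit, so the decoder "corrects" the wrong bit.
two_errors = list(stored)
two_errors[0] ^= 1
two_errors[4] ^= 1
recovered_two, _ = hamming74_decode(two_errors)
```

In this sketch the single-error read returns the original data, while the double-error read returns different data without any indication of failure, which corresponds to information being delivered to the user with errors.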
Software applications that use memory devices assume that the information is correct until the correction system fails to repair the data. As long as the memory is still fully functional, there is no indication to the user that the memory is deteriorating and approaching failure. It should be noted that the error level is only one of several early-warning symptoms that can be used to predict the life expectancy of a memory prior to failure.
Today, users (and applications) are unaware of the state, or condition, of the stored data. Thus, users are unable to take active measures to reduce the risk of losing data. Such measures include, for example, creating a back-up of the data, migrating the data to new storage media, and rewriting data from problematic areas of the memory device to other areas.
The current state of the art does not provide a way to report the “health” (i.e. operational-reliability performance) of a memory device. While some prior-art methods record the usage of the device (i.e. the number of times the device is written to) for internal load-balancing purposes, these methods are not designed to provide an early-warning indication to a user or application. While there is a correlation between the usage of the device and the device's remaining life expectancy, such a correlation is not absolute, since there is a natural, random variability among devices (similar to the inaccuracy of predicting a person's life expectancy based on age alone).
Methods that attempt to monitor the health of a memory device are known in the art, but are limited to programs that run on the host system. Thus, these methods can only deal with the information after the errors have been corrected. An example of a prior-art system is provided in a feature called “Disk Health” that is included in the “Norton System Doctor (NSD)” product available from Symantec Corp., Cupertino, Calif. (also described in Document ID: 2001082218352309, included in the NSD product manual, and found on the Norton web site, www.norton.com). Because the host system reads the data from the memory device only after correction, the host system is never exposed to the raw, uncorrected data. This aspect makes such prior-art methods less sensitive to early stages of deterioration in the health of the memory device.
It would be desirable to give the user of such memory devices an early-warning indication when the health of the device deteriorates and approaches a high probability of failure.
It would be further desirable to have a system, operating according to a method, which resides on a memory device, and detects and reports the actual deterioration of indicative longevity parameters on the memory device before any attempt is made to correct the data. Such a system would be of significant importance for modern multi-level flash-memory devices in which the inherent life expectancy is shorter than in traditional single-level flash-memory devices, and in which the symptoms of aging can be measured without slowing the routine operation.
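Purely as an illustrative sketch of the kind of on-device monitoring described above, and not as the method of the invention, the following tracks the number of raw bit errors observed before correction and reports a status once that number approaches the correction limit. The class name, the `ecc_limit` parameter, and the `warn_fraction` threshold are all hypothetical assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class HealthMonitor:
    """Hypothetical on-device monitor fed with pre-correction error counts."""

    ecc_limit: int              # assumed: max correctable errors per read unit
    warn_fraction: float = 0.5  # assumed: warn at this fraction of the limit
    worst_errors: int = 0       # highest raw error count observed so far
    reads: int = 0              # number of reads recorded

    def record_read(self, raw_error_count: int) -> None:
        """Record the raw error count of one read, before any correction."""
        self.reads += 1
        self.worst_errors = max(self.worst_errors, raw_error_count)

    def status(self) -> str:
        """Return 'healthy', 'warning', or 'failed' for the worst read seen."""
        if self.worst_errors > self.ecc_limit:
            return "failed"
        if self.worst_errors >= self.warn_fraction * self.ecc_limit:
            return "warning"
        return "healthy"


monitor = HealthMonitor(ecc_limit=4)
monitor.record_read(0)
monitor.record_read(1)
early_status = monitor.status()   # still well below the correction limit
monitor.record_read(3)
later_status = monitor.status()   # approaching the correction limit
```

Because the count is taken before correction, a status of this kind can change while every read is still being repaired successfully, which is the early-warning property that a host-side program reading only corrected data cannot provide.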