The present invention relates to a method and a device for performing data refresh operations in non-volatile memories based on temperature-related conditions.
New memory components, e.g. NAND-type memory devices and in particular their data retention, are sensitive to temperature increases. In high-temperature environments such memory devices, or the data stored therein, may become permanently corrupted in a relatively short time. High temperatures will also decrease the number of allowed memory program/erase (P/E) (or write/erase (W/E)) cycles over the lifetime of the memory. As NAND process development continues towards smaller and smaller structural widths (today 56 nm, soon 43 nm, with development towards 32 nm), reliability will become an even bigger issue. This applies to the maximum program/erase (P/E) cycle count, data retention, read/program disturbance etc. Reading an address very frequently may corrupt addresses nearby, which is referred to as read disturbance. Among these parameters, data retention in particular is affected by temperature.
Mobile devices can be subjected to high temperatures, depending on the type of device, the ambient temperature, and the usage conditions.
Data corruption can be avoided by refreshing stored data. However, with common non-volatile memories, e.g. NAND-type memories, this involves re-writing stored data either to a different location inside the memory or to the same location after an erase of that location. As the total number of write/erase operations before the memory fails is usually limited for non-volatile memories, every data (re)writing/erasing operation shortens the lifetime of the memory. Thus such refreshing can only be performed at the cost of reducing the lifetime of the memory. It is therefore not possible to simply perform regular data refresh operations that are suitable for maintaining data integrity under all circumstances, as the frequency of such refreshes would severely reduce the memory's lifetime.
Volatile memories like DRAM (Dynamic Random Access Memory) and the like have a practically infinite number of possible read/write (or program)/erase cycles. Such memories rely on rather frequent regular refresh cycles in order to retain data stored therein, because otherwise they lose their data content. The time between such refresh cycles is comparatively short.
In contrast, non-volatile memories do not rely on frequent refresh cycles, but on the other hand they have a somewhat limited write/erase (or program/erase) count. That is, each cell of a memory of this type, e.g. flash memory, only supports a finite number of write/erase operations before the cell will fail. After that, data integrity of the cell can no longer be ensured. This is usually referred to as wear of the memory. Worn cells can be replaced in a transparent manner by fresh or at least still intact cells, for example by the memory controller performing a corresponding re-routing operation. This is referred to as defect handling.
To make this possible, non-volatile memories usually have a certain number of surplus cells exceeding the nominal capacity of the memory. As long as there are still surplus cells left the memory is still functional, even when a certain number of cells has already failed. The number of write/erase operations to a non-volatile memory may be within the range of some thousand cycles to some ten thousand cycles, depending on the type of memory, the number of surplus cells and the defect handling algorithm.
In order to distribute write/erase operations as equally as possible across all cells, so-called wear-levelling techniques are additionally employed. These techniques are intended to ensure that the number of remaining write/erase cycles is as equal as possible for all memory cells, so that re-routing to fresh auxiliary memory cells can be postponed as long as possible.
In non-volatile memories the data retention time, i.e. the time interval for which stored data will maintain its integrity, is not per se infinite. The data retention time depends, among other things, on the temperature of the memory. Higher temperatures can significantly decrease the data retention time and finally lead to data corruption because a cell cannot hold its data integrity. This will require actions to ensure data integrity in the non-volatile memory.
In DRAM and like memories the problem of decreasing data retention time can easily be handled by performing a data refresh cycle before the data cannot be retained anymore. The only drawback connected therewith is increased power consumption, as each refresh cycle requires electrical power. This refresh is a low-level refresh, meaning that a complete memory, e.g. DRAM module, is provided with a refresh pulse. The refresh is not data-dependent, that is, it is performed independent of the actual data content of the DRAM module, and for all cells of the DRAM module.
A similar approach would cause a problem in non-volatile memories like flash memory, because the number of write/erase cycles is more strictly limited than in DRAM. Performing a simple temperature-dependent complete refresh would, due to the properties of non-volatile memory, require reading, erasing and re-writing the complete content. Each write/erase access, and with respect to read disturbance also each read access, uses up part of the remaining access cycles of a non-volatile memory, so this would reduce the remaining number of write/erase cycles for the memory.
Therefore, although short-term data integrity could be ensured in such a manner, the non-volatile memory would rather soon approach a state where the memory cells fail because their maximum number of write/erase cycles has been exceeded. Long-term data integrity would thus suffer significantly.
Thus the invention proposes a method of handling data integrity in non-volatile memories that can minimize or even avoid temperature-dependent degradation of the memory due to required data refresh, while at the same time maintaining data integrity as high as possible. In high temperature environments where the data corruption risk is the highest and the data retention time is the shortest the memory refresh can be performed so that the minimum amount of write/erase cycles is spent while still keeping stored data valid without high corruption risk.
According to a first aspect of the invention a method is provided, comprising
    measuring the temperature of at least one location of a non-volatile memory;
    determining if said temperature measurement indicates that the data retention time of data stored at said at least one location is reduced below a threshold; and
    re-writing said data to said non-volatile memory in response to a positive determination.
Due to the decreasing data retention time at high temperatures, data refreshing is needed to keep stored data intact. At higher temperatures data refreshing is needed more often, so the refresh rate needs to be increased. On the other hand, the number of write/erase cycles is limited over the lifetime of the memory, and every refresh thus causes wear of the memory. By using the temperature-based refresh of the invention all data can be kept safe. A longer lifetime is also ensured, as the data refresh is done only when needed.
It should be noted that in the context of the invention the term “measuring the temperature” is not limited to direct measurement, e.g. at a temperature sensor, but is intended to also include obtaining an indication from which the temperature can be determined, e.g. receiving an output/temperature indication from the temperature sensor or evaluating another property of the memory from which its temperature can be determined. In the same sense “temperature measurement” is intended to also include an evaluation of such a temperature indication, which could for example be understood as an indirect temperature measurement. In this manner “measurement” of a temperature is intended to include direct as well as indirect temperature measurement or determination, respectively. Depending on the hardware implementation it may also be necessary to take into account appropriate offset values for the temperature, for example if a temperature sensor is located in the close vicinity of the memory rather than at the even closer location within the memory die itself.
According to an exemplary embodiment a positive determination is made if
    said temperature exceeds a temperature threshold Tthres a pre-determined number of times;
    said temperature exceeds a temperature threshold Tthres a pre-determined number of times per pre-determined time interval; or
    if $\sum_{1}^{n} \Delta T_n$ exceeds a pre-determined threshold;
wherein
$$\Delta T_n = \begin{cases} T_n - T_{thres} & \text{if } T_n > T_{thres}; \\ 0 & \text{if } T_n \le T_{thres}; \end{cases}$$
and wherein
n is an integer value indicating the number of temperature measurements;
Tthres is a temperature threshold; and
Tn is the temperature measured in the nth temperature measurement.
In this embodiment it can be taken into account not only if, but also how a temperature threshold is exceeded. Different parameters can be considered to judge if data retention time is degraded.
In a first example the number of times a temperature threshold is exceeded is used to determine that data retention time is below a threshold. While a single occurrence of a high temperature may not be harmful a higher number thereof, for example ten occurrences, may indicate degraded data retention time.
Further, the frequency of high temperature events may be considered. Using this embodiment a single exceeding event per day, week, month or similar relatively long time interval may be regarded as irrelevant. However, if the temperature exceeds the threshold a couple of times within an hour, quarter hour, minute or similar relatively short time interval, this should be considered to indicate degraded data retention time.
Still further not only the number and frequency may be taken into account, but also how much the temperature threshold is exceeded. For example, three temperatures exceeding the temperature threshold by one degree Celsius each may be considered to be irrelevant or at least less harmful than a single temperature measurement that exceeds the threshold by as much as ten or more degrees Celsius. Therefore according to an example embodiment the sum of ΔTn over all measurements n is considered. In order to consider only temperatures exceeding the threshold ΔTn is defined as the difference between measured temperature Tn and temperature threshold Tthres if Tn>Tthres and defined as 0 if Tn≦Tthres. In this manner the number of temperature events exceeding the threshold is considered together with the amount the threshold is exceeded.
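The comparison made above can be illustrated with a short calculation. The following is only a sketch: the function name `excess_sum` and the threshold value of 85 degrees Celsius are assumptions chosen for illustration and are not taken from the description.

```python
def excess_sum(temperatures, t_thres):
    """Sum of Delta T_n: the amount by which each reading exceeds t_thres.

    Readings at or below the threshold contribute 0.
    """
    return sum(max(0.0, t - t_thres) for t in temperatures)


T_THRES = 85.0  # hypothetical threshold in degrees Celsius

# Three readings, each 1 degree above the threshold.
small_excursions = excess_sum([86.0, 86.0, 86.0], T_THRES)
# One reading 10 degrees above the threshold, the others below it.
large_excursion = excess_sum([95.0, 80.0, 84.0], T_THRES)
```

Under these assumed values the single large excursion yields a larger sum than the three small ones, which is the weighting the sum of ΔTn is meant to provide.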
It should be noted that these three decision parameters may be combined with one another, possibly with different thresholds for each. For example, a total number of ten temperatures exceeding a first threshold Tthres1 will cause a positive determination that data retention time is degraded below the desired value. At the same time a total number of three temperatures exceeding a second threshold Tthres2 (wherein Tthres2 may be, but is not necessarily, equal to Tthres1) within the time interval of five minutes will also lead to a positive determination. Still further, if $\sum_{1}^{n} \Delta T_n$ as defined above exceeds a third threshold (which may be different from both Tthres1 and Tthres2 because it is a sum of temperature differences) it will also be derived that data retention time has been degraded beyond tolerable values. That is, in this example a positive determination is made if any of these events occurs. Other combinations are possible as well.
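One possible way to combine the three decision parameters is sketched below. The class `RetentionMonitor`, its parameter names and all default limits are hypothetical. Only the counts of ten and three and the five-minute interval follow the example above, and the reference temperature used for the sum is likewise an assumption.

```python
from collections import deque


class RetentionMonitor:
    """Tracks temperature readings and flags degraded data retention.

    All temperature limits are illustrative assumptions.
    """

    def __init__(self, t_thres1=85.0, max_count=10,
                 t_thres2=85.0, max_in_window=3, window_s=300.0,
                 t_thres_sum=85.0, max_excess_sum=15.0):
        self.t_thres1, self.max_count = t_thres1, max_count
        self.t_thres2, self.max_in_window = t_thres2, max_in_window
        self.window_s = window_s
        self.t_thres_sum, self.max_excess_sum = t_thres_sum, max_excess_sum
        self.count = 0           # total readings above t_thres1
        self.recent = deque()    # timestamps of readings above t_thres2
        self.excess_sum = 0.0    # running sum of Delta T_n

    def record(self, timestamp_s, temperature):
        """Record one measurement; return True on a positive determination."""
        if temperature > self.t_thres1:
            self.count += 1
        if temperature > self.t_thres2:
            self.recent.append(timestamp_s)
        # Drop exceedances that have fallen out of the time interval.
        while self.recent and timestamp_s - self.recent[0] > self.window_s:
            self.recent.popleft()
        self.excess_sum += max(0.0, temperature - self.t_thres_sum)
        return (self.count >= self.max_count
                or len(self.recent) >= self.max_in_window
                or self.excess_sum > self.max_excess_sum)

    def reset(self):
        """Clear all counters, e.g. after the data has been re-written."""
        self.count = 0
        self.recent.clear()
        self.excess_sum = 0.0
```

In this sketch any one of the three conditions is sufficient, matching the example in which a positive determination is made if any of the events occurs. Other combinations would only require changing the final expression in `record`.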
According to an exemplary embodiment the method further comprises
    maintaining information about the number of write and erase operations performed on one or more locations of said non-volatile memory including said at least one location;
wherein said temperature threshold Tthres is dependent on the number of write and erase operations on a single location and/or a local or global peak number of write and erase operations on a plurality of locations and/or the average number of write and erase operations on a plurality of locations.
In non-volatile memories, for example of the NAND-type, the data retention time may be influenced not only by the temperature alone, but additionally by the number of write/erase operations already performed on a particular memory cell or, more generally, location. For example, a temperature of 85° C. may not be alarming if the program/erase count of a particular location of the memory is 10, but may be alarming if it has a value of 1000. Therefore, according to this embodiment the temperature threshold is not constant, but takes into account the program/erase count, in such a manner that generally higher program/erase counts will decrease the temperature threshold. For example, the threshold may be lowered per write/erase operation by a small constant amount, i.e. linearly, or even by a progressively increasing value.
According to embodiments of the invention, for the temperature threshold one or more parameters can be taken into account, either alone or in any possible combination. These parameters include the number of write and erase operations on one or more single locations, local or global peaks of the number of write and erase operations on a plurality of locations, and an average value of the number of write and erase operations on a plurality of locations. To put it another way, according to this embodiment for each temperature measurement value Tn the corresponding temperature threshold Tthresn is a function of the program/erase count P/E: Tthresn (P/E). Generally Tthresn decreases with increasing P/E value(s).
Depending on Tthresn the same temperature may be considered not to exceed the threshold for low P/E values, while it is considered to exceed the threshold for higher P/E values.
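A linear dependence of the threshold on the P/E count, as mentioned above, could look like the following sketch. The function names and the values of `base_thres`, `drop_per_cycle` and `floor` are hypothetical. In particular, the lower bound `floor` is an added assumption that merely keeps the threshold from falling indefinitely.

```python
def temperature_threshold(pe_count, base_thres=90.0, drop_per_cycle=0.01,
                          floor=60.0):
    """Linearly lower the temperature threshold with the P/E count.

    base_thres, drop_per_cycle and floor are hypothetical values.
    """
    return max(floor, base_thres - drop_per_cycle * pe_count)


def exceeds_threshold(temperature, pe_count):
    """True if the temperature exceeds the P/E-dependent threshold."""
    return temperature > temperature_threshold(pe_count)
```

With these assumed parameters a reading of 85 degrees Celsius does not exceed the threshold at a P/E count of 10, but does exceed it at a P/E count of 1000, mirroring the example given for this embodiment.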
A reason for taking into account the above-mentioned variations of the P/E count (e.g. local/global peak value(s), single location value(s), average value) is that the wear levelling mechanism used can play a role. For example, if the wear levelling mechanism has a relatively poor efficiency, very high peak P/E values compared to the average P/E value may occur. In this case considering only the P/E count(s) of (a) single location(s) can achieve good results, as only the memory blocks with a high P/E value will be refreshed.
However, if the wear levelling mechanism is instead rather efficient, Δ P/E may be very low, e.g. all blocks being within 100 P/E cycles of one another while the maximum specified value is around 5000. In that case it might be disadvantageous to perform refresh based only on peak values. This could lead to unjustified re-writing actions that may substantially decrease the remaining P/E cycles while not offering a corresponding increase in temperature-dependent data retention time. It might therefore be better, if there is an indication that the temperature has been high for a long enough period, to consider the average P/E count for the temperature threshold instead. The assumption can then be made that all blocks are in similar condition, and it makes more sense to refresh all of them if Δ P/E is small enough.
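The choice between single-location and average P/E values might be sketched as follows. This is an illustration only: the function name, the dictionary representation of the P/E counts and the `spread_limit` of 100 cycles are assumptions, and Δ P/E is interpreted here as the difference between the highest and lowest count.

```python
def pe_value_for_threshold(pe_counts, block, spread_limit=100):
    """Pick the P/E figure that feeds the temperature threshold.

    pe_counts maps a block id to its P/E count; spread_limit is a
    hypothetical bound on Delta P/E.
    """
    spread = max(pe_counts.values()) - min(pe_counts.values())
    if spread <= spread_limit:
        # Efficient wear levelling: blocks are in similar condition,
        # so use the average and treat all blocks alike.
        return sum(pe_counts.values()) / len(pe_counts)
    # Poor wear levelling: judge each block by its own count.
    return pe_counts[block]
```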
Furthermore this embodiment allows providing a safety mechanism. If the P/E count of certain memory locations exceeds a safety limit these memory locations can be locked, i.e. set to read-only, or even be forbidden from any further use. Of course data will be relocated before locking such locations. According to another exemplary embodiment a location can be set to read-only status if and for as long as the temperature is considered too high, and be reset to read/write status if the temperature has lowered below safe limits again.
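The two locking behaviours described above, permanent locking of worn locations and temporary read-only status at high temperature, can be sketched together. The `Location` record, the function name and the limits `pe_limit` and `t_safe` are hypothetical, and relocation of data before locking is assumed to happen elsewhere.

```python
from dataclasses import dataclass


@dataclass
class Location:
    pe_count: int
    read_only: bool = False
    retired: bool = False


def apply_safety_policy(loc, temperature, pe_limit=5000, t_safe=85.0):
    """Update the lock state of one location; limits are hypothetical."""
    if loc.pe_count > pe_limit:
        # Worn beyond the safety limit: lock permanently
        # (data is assumed to have been relocated beforehand).
        loc.retired = True
        loc.read_only = True
    elif not loc.retired:
        # Temporary lock that follows the temperature.
        loc.read_only = temperature > t_safe
    return loc
```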
According to an exemplary embodiment said re-writing comprises
    reading data stored in said at least one location of said non-volatile memory; and
    writing said data to a different location of said non-volatile memory.
This embodiment allows writing data that is considered to be endangered by degraded data retention time into another location of the non-volatile memory. Such an embodiment may take advantage of a property of many non-volatile memory types, namely that each location may only be written to after a previous erase operation (the first initial write may be done without preceding erase operation). Erase operations take time during which the memory is not ready for read or write operations, i.e. the latency of memory accesses is influenced by the required erase operation. However, erase operations may be done in the background while no other accesses are performed.
If due to the background erasing at least one other location of the memory large enough to receive the data to be relocated is already in an erased/programmable state, this embodiment is particularly advantageous, because the data can be relocated by performing only a single operation which is data write (wherein the mandatory read operation is not counted). The original location of the relocated data can then again be erased in the background, i.e. without performance loss with respect to access time of the memory.
This embodiment can also be used if no contiguous erased/programmable location is available, provided that the total amount of erased/programmable memory locations is sufficient for receiving the data to be relocated. This will still allow performing erase operations in the background, but at the cost of increasing fragmentation of the data.
It should be noted that the term “re-writing” does not necessarily mean that a read operation is immediately followed by a writing operation. Further steps or operations can be performed between the read and the write operation, including but not limited to a buffering operation as explained later on. The time interval between these operations can be short, but can also be substantially longer, including but not limited to a time delay to allow the temperature to decrease below a safety limit.
According to an exemplary embodiment said different location is selected based on information about the number of write and erase operations performed on said different location.
In order to provide optimal wear-levelling of the non-volatile memory, the number of write and erase operations of locations of the memory should be considered when selecting a new memory location for relocating data. That is, locations having a lower number of write and erase operations will be favoured.
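Relocation to a different location chosen by its write/erase count might be modelled as in the sketch below. The `Block` class and the `relocate` function are hypothetical, a block whose `data` is `None` stands for an erased/programmable location, and the erase of the source is performed immediately here although it could be deferred to the background.

```python
class Block:
    """Toy model of one erasable memory location."""

    def __init__(self, pe_count=0):
        self.pe_count = pe_count
        self.data = None          # None means erased/programmable

    def erase(self):
        self.data = None
        self.pe_count += 1        # each erase consumes one P/E cycle


def relocate(blocks, src):
    """Move the data of blocks[src] to the least-worn erased block.

    Returns the destination index, or None if no erased block exists.
    """
    erased = [i for i, b in enumerate(blocks)
              if b.data is None and i != src]
    if not erased:
        return None
    dst = min(erased, key=lambda i: blocks[i].pe_count)
    blocks[dst].data = blocks[src].data   # read, then write elsewhere
    blocks[src].erase()                   # could be done in the background
    return dst
```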
According to an exemplary embodiment said re-writing comprises
    buffering data stored in said at least one location of said non-volatile memory;
    erasing said at least one location; and
    re-writing said buffered data to said at least one erased location.
This embodiment is advantageous particularly in implementations where an additional buffer memory is available for buffering data to be relocated, for example a volatile memory like a DRAM memory or the like. If the data to be relocated cannot simply be relocated to another memory location, for example because the remaining free memory space is insufficient, it will be required to re-write the data to the same location again. Due to the properties of non-volatile memories this requires a preceding erase operation. During this erase operation the data thus has to be buffered. As the erase time is comparatively short, this may be accomplished by using an additional buffer memory, for example a DRAM memory found in many electronic devices.
It should be noted that the two previously described embodiments may be combined as well. For example, if the free memory space is insufficient to accommodate the complete data to be relocated, the free memory space may still be used to relocate a first part of the data to be relocated, while the remaining second part will be buffered and then be rewritten to its original location. In this manner the time lost due to required erase operations is at least minimized compared to a complete buffering, erasing and rewriting operation. That is, at least the memory amount connected to the first part of the data will not have to be erased immediately.
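The combination of the two embodiments can be sketched with a simple page-based model. The function names are hypothetical, data is represented as a Python list of pages, and `free_pages` stands for the number of erased pages available elsewhere in the memory, which is assumed to be non-negative.

```python
def plan_refresh(data_pages, free_pages):
    """Split data into a relocated part and a buffered in-place part.

    data_pages: list of pages to refresh; free_pages: number of erased
    pages available elsewhere. Both are modelling assumptions.
    """
    relocated = data_pages[:free_pages]   # written straight to erased space
    buffered = data_pages[free_pages:]    # held in e.g. DRAM during the erase
    return relocated, buffered


def refresh_in_place(location, buffer):
    """Buffer, erase and re-write the content of one location (a list)."""
    buffer[:] = location      # 1. copy the data into the buffer memory
    location.clear()          # 2. erase the original location
    location.extend(buffer)   # 3. write the buffered data back
    return location
```

Only the pages returned as `buffered` need the immediate erase of their original location. The locations of the relocated pages can be erased later.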
According to exemplary embodiments said data retention time threshold is a pre-defined and/or freely programmable threshold. For example the data retention time threshold can be implemented as a value that is initially pre-defined, and that can optionally be re-programmed during operation. The threshold value can also be freely programmable at all times, i.e. also initially.
According to a second aspect of the invention a computer program product is provided, comprising code sections for instructing a device to perform the method described above when said computer program product is running on said device. According to an exemplary embodiment said code sections are stored on a computer-readable medium.
According to a third aspect of the invention a module is provided, comprising
    a non-volatile memory;
    a data interface configured for accessing said non-volatile memory;
    a temperature sensor configured for measuring the temperature of at least one location of said non-volatile memory;
    a controller connected to said non-volatile memory and said temperature sensor, said controller being configured for determining if said temperature measurement indicates that the data retention time of data stored at said at least one location is reduced below a threshold and for re-writing said data to said non-volatile memory in response to a positive determination.
It should be noted that the term “memory module” in the context of the invention can refer to an external, i.e. removable memory module, but as well to an internal or embedded memory module that is not removable, but mounted within an electronic device.
According to an exemplary embodiment a positive determination is made if
    said temperature exceeds a temperature threshold Tthres a pre-determined number of times;
    said temperature exceeds a temperature threshold Tthres a pre-determined number of times per pre-determined time interval; or
    if $\sum_{1}^{n} \Delta T_n$ exceeds a pre-determined threshold;
wherein
$$\Delta T_n = \begin{cases} T_n - T_{thres} & \text{if } T_n > T_{thres}; \\ 0 & \text{if } T_n \le T_{thres}; \end{cases}$$
and wherein
n is an integer value indicating the number of temperature measurements;
Tthres is a temperature threshold; and
Tn is the temperature measured in the nth temperature measurement.
According to an exemplary embodiment
    said controller is further configured for maintaining information about the number of write and erase operations performed on one or more locations of said non-volatile memory including said at least one location; and
    said temperature threshold Tthres is dependent on the number of write and erase operations on a single location and/or a local or global peak number of write and erase operations on a plurality of locations and/or the average number of write and erase operations on a plurality of locations.
According to an exemplary embodiment said re-writing comprises
    reading data stored in said at least one location of said non-volatile memory; and
    writing said data to a different location of said non-volatile memory.
According to an exemplary embodiment said controller is configured for selecting said different location based on information about the number of write and erase operations performed on said different location.
According to an exemplary embodiment said apparatus further comprises
    a buffer interface configured to write data to and read data from a buffer memory;
wherein said re-writing comprises
    buffering data stored in said at least one location of said non-volatile memory in said buffer memory;
    erasing said at least one location; and
    re-writing said buffered data to said at least one erased location.
According to an exemplary embodiment said apparatus further comprises
    a buffer memory connected to said buffer interface.
According to an exemplary embodiment said buffer memory is a volatile memory.
According to an exemplary embodiment said temperature sensor is implemented
    in said non-volatile memory;
    in said controller; or
    in said module separate from said non-volatile memory and said controller.
For example, the temperature sensor can be implemented within the memory die, within the controller, or inside the module but neither within the memory die nor within the controller.
Depending on the actual location of the temperature sensor it may be required to take into account an appropriate temperature offset. For example, a temperature sensor which forms part of the memory die may be considered to measure the temperature of the memory quite accurately, such that a low or even zero offset is required to obtain the “real” temperature of the memory from the temperature measurement. A temperature sensor in the controller or somewhere else in the memory module separate from the non-volatile memory and the controller, however, may provide a slightly offset temperature measurement, compared to the “real” memory temperature. In order to compensate for this, a corresponding corrective offset may be taken into account.
In other embodiments this could be compensated by corresponding adaptation of the temperature threshold. For example, if the “real” memory temperature is known to be offset by approximately +5° C. compared to the measured temperature value, the temperature threshold could be lowered by a similar amount to compensate.
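The two compensation approaches, correcting the reading or shifting the threshold, can be compared in a short sketch. The function names and the numeric values are hypothetical. The offset of 5 degrees follows the example above.

```python
def memory_temperature(sensor_reading, offset=0.0):
    """Estimate the memory temperature from a sensor reading."""
    return sensor_reading + offset


def adjusted_threshold(t_thres, offset=0.0):
    """Alternative approach: shift the threshold instead of the reading."""
    return t_thres - offset


SENSOR_OFFSET = 5.0   # hypothetical: memory runs about 5 degrees hotter
reading, t_thres = 82.0, 85.0

# Both comparisons express the same condition.
by_reading = memory_temperature(reading, SENSOR_OFFSET) > t_thres
by_threshold = reading > adjusted_threshold(t_thres, SENSOR_OFFSET)
```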
According to exemplary embodiments said data retention time threshold is a pre-defined and/or freely programmable threshold. The threshold value can for example be initially pre-defined or initially freely programmable, and optionally additionally freely re-programmable during operation.
According to a further aspect of the invention an apparatus is provided, comprising a module as described above, and a host controller configured for accessing said non-volatile memory via said data interface.
According to a fourth aspect of the invention an apparatus is provided, comprising
    means for measuring the temperature of at least one location of a non-volatile memory;
    means for determining if said temperature measurement indicates that the data retention time of data stored at said at least one location is reduced below a threshold; and
    means for re-writing said data to said non-volatile memory in response to a positive determination.