Hybrid memory systems are memory systems that combine volatile and non-volatile memory types. Hybrid memory systems utilize non-volatile memory components to securely store volatile system data in the event of system fault or power failure, or upon user requests, which may include certain write or programming operations. Typically, the non-volatile memory used in a hybrid memory system is flash memory. The non-volatile memory components are made up of delimited memory portions, for example blocks, whose lifetimes are limited. Beyond these lifetimes, the memory portions can no longer be reliably used to store data, and any valid data present in them at that time may not be reliably accessed or recovered. Such reduced reliability of memory systems, resulting in loss of data, can be catastrophic to overall computer system performance or operation.
One type of non-volatile flash memory used in hybrid memory systems is NAND flash. NAND flash devices are available from several vendors and all share a similar architecture. Vendors of flash memory include Samsung™, Micron™, Hynix™, and Toshiba™.
With reference to FIG. 1, the architecture of a NAND flash device is as follows:
    Each device is made up of (X) number of data blocks BL, which may be for example 8,192 in some applications. In FIG. 1, the memory device 100 is, for illustrative purposes only, comprised of 12 blocks (X=12);
    Each block BL is composed of (Y) number of pages P. Y may be 128 in some applications. Y=4 in FIG. 1, that is, 4 pages P per block, for a total of 4×12=48 pages, again for illustrative purposes only; and
    Each page is composed of (Z) number of bytes B, which may be for example 8,228 in some applications (4 schematically shown in FIG. 1, for illustrative purposes only).
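The block/page/byte hierarchy described above can be illustrated with a minimal sketch. The class name `NandGeometry`, its field names, and the `locate` helper are hypothetical and introduced here only for illustration; the linear byte addressing is likewise an assumption and not part of any vendor interface. The numeric values are the ones given above for FIG. 1 and for the example application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NandGeometry:
    """Hypothetical description of a NAND device: X blocks, Y pages, Z bytes."""
    blocks: int            # X: number of blocks BL per device
    pages_per_block: int   # Y: number of pages P per block
    bytes_per_page: int    # Z: number of bytes B per page

    @property
    def total_pages(self) -> int:
        return self.blocks * self.pages_per_block

    @property
    def total_bytes(self) -> int:
        return self.total_pages * self.bytes_per_page

    def locate(self, byte_offset: int) -> tuple:
        """Map a linear byte offset (an assumed addressing scheme)
        to a (block, page, byte) triple."""
        if not 0 <= byte_offset < self.total_bytes:
            raise ValueError("byte offset outside the device")
        page_index, byte = divmod(byte_offset, self.bytes_per_page)
        block, page = divmod(page_index, self.pages_per_block)
        return block, page, byte


# Illustrative device of FIG. 1: X=12, Y=4, Z=4.
fig1 = NandGeometry(blocks=12, pages_per_block=4, bytes_per_page=4)

# Example application values from the text: X=8,192, Y=128, Z=8,228.
example = NandGeometry(blocks=8192, pages_per_block=128, bytes_per_page=8228)
```

With these values, `fig1.total_pages` is 48, matching the 4×12 count given for FIG. 1.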
The NAND flash device is programmed/written in units of pages. The NAND flash device is erased in units of blocks. If an uncorrectable error occurs in any page of a given block, the entire block is marked invalid.
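The difference between the programming granularity (pages) and the erase granularity (blocks) can be sketched as follows. This is a simplified in-memory model under stated assumptions, not a driver for any real device: the class `NandDevice` and its method names are hypothetical, and the rule that a page must be erased before it can be programmed again is general NAND behavior assumed here rather than something stated above.

```python
class NandDevice:
    """Hypothetical in-memory model: program by page, erase by block."""

    def __init__(self, blocks: int, pages_per_block: int):
        self.pages_per_block = pages_per_block
        # None represents an erased page.
        self.pages = [[None] * pages_per_block for _ in range(blocks)]
        self.invalid_blocks = set()

    def program_page(self, block: int, page: int, data: bytes) -> None:
        """Write one page; the page is the unit of programming."""
        if block in self.invalid_blocks:
            raise ValueError("block is marked invalid")
        if self.pages[block][page] is not None:
            # Assumption: a page cannot be reprogrammed until its block is erased.
            raise ValueError("page must be erased before it is programmed again")
        self.pages[block][page] = bytes(data)

    def erase_block(self, block: int) -> None:
        """Erase every page of a block; the block is the unit of erasure."""
        self.pages[block] = [None] * self.pages_per_block

    def report_uncorrectable(self, block: int, page: int) -> None:
        """An uncorrectable error in any page invalidates the entire block."""
        self.invalid_blocks.add(block)

    def usable_blocks(self) -> int:
        return len(self.pages) - len(self.invalid_blocks)


device = NandDevice(blocks=12, pages_per_block=4)
device.program_page(0, 0, b"data")
device.report_uncorrectable(3, 2)   # one bad page in block 3
remaining = device.usable_blocks()  # the whole of block 3 is no longer usable
```

In this model a single failing page removes all four pages of its block from service, which is why the count of usable blocks, rather than usable pages, is the quantity of interest.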
To ensure data integrity, the data in the NAND flash device is typically protected by an error detection and correction algorithm. Both correctable and uncorrectable errors can be detected by the use of such an algorithm.
One common approach to non-volatile memory management by memory systems, such as hybrid memory systems, is to mark blocks of memory invalid when uncorrectable errors are encountered under various circumstances. In particular, depending on the phase of operation, and specifically on whether an error recovery can be performed, the error may or may not result in the loss of system data. In either case, once an uncorrectable error occurs, the block is marked invalid.
With reference to FIG. 2, in general, many error detection and correction algorithms have the following properties:
    The algorithm operates on a segment of (n) symbols;
    A symbol is specified by a number of bits (s);
    The algorithm computes and adds (2t) parity symbols to a set of (k) data symbols to create the segment (n): n=k+2t (symbols);
    The algorithm can detect (2t) symbol errors in (k) data symbols; and
    The algorithm can correct (t) symbol errors in (k) data symbols.
If the number of errors in a segment of (n) symbols exceeds the number that can be corrected (t), the error is uncorrectable, and the original data cannot be recovered.
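The relationship n=k+2t and the correctable/uncorrectable boundary at (t) can be worked through in a short sketch. The class `EccCode` and its `classify` method are hypothetical names used only for illustration, and the parameter values k=239, t=8, s=8 are assumed example values (they correspond to a commonly used 255-symbol segment of 8-bit symbols), not values taken from the text. The sketch only does the parameter arithmetic; it does not implement an encoder or decoder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EccCode:
    """Hypothetical parameter set for an error detection and correction code."""
    k: int  # number of data symbols
    t: int  # number of symbol errors that can be corrected
    s: int  # number of bits per symbol

    @property
    def parity_symbols(self) -> int:
        return 2 * self.t

    @property
    def n(self) -> int:
        # Segment length in symbols: n = k + 2t
        return self.k + self.parity_symbols

    @property
    def segment_bits(self) -> int:
        return self.n * self.s

    def classify(self, symbol_errors: int) -> str:
        """Classify a segment by its number of symbol errors."""
        if symbol_errors < 0:
            raise ValueError("error count cannot be negative")
        if symbol_errors == 0:
            return "clean"
        if symbol_errors <= self.t:
            return "correctable"
        return "uncorrectable"


# Assumed example values, chosen for illustration only.
code = EccCode(k=239, t=8, s=8)
```

For these assumed values, `code.n` is 255 symbols with 16 parity symbols; a segment with 8 symbol errors is classified as correctable, while a segment with 9 is uncorrectable and its original data cannot be recovered.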
There is a need to increase the ability to control the reliability of non-volatile memory systems by determining or detecting when an uncorrectable error may occur, and thus to provide an early warning about the reliability of a portion of the non-volatile memory, for example a block or a page. Furthermore, there is a need to determine when there are too few non-volatile memory blocks to back up all of the specified or required data. Moreover, there is a need to provide the computer system or the end user with a programmable capability to configure and customize certain thresholds for various parameters based on the desired reliability for the overall computer system, a particular application usage, or the user's level of risk tolerance for data errors.