Non-volatile memory systems, such as flash memory, have been widely adopted for use in consumer products. Flash memory may be found in different forms, for example in the form of a portable memory card that can be carried between host devices or as a solid state disk (SSD) embedded in a host device. Two general memory cell architectures found in flash memory include NOR and NAND. In a typical NOR architecture, memory cells are connected between adjacent bit line source and drain diffusions that extend in a column direction with control gates connected to word lines extending along rows of cells. A memory cell includes at least one storage element positioned over at least a portion of the cell channel region between the source and drain. A programmed level of charge on the storage elements thus controls an operating characteristic of the cells, which can then be read by applying appropriate voltages to the addressed memory cells.
A typical NAND architecture utilizes strings of more than two series-connected memory cells, such as 16 or 32, connected along with one or more select transistors between individual bit lines and a reference potential to form columns of cells. Word lines extend across cells within many of these columns. An individual cell within a column is read and verified during programming by causing the remaining cells in the string to be turned on so that the current flowing through a string is dependent upon the level of charge stored in the addressed cell.
The responsiveness of flash memory cells typically changes over time as a function of the number of times the cells are erased and re-programmed. As this generally results in the memory cells becoming less reliable, the memory cells may need higher voltages for erasing and programming as they age. The effective threshold voltage window over which the memory states may be programmed can also decrease as a result of charge becoming trapped in the storage elements over time. The result is a limited effective lifetime of the memory cells. Specifically, blocks of memory cells may be subjected to only a preset number of Write and Erase cycles before they are mapped out of the system. The number of cycles to which a flash memory block may desirably be subjected depends upon the particular structure of the memory cells and the amount of the threshold window that is used for the storage states, with the extent of the threshold window usually increasing as the number of storage states of each cell is increased. Depending upon these and other factors, the number of lifetime cycles can be as low as 10,000 and as high as 100,000 or even several hundred thousand.
Continual erasing and re-programming of data sectors in a relatively few logical block addresses may occur where the host continually updates certain sectors of housekeeping data stored in the memory, such as file allocation tables (FATs) and the like. Specific applications can also cause a few logical blocks to be re-written with user data much more frequently than others. Therefore, in response to receiving a command from the host to write data to a specified logical block address, the data are written to one of a few blocks of a pool of erased blocks. That is, instead of re-writing the data in the same physical block where the original data of the same logical block address reside, the logical block address is remapped into a block of the erased block pool. The block containing the original, now-invalid data is then erased, either immediately or as part of a later garbage collection operation, and placed into the erased block pool. The result is that even when data in only a few logical block addresses are being updated much more than other blocks, instead of a relatively few physical blocks of the system being cycled at a higher rate, the cycling is spread evenly over many physical blocks. This technique is known in the prior art as “wear leveling”.
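The remapping scheme described above can be sketched in a few lines. This is a minimal illustrative model, not any particular flash management system; the class and member names (`FlashTranslationLayer`, `erased_pool`, `erase_counts`) are assumptions for the sake of the example.

```python
from collections import deque

class FlashTranslationLayer:
    """Illustrative sketch of wear leveling via an erased block pool."""

    def __init__(self, num_physical_blocks):
        self.mapping = {}                                   # logical block -> physical block
        self.erased_pool = deque(range(num_physical_blocks))
        self.erase_counts = [0] * num_physical_blocks

    def write_logical_block(self, logical_block, data):
        # Write to a block taken from the erased pool instead of
        # re-writing in place at the same physical block.
        new_physical = self.erased_pool.popleft()
        old_physical = self.mapping.get(logical_block)
        self.mapping[logical_block] = new_physical
        if old_physical is not None:
            # Erase the block holding the now-invalid data (done
            # immediately here; a real system might defer this to a
            # later garbage collection operation) and return it to
            # the erased pool.
            self.erase_counts[old_physical] += 1
            self.erased_pool.append(old_physical)
        return new_physical
```

Repeated writes to the same logical block address then rotate through the physical blocks, so no single block accumulates a disproportionate share of the Write/Erase cycles.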
In flash memory management systems that employ self-caching, there is the question of when to schedule cache flushing operations. In a cache flushing operation, a portion of the data in the cache, typically data corresponding to a common logical block, is copied from the cache to the main storage area and then removed from the cache to make room for new input data. Removal from the cache does not necessarily require an immediate erasing of the copied data, but may be accomplished by setting flags indicating that the data are no longer needed, so that the flagged data may be erased when the space is needed. Flushing from the cache, even if immediate erasure of the data from the physical cache blocks is not required, does use up a Write/Erase cycle.
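The flag-based removal described above can be sketched as follows. This is an illustrative model only; the names (`CachedFlash`, `obsolete`, `reclaim`) are assumptions, and real systems track obsolete data at page or block granularity in flash metadata rather than in RAM sets.

```python
class CachedFlash:
    """Sketch of a cache whose flushed entries are flagged, not erased."""

    def __init__(self):
        self.cache = {}        # (logical_block, page) -> data
        self.obsolete = set()  # flushed entries awaiting physical erase
        self.main_storage = {}

    def cache_write(self, logical_block, page, data):
        self.cache[(logical_block, page)] = data
        self.obsolete.discard((logical_block, page))  # fresh data again

    def flush_block(self, logical_block):
        # Copy the block's cached pages to main storage, then merely flag
        # them as not needed instead of erasing them immediately.
        for (lb, page), data in self.cache.items():
            if lb == logical_block:
                self.main_storage[(lb, page)] = data
                self.obsolete.add((lb, page))

    def reclaim(self):
        # Physically remove the flagged data when the space is needed.
        for key in self.obsolete:
            del self.cache[key]
        self.obsolete.clear()
```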
Different cached systems employ different policies regarding the scheduling of cache flushing operations and regarding the selection of the specific data to be flushed. Typically, the factors affecting the scheduling decisions are how full the cache is and whether there are access requests arriving from the host that have to be serviced. When deciding to schedule a cache flushing operation and having to select which logical block is to be flushed among the multiple logical blocks that may currently have data in the cache, one consideration is how efficient the flushing of a given logical block would be. In this context, efficiency of block flushing means how many “parasitic” write operations will be required for flushing the block.
As an example of the “block flushing efficiency” concept, consider a scenario where a block contains 128 pages. In this example, it is assumed that the flash management algorithms of the storage system require that all physical blocks in the main storage area (excluding the cache area) must always be full and allocated to a single logical block. If all the pages of logical block X are located in the main area, and not in the cache, then all the data of logical block X is located in a single physical block Y. Now suppose the host updated 100 pages in logical block X, and they were all stored into the cache. When logical block X is then flushed out of the cache, a free physical block Z is allocated and filled with 128 pages, 100 of which are copied from the cache and 28 of which are copied from physical block Y. So in this example, 28 parasitic page write operations were performed that did not directly contribute to clearing space in the cache but were nonetheless needed to preserve the integrity of the flash management system. While the above describes a specific and not very complex example of a flash management system, the concepts of parasitic write operations and efficiency of flushing operations are relevant for any cached flash system.
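The arithmetic of this example can be restated compactly. Under the stated assumption that every physical block in the main area must be completely filled when a logical block is flushed, the function names below (`flush_cost`, `flushing_efficiency`) are illustrative, not drawn from any real system.

```python
BLOCK_PAGES = 128  # pages per block, as in the example above

def flush_cost(cached_pages):
    """Total and parasitic page writes for flushing one logical block.

    Every flush fills a fresh 128-page physical block; whatever is not
    supplied by the cache must be copied from the old physical block.
    """
    parasitic = BLOCK_PAGES - cached_pages
    return BLOCK_PAGES, parasitic

def flushing_efficiency(cached_pages):
    # Fraction of the Write/Erase cycle spent absorbing new cached data.
    return cached_pages / BLOCK_PAGES
```

With 100 of 128 pages in the cache, `flush_cost(100)` gives 128 total writes of which 28 are parasitic, matching the example above.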
Prior art flash memory management systems that use self-caching typically use such a flushing efficiency criterion in their decision process in one form or another. Prior art systems may also use block flushing efficiency to affect the scheduling of flushing operations when the flash memory management system is busy with host requests and multiple logical blocks are competing for selection to be flushed. If the flash memory management system is idle (in the sense that the host does not access it) and a few logical blocks have data in the cache, then the blocks will be selected for flushing one by one and eventually all of them will be flushed out, leaving an empty cache. The flash memory management system typically flushes all data in the cache during such idle periods so that the cache is better prepared for a possible future burst of host activity. While this can be a reasonable cache flushing policy to adopt when the main concern is maintaining a short response time of the storage system to host requests, this type of cache flushing policy can create a problem with respect to the endurance and reliability of the flash storage system.
As noted above, there is generally a limit to the number of Write/Erase cycles that are supported and guaranteed by the manufacturers of flash devices. Recent generations of flash devices have brought that number of cycles down, due to the smaller dimensions of the memory cells and due to the use of multi-bit per cell technologies that can make memory cells more sensitive to various disturbances. Under the above cache flushing policy, logical blocks having very little data in the cache may be flushed whenever the flash device is idle, resulting in very low flushing efficiency. For example, suppose a storage system starts with an empty cache, the host updates 10 pages of a single logical block, and then the host stops for a while; the only logical block represented in the cache has 10 pages cached. If the host is now idle, this logical block will be flushed out, resulting in a Write/Erase cycle being “spent” to absorb only 10 pages instead of the 128 pages (again assuming a 128-page block size) that could theoretically “share” this Write/Erase cycle. If this host access pattern is a typical one, the storage system will reach its end of life (with all physical blocks cycled to their limit) after writing less than 10% of the amount of data it could theoretically absorb. Such low-efficiency flushing eats into the limited number of Write/Erase cycles of the physical blocks and shortens the lifetime of the storage system.
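The “less than 10%” figure can be checked directly. This is a back-of-the-envelope sketch using the numbers from the example above (128-page blocks, 10 cached pages per flush) and the 10,000-cycle figure cited earlier as an illustrative lower bound on lifetime cycles; the variable names are assumptions.

```python
BLOCK_PAGES = 128       # pages per block, as in the example above
PAGES_PER_FLUSH = 10    # cached pages absorbed per low-efficiency flush
CYCLE_LIMIT = 10_000    # illustrative lifetime cycle count from the text

# Fraction of each Write/Erase cycle actually used to absorb new data:
utilization = PAGES_PER_FLUSH / BLOCK_PAGES   # 10/128, about 7.8%

# Page writes a physical block can absorb over its lifetime:
ideal_pages = CYCLE_LIMIT * BLOCK_PAGES       # if every flush were full
actual_pages = CYCLE_LIMIT * PAGES_PER_FLUSH  # with 10-page flushes
```

Since 10/128 is roughly 7.8%, the device exhausts its cycle budget after absorbing well under 10% of the data it could have absorbed with fully efficient flushes.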