Non-volatile memory systems, such as flash memory, have been widely adopted for use in consumer products. Flash memory may be found in different forms, for example in the form of a portable memory card that can be carried between host devices or as a solid state disk (SSD) embedded in a host device. Two general memory cell architectures found in flash memory include NOR and NAND. In a typical NOR architecture, memory cells are connected between adjacent bit line source and drain diffusions that extend in a column direction with control gates connected to word lines extending along rows of cells. A memory cell includes at least one storage element positioned over at least a portion of the cell channel region between the source and drain. A programmed level of charge on the storage elements thus controls an operating characteristic of the cells, which can then be read by applying appropriate voltages to the addressed memory cells.
A typical NAND architecture utilizes strings of more than two series-connected memory cells, such as 16 or 32, connected along with one or more select transistors between individual bit lines and a reference potential to form columns of cells. Word lines extend across cells within many of these columns. An individual cell within a column is read and verified during programming by causing the remaining cells in the string to be turned on so that the current flowing through a string is dependent upon the level of charge stored in the addressed cell.
Some flash memory management systems employ self-caching architectures where data received from a host is first stored in a portion of the flash memory designated as the cache and is later copied to a portion of the flash memory designated as a main storage area. In this type of flash memory management system, there is the question of when to schedule cache flushing operations. In cache flushing operations, a portion of the data in the cache, typically data corresponding to a common logical block, is copied from the cache to the main storage area and then removed from the cache to make room for new input data in the cache. As used herein, the terms cache flushing and cache clearing are synonymous.
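The cache flushing operation described above can be illustrated with a minimal sketch. The class name, the page-to-block mapping, and the default of four pages per logical block are hypothetical choices made for illustration only; they do not describe any particular device.

```python
class SelfCachingStore:
    """Toy model of a self-caching store (illustrative assumption, not a real device)."""

    def __init__(self, pages_per_block=4):
        self.pages_per_block = pages_per_block  # hypothetical logical block size
        self.cache = {}  # logical page -> data held in the cache area
        self.main = {}   # logical page -> data held in the main storage area

    def write(self, page, data):
        # Data received from the host is first stored in the cache.
        self.cache[page] = data

    def read(self, page):
        # The newest copy of a page is in the cache if it has not been flushed yet.
        if page in self.cache:
            return self.cache[page]
        return self.main.get(page)

    def flush_block(self, block):
        # Copy every cached page of one logical block to main storage,
        # then remove those pages from the cache to make room for new data.
        pages = [p for p in self.cache if p // self.pages_per_block == block]
        for p in pages:
            self.main[p] = self.cache.pop(p)
        return len(pages)


store = SelfCachingStore()
store.write(0, "a")
store.write(1, "b")
store.write(5, "c")
flushed = store.flush_block(0)  # pages 0 and 1 belong to logical block 0
```

After the call, pages 0 and 1 reside in the main storage area and only page 5 remains in the cache.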
Self-caching flash memory management systems may employ different policies regarding the scheduling of cache flushing operations and regarding the selection of the specific data to be flushed. Typically, the factors influencing the scheduling decisions are how full the cache is, and whether there are access requests arriving from the host that have to be serviced. When the storage system is idle with respect to host requests, the flash memory management system will typically flush all data in the cache so that the cache is better prepared for a possible future burst of host activity.
The write performance of a self-cached storage device is generally specified by two numbers. The first number is the burst write performance. Burst write performance is the rate at which the storage device can absorb an input stream of data when there is enough room in the cache. Accordingly, burst write performance depends solely on the write performance of the cache, not on the write performance of the main storage area. The second number is the sustained write performance. Sustained write performance is the rate at which the storage device can absorb streams of input data that are much larger than the cache size. The sustained write performance is a function of both cache write performance and main storage area write performance. If the main storage area is much slower than the cache, then the sustained write performance is determined mainly by the main storage area write performance.
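The relationship between the two numbers can be sketched under a simplifying assumption that is not stated in the text: every unit of data is written once to the cache and later copied once to the main storage area, with the two steps performed one after the other. The function names and the example rates are hypothetical.

```python
def burst_rate(cache_rate):
    # With room in the cache, the device absorbs data at the cache write rate.
    return cache_rate


def sustained_rate(cache_rate, main_rate):
    # Assumed serial model: time per unit of data = 1/cache_rate + 1/main_rate.
    return 1.0 / (1.0 / cache_rate + 1.0 / main_rate)


# Hypothetical rates in MB/s: a fast cache and a much slower main storage area.
fast_cache, slow_main = 20.0, 2.0
burst = burst_rate(fast_cache)
sustained = sustained_rate(fast_cache, slow_main)
```

In this model the sustained rate is always below the slower of the two rates, and it approaches the main storage area write rate as the cache becomes much faster than the main storage area.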
Even though a storage device is specified for a given sustained input rate, nothing stops a host from sending data to the storage device at a higher rate than specified and relying on the storage device to raise a “busy” condition to delay the input stream when it cannot keep pace, and to clear the busy status when more input can be received. This is indeed how many real-life hosts operate. The host will send data to be written into the storage device as fast as it can, and continue to do so until the storage device forces it to hold off and wait.
When following this pattern of operation, the typical observed effect will be as follows. Assuming the cache of the storage device starts out empty, the host will first see a high performance equal to the burst write performance. Gradually, the cache will be filled up, but the performance will still be the burst performance up until the point where the cache is completely full or very close to it. At this point, the storage device must raise the busy status and start clearing space in the cache by moving some content to the main storage area. Typically, the busy status will clear only after data corresponding to a full logical block is copied from the cache. As a result, the host might encounter a relatively long busy period that can be a few tenths of a second or even a few seconds if the main storage is slow and the block is large. Although the average performance seen by the host may still be within the advertised sustained write performance rate specifications, some hosts might not be able to handle such long busy periods. This is because a long busy period requires a larger buffer in the host for accumulating all the new data that might be generated in a worst case situation during the busy period.
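As a rough worked example of the effect described above, the busy period and the resulting host buffer requirement can be estimated as follows. The model assumes that the busy period is simply the time needed to copy one full logical block to the main storage area, and all figures (block size, rates) are hypothetical.

```python
def busy_period_seconds(block_mb, main_rate_mb_s):
    # Time to copy one full logical block from the cache to main storage.
    return block_mb / main_rate_mb_s


def host_buffer_mb(host_rate_mb_s, busy_s):
    # Data the host may generate, and must hold, while the device is busy.
    return host_rate_mb_s * busy_s


# Hypothetical figures: a 4 MB logical block, main storage writing at 2 MB/s,
# and a host producing data at 1 MB/s in the worst case.
busy = busy_period_seconds(4.0, 2.0)
buffer_needed = host_buffer_mb(1.0, busy)
```

With these figures the host sees a busy period of 2 seconds and needs about 2 MB of buffer, even though the average throughput may still match the advertised sustained rate.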
In some cache implementations the problem may be even worse because the effective rate of clearing data out of the cache might decrease as the cache becomes fuller. This might happen when the host writes data to random addresses, rather than sequentially, and where the cache uses flash memory organized in large blocks containing many data pages. In such devices not only are the busy periods longer, but the sustained write performance may not be met when the cache continuously operates near its fullest state.
In some cases, there are also hard limits on how long the storage device may indicate a busy status, and violating such a limit might cause the host to abort a transaction. For example, the SecureDigital (SD) standard for flash memory requires an SD-compliant card to always respond to a host write command within 250 milliseconds. If a card does not meet this strict time limitation, a host might terminate the communication session with the card and abort the data storage operation. Thus, getting into an “always full cache” mode of operation can significantly increase the risk of violating such a time limit.
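A simple check against such a limit might look like the sketch below. The 250 millisecond value is the one cited above; the function names, the assumption that busy time equals copy time, and the example figures are hypothetical.

```python
SD_WRITE_BUSY_LIMIT_S = 0.250  # response limit cited in the text above


def flush_violates_limit(flush_mb, main_rate_mb_s, limit_s=SD_WRITE_BUSY_LIMIT_S):
    # Assumes the device stays busy for the whole copy to main storage.
    return flush_mb / main_rate_mb_s > limit_s


def max_flush_mb(main_rate_mb_s, limit_s=SD_WRITE_BUSY_LIMIT_S):
    # Largest amount of data that can be copied without exceeding the limit.
    return main_rate_mb_s * limit_s


# Hypothetical figures: main storage writing at 2 MB/s.
full_block_too_slow = flush_violates_limit(4.0, 2.0)   # a 4 MB block takes 2 s
small_flush_too_slow = flush_violates_limit(0.25, 2.0)  # 0.25 MB takes 0.125 s
largest_safe_flush = max_flush_mb(2.0)
```

Under these assumptions, copying a full 4 MB block in one busy period would exceed the limit, while a flush of up to 0.5 MB would not.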
Cached storage devices generally use a policy in which data is flushed from the cache under one of two conditions. The first condition is when the storage device is idle, where a storage device is defined as idle when it is not receiving data from the host or otherwise being accessed by the host. The second condition is when there is no other way to receive new data because there is no more room in the cache. Such a cache flushing policy can result in the problems explained above. Some cached storage devices provide the host with explicit control over flash operations; however, it is difficult for a host to use such control over the cache operation to avoid the above problems. Doing so requires detailed knowledge and understanding of the internals of the storage device, which is information that a generic host does not have.
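The two-condition flushing policy described above can be expressed as a short decision function. This is only a sketch of the conventional policy as characterized in the text; the function name, the parameters, and the use of abstract size units are hypothetical.

```python
def should_flush(host_idle, cache_used, cache_capacity, incoming=0):
    """Conventional policy: flush only when idle or when new data will not fit."""
    if cache_used == 0:
        return False  # nothing in the cache to flush
    if host_idle:
        return True   # first condition: the host is not accessing the device
    # Second condition: no room left in the cache for the incoming data.
    return cache_used + incoming > cache_capacity


# Hypothetical cache of 100 units that currently holds 90 units.
flush_while_idle = should_flush(True, 90, 100)
flush_small_write = should_flush(False, 90, 100, incoming=5)
flush_large_write = should_flush(False, 90, 100, incoming=20)
```

Note that under this policy a busy host triggers no flushing until the cache can no longer accept the incoming data, which is the situation that leads to the long busy periods discussed above.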