The present invention relates to a method and system for efficiently storing a stream of incoming data into a flash memory device, and, more particularly, to a method and system for managing copy operations required for the purpose of such storing.
U.S. Pat. No. 7,149,111 to Lasser et al., entitled “Method of Handling Limitations on the Order of Writing to a Non-volatile Memory”, and incorporated by reference for all purposes as if fully set forth herein, discloses a method and system for storing a stream of incoming data, such as a digital audio stream or a digital video stream, in circumstances where it is not practical or not desirable to write the incoming data immediately into their target location in the memory device. Instead, the method and system of Lasser et al. first store the incoming data into a first location, and later retrieve the data from that first location and write them a second time, this time into their desired target location.
The system of Lasser et al. writes the incoming sectors twice into the non-volatile memory. Therefore, that system must move each sector from its first location to a second location. The time at which such moving is to be done is flexible—the moving may be done close in time to the first writing, or the controller of the memory device may wait for a relatively long time before doing that moving. Still, it is always required, sooner or later, to move each sector of data from its first location to its second location. Following that move, the storage area used for the first writing is no longer needed and can be reclaimed to be used for new data.
It should be noted that the terms “first location” and “second location” do not necessarily refer to physically separate areas of the storage device. The first and second “locations” of different sectors may be intermixed, with no clear boundary between an area used only for first writing and an area used only for a second writing. Furthermore, a unit of storage may serve as a first location for some sectors at one point in time, and as a second location for some other sectors at a second point in time. Typically, however, it is convenient to group “first locations” and “second locations” together, as it simplifies their management. For the sake of simplicity, the explanations below assume such grouping is employed, but this should not be taken to limit the scope of the present invention in any way.
One problem facing the implementer of a memory device such as that of Lasser et al. is how to tell that a certain sector written in a first write operation already has been moved to its second location, so that the physical location where the sector first was written can be reclaimed for new use. If the physical unit of data copied while moving the data from their first location to their second location had been an erase block (the smallest chunk of storage that can be erased in a single erase operation, typically 16 Kbytes to 256 Kbytes), then the solution would have been simple: immediately following the copying of a unit of data into its final location, the controller of the memory device could erase the unit containing the first copy. Unfortunately, this is not the case. The typical unit of data copied while moving the data between their first and second physical locations is a sector (the smallest chunk of data exchanged between a host and a storage device, typically 512 bytes) or a page (the smallest chunk of physical storage that can be written in a single write operation into the storage device, typically 512 bytes to 2 Kbytes) or a small number of sectors or pages. When a chunk of data is copied to its new place, it is typically the case that other chunks of data in the same erase block have not yet been copied and therefore must still be kept. Erasing the block containing the copied data would destroy those un-copied chunks of data, and therefore should not be done.
In the explanations below it is assumed (for the sake of simplicity) that data are exchanged with the host and copied between physical storage locations in chunks equal to sectors. This in no way limits the scope of the present invention, which is fully applicable to all sizes of data chunks, including the case of variable sizes in which successive operations use different data chunk sizes.
Prior art flash management systems usually employ a technique of “delete marks” to mark chunks of data as no longer valid and/or no longer required to be kept. Such a mark is a logical flag of typically a single bit or a few bits, located in the overhead area associated with the sector data. For example, when a sector stored in a flash memory device is over-written by new data received from the host, the flash management system, in addition to storing the new data in a newly allocated location, also writes a delete mark into the overhead area associated with the old copy of the sector. This delete mark indicates that the data stored in that sector are not valid any more and can be erased if necessary. Typically, the flash management system checks (either periodically or upon a need for more free space) whether all the sectors of an erase block are marked as deleted. If this is the case, the block contains no useful data and is erased and reused for new data. One example of a flash management system employing this method of marking over-written sectors is taught in U.S. Pat. No. 5,404,485 to Ban. An example of a flash management system using the overhead area provided for each page in a NAND flash storage device (called “extra area” or “spare area”) for storing control and management fields is taught by Lasser in U.S. Pat. No. 6,678,785. Both these patents are incorporated by reference for all purposes as if fully set forth herein.
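The delete-mark bookkeeping described above can be illustrated by the following minimal Python sketch. It is not the implementation of either cited patent. The names Sector, DeleteMarkStore, write, read and reclaim, and the block and sector counts, are hypothetical and chosen only for illustration. The sketch assumes one delete flag per sector and erases a block only when every sector in it carries that flag.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Sector:
    data: Optional[bytes] = None  # None models an erased, unwritten sector
    deleted: bool = False         # the "delete mark" held in the overhead area


class DeleteMarkStore:
    """Toy model of a flash device managed with delete marks (hypothetical)."""

    def __init__(self, num_blocks: int = 2, sectors_per_block: int = 4) -> None:
        self.blocks: List[List[Sector]] = [
            [Sector() for _ in range(sectors_per_block)] for _ in range(num_blocks)
        ]
        # Logical sector number -> (block index, sector index) of the valid copy.
        self.where: Dict[int, Tuple[int, int]] = {}

    def _allocate(self) -> Tuple[int, int]:
        for b, block in enumerate(self.blocks):
            for i, sector in enumerate(block):
                if sector.data is None:
                    return b, i
        raise RuntimeError("no free sector; a real system would reclaim space here")

    def write(self, logical: int, data: bytes) -> None:
        # New data always go to a newly allocated location.
        b, i = self._allocate()
        self.blocks[b][i].data = data
        # The old copy, if any, receives a delete mark instead of being erased.
        old = self.where.get(logical)
        if old is not None:
            self.blocks[old[0]][old[1]].deleted = True
        self.where[logical] = (b, i)

    def read(self, logical: int) -> Optional[bytes]:
        b, i = self.where[logical]
        return self.blocks[b][i].data

    def reclaim(self) -> List[int]:
        # Erase only blocks in which every sector is marked as deleted.
        erased = []
        for b, block in enumerate(self.blocks):
            if all(sector.deleted for sector in block):
                self.blocks[b] = [Sector() for _ in block]
                erased.append(b)
        return erased
```

In this sketch, over-writing one sector of a full block leaves the block non-reclaimable, because the remaining sectors still hold valid data. Only after every sector of the block has been over-written does reclaim() erase it.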
One would expect that this method of “delete marks” could be extended to provide a solution to the problem presented above of telling whether data written to a first physical location had already been copied to a second physical location. Upon copying a sector of data, the flash management system would write a “copied mark” into the overhead area associated with that sector, and when all the sectors in a block are found to have this mark, the block can be erased and reused. This method of “copied marks” is referred to herein as the “marks method” or the “marking method”.
Unfortunately, there are two principal disadvantages to using this marking method. The first disadvantage has to do with write performance. In NAND flash devices, which are the most common flash type for data storage, data can only be written in pages. Even if one wants to write a single byte of data, the internal write operation of the device takes the same time as when writing a full page. Some time is saved when writing a single byte, because only one byte must be transferred over the bus and into the device, compared to, for example, 512 bytes when a full page is written. But in a flash device the internal write operation is typically much slower than the data transfer over the bus and into the device, and therefore the time it takes to write the mark is close to the time it takes to write the full sector of data. So by having to add a write operation of a mark for each sector of data, we spend significantly more time for each sector stored.
Consider the following numerical example. Assume we are using a NAND flash device with pages of 512 bytes. Assume the write time of a flash page is 200 microseconds, the read time of a page is 15 microseconds, and the transfer of a full page over the bus into or out of the device takes 30 microseconds. Without using the marks, each sector of data is first moved over the bus into the flash device, then written into a first location, then read back over the bus, then moved again over the bus into the device, and finally written into a second location. In all, this sums to two write operations, one read operation and three bus transfers: 2×200 + 1×15 + 3×30 = 505 microseconds for each sector stored. But when a mark is to be written into the first location after the above sequence, in order to indicate that the sector was already copied, we spend an additional 200 microseconds, bringing the total to 705 microseconds (and this while ignoring the small amount of time it takes to move the mark over the bus). So we see that there is a significant increase in the time spent per sector stored when marks are used. In this example it is approximately 40% slower to store a sector when using marks. If we take a Multi-Level Cell (MLC) NAND flash device, which has a slow write time close to 1 millisecond (and assuming all other parameters remain the same), the effect of adding the marks is to increase the time per sector stored from approximately 2100 microseconds to approximately 3100 microseconds, almost 50% slower. This is a great disadvantage of the marks method.
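The arithmetic of this numerical example can be reproduced with the short Python sketch below. The function names time_per_sector_us and slowdown_percent are hypothetical. The sketch uses only the figures stated above and, like the example, ignores the time needed to move the mark itself over the bus.

```python
def time_per_sector_us(write_us: float, read_us: float, transfer_us: float,
                       with_mark: bool) -> float:
    """Time to store one sector through a first and a second location."""
    writes = 3 if with_mark else 2  # first write, second write, optional mark write
    reads = 1                       # read back from the first location
    transfers = 3                   # host to device, read back out, back into the device
    return writes * write_us + reads * read_us + transfers * transfer_us


def slowdown_percent(write_us: float, read_us: float = 15.0,
                     transfer_us: float = 30.0) -> float:
    """Extra time per sector, in percent, caused by writing the mark."""
    base = time_per_sector_us(write_us, read_us, transfer_us, with_mark=False)
    marked = time_per_sector_us(write_us, read_us, transfer_us, with_mark=True)
    return 100.0 * (marked - base) / base


if __name__ == "__main__":
    for label, write_us in (("200 us write", 200.0), ("1 ms write (MLC)", 1000.0)):
        base = time_per_sector_us(write_us, 15.0, 30.0, with_mark=False)
        marked = time_per_sector_us(write_us, 15.0, 30.0, with_mark=True)
        print(f"{label}: {base:.0f} us without marks, {marked:.0f} us with marks, "
              f"{slowdown_percent(write_us):.1f}% slower")
```

With a 200 microsecond write time the sketch gives 505 and 705 microseconds (about 39.6% slower), and with a 1 millisecond write time it gives 2105 and 3105 microseconds (about 47.5% slower), matching the rounded figures in the text.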
The second major disadvantage of the marks method has to do with Partial Page Programming (PPP). PPP is a characteristic of a flash device that determines how many write operations are allowed to be made into a page before the block containing the page has to be erased. Typical values in commercially available NAND flash devices are between three and eight. However, MLC NAND devices have a PPP value of only one, which means we can only write once into a page before we have to erase the unit containing it. A flash management system may need to use some write operations for writing control fields associated with pages and erase blocks of a flash device where those fields are required for supporting the management algorithms, and therefore a flash management system may consume some of the available PPP operations.
Using “copied marks” consumes one write operation for each page used for the first storage of incoming sectors. This means there is one less write operation available for the flash management algorithms, and this puts limitations on the type of algorithms that may be employed. But it is even worse for flash devices having a PPP of one, such as MLC NAND. As stated above, such devices allow only a single write operation into a page. Therefore it is simply impossible to use the marks method in this case: the single allowed write operation must be used for writing the stored data, and when the time comes later (after the data were copied) to mark the page as copied, the device does not permit the write operation needed for writing the mark.
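The PPP constraint on the marks method can be illustrated by the following minimal Python sketch. The Page class, the PageWriteError exception and the store_then_mark helper are hypothetical names introduced only for illustration. The sketch assumes that each write into a page, whether of data or of a mark, consumes one of the page's allowed write operations, and it models an attempt to exceed the limit as an error.

```python
from typing import Optional


class PageWriteError(Exception):
    """Raised when a page has no write operations left before an erase."""


class Page:
    """Toy model of a flash page with a Partial Page Programming limit."""

    def __init__(self, ppp_limit: int) -> None:
        self.ppp_limit = ppp_limit
        self.writes_used = 0
        self.data: Optional[bytes] = None
        self.copied = False  # the "copied mark" in the overhead area

    def _consume_write(self) -> None:
        if self.writes_used >= self.ppp_limit:
            raise PageWriteError("PPP limit reached; the block must be erased first")
        self.writes_used += 1

    def write_data(self, data: bytes) -> None:
        self._consume_write()
        self.data = data

    def write_copied_mark(self) -> None:
        # Writing the mark is a separate write operation into the same page.
        self._consume_write()
        self.copied = True

    def writes_remaining(self) -> int:
        return self.ppp_limit - self.writes_used


def store_then_mark(ppp_limit: int, data: bytes = b"sector") -> bool:
    """Return True if the page can take both the data and, later, the mark."""
    page = Page(ppp_limit)
    page.write_data(data)
    try:
        page.write_copied_mark()
    except PageWriteError:
        return False
    return True
```

Under these assumptions, store_then_mark(3) succeeds but leaves only one write operation for the management algorithms, whereas store_then_mark(1), which models a device with a PPP of one, fails when the mark is written.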
We therefore conclude that the marks method for identifying pages as already copied from their first physical locations to their second physical locations has great disadvantages, and in some cases (e.g. MLC NAND) is not even possible to use.
There is thus a widely recognized need for, and it would be highly advantageous to have, a method that can identify copied sectors of a non-volatile storage device without degrading the write performance of the memory device and without consuming an additional write operation.