As most computer users know, data storage is of paramount importance. Different forms of data storage devices have been developed to address different needs. For example, some data storage devices are optimized to allow very rapid read and write accesses, so as not to present a bottleneck to other processing operations involving the data being read from or written to the storage device. Usually, these high speed read/write storage devices can only accommodate a limited amount of data and/or are expensive. Other storage devices are designed to accommodate large volumes of data (e.g., terabytes of data), but operate at much slower speeds. Such devices are usually intended for applications where the cost of high speed storage devices is not justified.
A popular form of storage system is one that uses rapidly rotating disks coated with a magnetic material to store data in the form of magnetically encoded information elements. These so-called hard disk drives (HDD), or simply hard disks, are found in many personal computers and dedicated storage appliances. Hard disks can offer significant available storage space (e.g., on the order of terabytes), but the speed at which data can be read from such devices is limited by physical properties such as the size of the disk(s) on which the data is stored, the speed at which the disk(s) rotate, and the time required for the read head to be maneuvered into the correct position to read the requested information elements (the so-called seek time).
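By way of illustration only, the contribution of rotation speed and seek time to a random read can be estimated with simple arithmetic. The sketch below assumes that, on average, the requested data is half a revolution away from the read head; the 7200 RPM spindle speed and 9 ms seek time are hypothetical figures chosen for the example and are not taken from any particular device.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for the data to rotate under the head: half a revolution."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0


def avg_random_read_ms(rpm: float, seek_ms: float) -> float:
    """Rough positioning time for one random read (data transfer time ignored)."""
    return seek_ms + avg_rotational_latency_ms(rpm)


# Hypothetical drive: 7200 RPM, 9 ms average seek time (assumed values).
print(round(avg_rotational_latency_ms(7200), 2))   # 4.17
print(round(avg_random_read_ms(7200, 9.0), 2))     # 13.17
```

Under these assumptions, each random read spends on the order of ten milliseconds on mechanical positioning alone, regardless of how little data is requested.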
So-called solid state storage devices, typically those that employ flash memory arrays as the storage medium, offer improved read times, in part because there are no moving parts associated with such a device. Write times, however, are often worse than those associated with hard disks because flash arrays can only be written in relatively large units, so-called “erase blocks” (typically 128 KB-512 KB in size), which must be erased and rewritten in their entirety even if only a small amount of data within the block needs to be updated.
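The cost of rewriting an entire erase block for a small update can be illustrated with a short calculation. The sketch below is a simplification that assumes the update falls within a single erase block and that the whole block is rewritten; the 4 KB update size is an assumed value chosen for the example.

```python
KB = 1024


def rewrite_overhead(erase_block_bytes: int, update_bytes: int) -> float:
    """Bytes physically rewritten per byte actually updated, for an
    in-place update confined to one erase block."""
    if not 0 < update_bytes <= erase_block_bytes:
        raise ValueError("update must be non-empty and fit in one erase block")
    return erase_block_bytes / update_bytes


# Assumed 4 KB update against the smallest and largest erase block sizes
# mentioned above.
print(rewrite_overhead(128 * KB, 4 * KB))   # 32.0
print(rewrite_overhead(512 * KB, 4 * KB))   # 128.0
```

In other words, under these assumptions a 4 KB update causes between 32 and 128 times as much data to be rewritten as was actually changed.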
To address the inefficiencies inherent in writes to a flash array, flash memory controllers typically employ a process known as write coalescing. This allows the flash controllers to deliver acceptable performance for random writes (i.e., writes to random, non-sequential addresses within the flash array). Write coalescing uses principles that were first developed for log structured file systems. In essence, this technique bundles together, or coalesces, a group of random writes so that the data associated with those writes is written to a physically contiguous region of flash memory, called a “segment” (in flash, a segment should be an integral multiple of the erase block size).
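A minimal sketch of write coalescing follows. It is not a description of any actual flash controller; the class name, the in-memory lists standing in for flash, and the deliberately tiny segment size of four blocks are all assumptions made for illustration. Random writes are buffered and, once a full segment's worth has accumulated, are written together to one contiguous segment, with a map recording where each logical address now resides.

```python
BLOCKS_PER_SEGMENT = 4  # assumed; unrealistically small so the demo is readable


class CoalescingLog:
    """Hypothetical model: buffers random writes and emits whole segments."""

    def __init__(self):
        self.segments = []   # each segment: list of (logical_addr, data)
        self.location = {}   # logical_addr -> (segment_index, slot)
        self.pending = []    # writes not yet flushed to a segment

    def write(self, logical_addr, data):
        self.pending.append((logical_addr, data))
        if len(self.pending) == BLOCKS_PER_SEGMENT:
            self.flush()

    def flush(self):
        """Write all pending blocks as one physically contiguous segment."""
        if not self.pending:
            return
        segment_index = len(self.segments)
        self.segments.append(list(self.pending))
        for slot, (addr, _) in enumerate(self.pending):
            self.location[addr] = (segment_index, slot)
        self.pending.clear()

    def read(self, logical_addr):
        # The newest copy may still be in the buffer.
        for addr, data in reversed(self.pending):
            if addr == logical_addr:
                return data
        segment_index, slot = self.location[logical_addr]
        return self.segments[segment_index][slot][1]


log = CoalescingLog()
for addr, data in [(907, b"a"), (12, b"b"), (455, b"c"), (3, b"d")]:
    log.write(addr, data)   # four scattered logical addresses

print(len(log.segments))    # 1
print(log.location[455])    # (0, 2)
```

Although the four logical addresses are unrelated, the data lands in a single segment, so the flash array sees one large sequential write rather than four small random ones.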
An associated process performed by the flash controller, known as “garbage collection”, ensures that large segments of the flash array are kept available for the contiguous writes required for proper write coalescing. As an application updates data at arbitrary logical addresses, and those data blocks are written to new physical locations in the flash array, any preexisting versions of the data in previously written portions of the array are marked as “obsolete”, meaning that these versions are no longer needed. Note that the data blocks referred to immediately above are best understood as units for writing to the flash and are different from the erase blocks referred to previously. These data blocks are typically much smaller than the erase blocks, e.g., on the order of 4 KB-8 KB, depending on the flash controller. Herein, the term block, when used by itself, should be understood as referring to these data blocks. The term erase block will be used when referring specifically to erase blocks.
The obsolete blocks tend to be scattered about the flash array, due to the nature of the random updates performed by the application making use of the data. A garbage collection routine running on the flash controller therefore periodically regenerates entire segments by copying the non-obsolete blocks of data in previously written segments of the array into a smaller number of new segments and then erasing the old segments.
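The regeneration of segments can be sketched as follows. This is an illustrative model only: the function name, the tuple layout of (logical address, data, obsolete flag), and the four-block segment size are assumptions, and a real controller would also update its address map and physically erase the old segments.

```python
def collect_garbage(old_segments, blocks_per_segment):
    """Copy the non-obsolete blocks of the old segments into as few new
    segments as possible. Each old block is (logical_addr, data, obsolete)."""
    live = [
        (addr, data)
        for segment in old_segments
        for (addr, data, obsolete) in segment
        if not obsolete
    ]
    return [
        live[i:i + blocks_per_segment]
        for i in range(0, len(live), blocks_per_segment)
    ]


# Three old segments in which repeated updates have left obsolete versions
# (True) scattered among the current ones (False).
old = [
    [(0, "a0", True),  (1, "b0", False), (2, "c0", True),  (3, "d0", True)],
    [(0, "a1", True),  (2, "c1", False), (4, "e0", False), (3, "d1", True)],
    [(0, "a2", False), (3, "d2", False), (5, "f0", True),  (5, "f1", False)],
]

new = collect_garbage(old, blocks_per_segment=4)
print(len(old), "->", len(new))   # 3 -> 2
print(new[0])   # [(1, 'b0'), (2, 'c1'), (4, 'e0'), (0, 'a2')]
```

Here six live blocks out of twelve are repacked into two segments, so once the three old segments are erased, one more segment is available for coalesced writes than before.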
Today, new forms of storage devices that employ both flash memory and hard disks are being marketed. In some instances, the flash memory portion of these devices is being used as a cache for data stored on the hard disk. A cache is generally regarded as a storage area that holds a subset of the data stored on a larger, generally slower, device. Here, the flash memory cache provides faster read access than the hard disk and so data stored in the cache portion of the device can be delivered more rapidly than if the data had to be accessed from the hard disk. Of course, while a flash memory-based cache offers advantages for reads, the problems inherent in random writes must still be addressed.
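The role of such a cache on the read path can be sketched as below. The example is a simplified model, not a description of any marketed device: a dictionary stands in for the hard disk, the class and attribute names are hypothetical, and the least-recently-used eviction policy is an assumption, since the text above does not specify how the cached subset is chosen.

```python
from collections import OrderedDict


class ReadCache:
    """Hypothetical read cache holding a subset of a slower backing store."""

    def __init__(self, backing, capacity):
        self.backing = backing        # stands in for the hard disk
        self.capacity = capacity      # number of blocks the cache can hold
        self.entries = OrderedDict()  # ordered oldest -> most recently used
        self.hits = 0
        self.misses = 0

    def read(self, addr):
        if addr in self.entries:
            self.hits += 1
            self.entries.move_to_end(addr)    # mark as most recently used
            return self.entries[addr]
        self.misses += 1
        data = self.backing[addr]             # slow path: go to the disk
        self.entries[addr] = data
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return data


disk = {n: f"block-{n}" for n in range(10)}
cache = ReadCache(disk, capacity=2)
for addr in [1, 2, 1, 3, 2]:
    cache.read(addr)

print(cache.hits, cache.misses)   # 1 4
```

Only the second read of address 1 is served from the cache; the final read of address 2 misses because that block was evicted when address 3 was brought in, which illustrates that the cache holds only a subset of the data on the larger device.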