1. Field of the Invention
The present invention relates generally to techniques for optimizing random writes for flash disks.
2. Description of the Related Art
The recent commoditization of Universal Serial Bus (USB)-based flash disks, mainly used in digital cameras, mobile music/video players, and cell phones, has many pundits and technologists predicting that flash memory-based disks will become the mass storage of choice on mainstream laptop computers in two to three years. In fact, some ultra-mobile Personal Computers (PCs) already use flash disks as their only mass storage device. Given the superior performance characteristics and enormous economies of scale behind flash disk technology, it appears inevitable that flash disks will replace magnetic disks as the main persistent storage technology, at least in some classes of computers.
Compared to magnetic disks, flash disks consume less power, occupy less space, and are more reliable because they do not include any moving parts. Moreover, flash disks offer superior latency and throughput because they work similarly to a Random Access Memory (RAM) chip and do not incur any head-positioning overhead. However, existing flash disk technology has two major drawbacks that render it largely a niche technology at this point.
First, flash disk technology is still quite expensive compared to magnetic disks.
Second, flash disk performance is better than that of a magnetic disk only when the input workload consists of sequential reads, random reads, or sequential writes. Under a random write workload, flash disk performance is comparable to that of a magnetic disk, at best, and in some cases actually worse. The random write performance problem of flash disks is rooted in the way flash memory cells are modified, and thus cannot be easily addressed.
A flash memory chip is typically organized into a set of Erasure Units (EUs) (typically 256 Kbytes each). Each EU is the basic unit of erasure and in turn consists of a set of 512-byte sectors, which are the basic units of read and write. After an EU is erased, subsequent writes to any of its sectors can proceed without triggering an erasure as long as their target addresses are disjoint. That is, after a sector is written to and before it can be written to a second time, the sector must first be erased, which means erasing the entire EU that contains it. Because of this peculiar property of flash memory, random writes to a storage area mapped to an EU may trigger repeated copying of the storage area to a free EU and erasing of the original EU holding the storage area, resulting in significant performance overhead.
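The erase-before-rewrite constraint described above can be illustrated with a minimal sketch. The class name `ErasureUnit`, its method, and its counters are hypothetical and chosen for illustration only; the sizes follow the typical figures given above, and the model simply counts a copy-and-erase event rather than simulating a real device.

```python
SECTOR_SIZE = 512                          # bytes, basic unit of read and write
EU_SIZE = 256 * 1024                       # bytes, typical Erasure Unit size
SECTORS_PER_EU = EU_SIZE // SECTOR_SIZE    # 512 sectors per EU


class ErasureUnit:
    """Toy model (hypothetical) of the storage area mapped to one EU."""

    def __init__(self):
        # True for each sector written since the last erasure.
        self.written = [False] * SECTORS_PER_EU
        self.erase_count = 0       # copy-and-erase events triggered so far
        self.sectors_copied = 0    # live sectors moved to a free EU

    def write(self, sector):
        if self.written[sector]:
            # Rewriting a sector: the other live sectors are copied to a
            # free EU and the original EU is erased as a whole.
            self.sectors_copied += sum(self.written) - 1
            self.erase_count += 1
        self.written[sector] = True


eu = ErasureUnit()
for s in (7, 3, 100):      # disjoint target addresses: no erasure needed
    eu.write(s)
disjoint_erases = eu.erase_count

eu.write(3)                # second write to sector 3 forces a copy and erase
rewrite_erases = eu.erase_count
```

In this sketch, the three disjoint writes complete without any erasure, while the single rewrite of sector 3 costs one erasure plus a copy of the two other live sectors.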
Flash disks are typically produced with a Flash Translation Layer (FTL), which is implemented in firmware. The FTL maps logical disk sectors, which are exposed to the software, to physical disk sectors, and performs various optimizations such as wear leveling, which equalizes the physical write frequency of the EUs. This logical-to-physical map requires 64 million entries in order to keep track of individual 512-byte sectors on a 32-GB flash disk. To reduce this map's memory requirement, flash disks increase the mapping granularity, sometimes to the level of an EU. As a result of this coarser mapping granularity, two temporally separate writes to the same mapping unit, e.g., an EU, will trigger a copy and erasure operation if the target address of the second write is not greater than that of the first write, because a flash disk cannot always tell whether a disk sector in an EU has already been written to. That is, if the Nth sector of a mapping unit is written to, any attempt to write to a sector whose sector number is less than or equal to N will require an erasure, even if the target sector itself has not been written to at all. Consequently, coarser mapping granularity further aggravates the random write performance problem of flash disks.
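The map-size arithmetic and the coarse-granularity rule above can be sketched as follows. This is a minimal illustration, assuming that 32 GB means 32 x 2^30 bytes and that the mapping unit is one 256-Kbyte EU; the function name `requires_erasure` and the variable names are hypothetical.

```python
SECTOR_SIZE = 512                # bytes per sector
EU_SIZE = 256 * 1024             # bytes per Erasure Unit (assumed mapping unit)
DISK_SIZE = 32 * 2**30           # 32 GB, assumed to mean 32 * 2^30 bytes

# Map entries needed at sector granularity versus EU granularity.
sector_map_entries = DISK_SIZE // SECTOR_SIZE    # 64 * 2^20, i.e. "64 million"
eu_map_entries = DISK_SIZE // EU_SIZE            # 128 * 2^10


def requires_erasure(last_written, target):
    """Coarse-granularity rule from the text: once sector N of a mapping
    unit has been written, a write to any sector numbered <= N requires
    an erasure, whether or not that sector itself was ever written."""
    return target <= last_written


ascending = requires_erasure(last_written=10, target=11)    # later sector
descending = requires_erasure(last_written=10, target=4)    # earlier, untouched sector
same = requires_erasure(last_written=10, target=10)         # true rewrite
```

Under these assumptions, moving from sector-level to EU-level mapping shrinks the map by a factor of 512, at the cost that a write to sector 4 after sector 10 is treated the same as a true rewrite.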