The present invention relates to flash-based memory systems and, more specifically, to methods and systems for reducing access contention in flash-based memory systems or in other memory systems that exhibit properties similar to those of flash-based memory systems.
Flash memory is a type of non-volatile computer storage that can be electrically erased and reprogrammed. Flash-based storage devices such as solid-state drives (SSDs) have hardware characteristics in which reads and writes are performed in page-sized chunks, typically 2 to 4 KB in size. Erases are performed on full blocks, where a block typically includes 64 to 128 pages. Flash memory includes both NOR and NAND types. Generally, there are two types of NAND flash chips: chips based on single-level cells (SLC) store one bit per cell, whereas chips based on multi-level cells (MLC) maintain multiple voltage levels in order to store more than one bit per cell. A 4 KB page in an SLC-based flash chip has typical read and write times of 25 μs and 600 μs, respectively. Erasing a full block takes significantly longer and can take 7 ms in enterprise-grade flash chips. These read/write/erase characteristics hold irrespective of the workload. In contrast, in hard disk drive (HDD)-based storage systems, the seek time limits the random access performance. In flash-based storage devices, however, cells must first be erased before they can be programmed (e.g., written). Therefore, the common technique used to hide the block erase latency is to always write data out-of-place and to defer the erasing of blocks until garbage collection is initiated. When an erase command is issued, the chip is busy until the operation completes, and no reads or writes can be performed on the chip during this time; this is referred to as a “blocking erase”. The out-of-place write strategy requires a special layer called the Flash Translation Layer (FTL), which maintains the mapping between logical block addresses (LBA) and the actual physical page/block addresses (PBA) in the flash memory.
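The out-of-place write strategy and the LBA-to-PBA mapping described above can be illustrated with a minimal sketch. The class name SimpleFTL, its methods, and the tiny geometry used in the usage note are hypothetical and chosen only for illustration. The sketch also assumes a simplified garbage collector that erases only blocks whose pages are all stale, so no valid data ever has to be relocated; a practical FTL would typically relocate valid pages before erasing.

```python
class SimpleFTL:
    """Toy page-level FTL: out-of-place writes with deferred block erase.

    Hypothetical illustration only; names and geometry are assumptions.
    """

    def __init__(self, num_blocks=4, pages_per_block=64):
        self.num_blocks = num_blocks
        self.pages_per_block = pages_per_block
        self.l2p = {}        # logical block address -> physical page address
        self.data = {}       # physical page address -> stored payload
        self.invalid = set() # physical pages that hold stale data
        # All pages start erased; they are consumed in order.
        self.free = list(range(num_blocks * pages_per_block))

    def write(self, lba, payload):
        """Write to a fresh page and remap the LBA; never overwrite in place."""
        if not self.free:
            raise RuntimeError("no erased pages left; garbage collection needed")
        old = self.l2p.get(lba)
        if old is not None:
            self.invalid.add(old)  # mark the old copy stale, erase it later
        pba = self.free.pop(0)
        self.data[pba] = payload
        self.l2p[lba] = pba
        return pba

    def read(self, lba):
        """Resolve the LBA through the mapping table and return the payload."""
        return self.data[self.l2p[lba]]

    def garbage_collect(self):
        """Erase every block whose pages are all stale; return blocks erased.

        Simplifying assumption: blocks containing valid or unwritten pages
        are skipped, so no data relocation is needed.
        """
        erased = 0
        for block in range(self.num_blocks):
            start = block * self.pages_per_block
            pages = range(start, start + self.pages_per_block)
            if all(p in self.invalid for p in pages):
                for p in pages:
                    self.invalid.discard(p)
                    del self.data[p]
                    self.free.append(p)  # page is erased and reusable
                erased += 1
        return erased
```

As a usage example, with two blocks of two pages each, writing LBA 0 three times places the data in physical pages 0, 1, and 2 in turn. Pages 0 and 1 then hold stale data, so a call to garbage_collect() erases the first block and returns its pages to the free list, while read(0) still returns the most recent payload from page 2.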
Access time to the flash-based storage device can still exhibit delay variations. FIG. 1 illustrates a plot 100 of probability density function versus latency for a simulated prior art system. FIG. 2 illustrates a plot 200 of cumulative distribution function versus latency for a simulated prior art system. FIGS. 1 and 2 illustrate results from a flash simulator in a high-load scenario. In the example, when a block is being erased, a subsequent read request on the same chip might have to wait up to 7 ms to be serviced. Similarly, a read request can be delayed up to 600 μs by an ongoing write request. Such significant delays are not acceptable in certain environments. In addition, certain countries (e.g., Japan) even mandate maximum delay bounds (approximately 5 to 10 ms) for specific IT applications. Hence, the potential total delays may exceed the required delay bounds. Moreover, traditional DRAM memory technologies do not have the same restrictions as flash and hence provide significantly more homogeneous access delays. Flash cache solutions try to replace expensive battery-backed DRAM memory, called NVRAM, with flash. Hence, the characteristics of such a flash cache should be similar to those of the memory technologies it replaces. It is therefore beneficial, but not trivial, to provide such characteristics with flash-based memory technologies.
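The delay arithmetic above can be made concrete with a short sketch. It uses the typical SLC timings given in this background (25 μs read, 600 μs write, 7 ms erase) and assumes, for simplicity, that a read arrives just as a single operation starts on the same chip and that no further requests are queued ahead of it. The function names and the bound values passed to them are hypothetical.

```python
# Typical SLC timings quoted in the background, in microseconds.
READ_US = 25
WRITE_US = 600
ERASE_US = 7000

# Time the chip stays busy for each kind of ongoing operation.
BUSY_US = {"idle": 0, "read": READ_US, "write": WRITE_US, "erase": ERASE_US}


def worst_case_read_latency_us(ongoing):
    """Upper bound on read latency when the read arrives just as `ongoing`
    starts on the same chip (assumption: nothing else is queued)."""
    return BUSY_US[ongoing] + READ_US


def meets_bound(ongoing, bound_us):
    """Return True if the worst-case read latency stays within `bound_us`."""
    return worst_case_read_latency_us(ongoing) <= bound_us
```

Under these assumptions, a read blocked by a write completes within 625 μs, whereas a read blocked by an erase can take 7,025 μs. The latter exceeds a 5 ms bound on its own, and although it fits within a 10 ms bound, it leaves little margin once any additional queueing is taken into account.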