Modern computer storage devices, such as non-volatile flash memory devices and solid-state drives (SSDs), are highly sensitive to the effect of write access on the latency of read access cycles. During a write access cycle, read access requests to the same memory space may be blocked by a controlling module (e.g., a flash translation layer), so that read operations issued by a requesting host are delayed and read access latency increases.
Read access latency may be described by a cumulative probability distribution function P(L), where L is a latency value and P(L) is the probability that the latency of a specific read access operation is below L. The term “tail latency” herein refers to an upper bound on the latency experienced by a high percentile of read access operations. For example, the latency L for which P(L)=99% separates the slowest 1 percent of read access operations from the rest, meaning that 99% of read access operations experience a latency below L.
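As an illustration of the definition above, the following minimal sketch estimates tail latency from a set of measured read latencies using the nearest-rank percentile method. The function name `tail_latency`, the microsecond unit, and the sample values are assumptions made for this example only; nearest-rank returns a value L such that at least the requested share of samples is at or below L, which is one common convention and is not prescribed by the text.

```python
import math


def tail_latency(latencies_us, percentile=99.0):
    """Nearest-rank percentile: the smallest observed latency L such that
    at least `percentile` percent of the samples are <= L."""
    if not latencies_us:
        raise ValueError("need at least one latency sample")
    if not 0 < percentile <= 100:
        raise ValueError("percentile must be in (0, 100]")
    ordered = sorted(latencies_us)
    # 1-based rank of the sample that bounds the requested percentile.
    rank = math.ceil(percentile * len(ordered) / 100.0)
    return ordered[rank - 1]


# Hypothetical measurements: 98 fast reads and 2 reads delayed behind a write.
samples = [100] * 98 + [5000] * 2
p50 = tail_latency(samples, 50)  # median latency
p99 = tail_latency(samples, 99)  # 99th-percentile (tail) latency
```

With these hypothetical samples the median stays at the fast-read latency, while the 99th percentile lands on one of the delayed reads.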
Tail latency is a good proxy for the worst-case read access scenario and closely reflects a user's experience of the storage system. Read access blocking therefore has an especially significant effect on the memory device's Quality of Service (QOS) and tail latency.
Read access tail latency is closely tied to the foreground and background write operations that the memory device is handling. Even when applications associated with the storage device are optimized to guarantee sequential write access to the storage, simultaneous read operations may still be blocked by background operations. While this phenomenon may only slightly affect the average read latency, it may increase read tail latency significantly.
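The difference between the effect on average latency and the effect on tail latency can be made concrete with a small numerical sketch. All figures are hypothetical: 1000 reads at 100 microseconds each, compared with the same workload in which 2 percent of the reads are assumed to wait 1000 microseconds behind a background write. The helper `percentile_nearest_rank` is likewise an illustrative name, not part of any described device.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Smallest sample value with at least `pct` percent of samples <= it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100.0)
    return ordered[max(rank, 1) - 1]


def average(samples):
    return sum(samples) / len(samples)


# Hypothetical workloads (latencies in microseconds).
baseline = [100] * 1000                # no read is blocked
blocked = [100] * 980 + [1000] * 20    # 2% of reads wait behind a write

mean_growth = average(blocked) / average(baseline)
p99_growth = (percentile_nearest_rank(blocked, 99)
              / percentile_nearest_rank(baseline, 99))
```

Under these assumed numbers the mean grows by less than a fifth, while the 99th-percentile latency grows tenfold, which is the pattern the paragraph above describes.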
Furthermore, flash translation layers may take measures to reduce the randomness of data storage processes, but they cannot control the randomness of read operations, which are driven by requesting applications and may collide with write access operations.
Some storage devices support internal suspension of a program operation, to service read operations from the same page/block/die. However, the storage device cannot starve a program operation, so a consistent, low read tail latency cannot be guaranteed in the presence of program operations including write access operations.
Some commercially available storage devices include mechanisms for reducing read latency by duplication of stored data, but lack mechanisms for prioritizing and selecting data to be duplicated, and hence are inefficient in the consumption of storage space.
A method for minimizing read access latency, and especially read access tail latency, while maintaining high QOS for read access to memory and storage devices, such as non-volatile memory devices, solid-state drives (SSDs), hard disk drives (HDDs) and the like, is therefore desirable.