A Solid State Drive (SSD) typically includes a volatile buffer to hold write data from the host computer system prior to committing that data to a non-volatile memory, such as a flash memory array. In a write back cache implementation of an SSD, the volatile buffer acts as a cache memory where data is always first written to the cache, and only later propagated to the flash memory.
A host typically makes requests in multiples of a logical block, which has a size that is small relative to a physical page of flash memory. For example, a logical block may have a size of 512 bytes. A physical page of flash memory typically has a much larger size, such as a 4K physical page, although a larger page size is also sometimes used. The volatile buffer permits incoming units of host data to be aggregated and written in larger data units (e.g., a page size) to the flash memory.
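The aggregation ratio implied by these example sizes can be worked out directly. The following is a minimal sketch using only the 512-byte and 4K figures given above; the function name `blocks_per_page` is hypothetical and introduced purely for illustration.

```python
LOGICAL_BLOCK_SIZE = 512    # bytes, the example logical block size in the text
PHYSICAL_PAGE_SIZE = 4096   # bytes, the example 4K physical page in the text


def blocks_per_page(page_size=PHYSICAL_PAGE_SIZE, block_size=LOGICAL_BLOCK_SIZE):
    """Return how many logical blocks must be aggregated to fill one page."""
    if block_size <= 0 or page_size % block_size != 0:
        raise ValueError("page size must be a positive multiple of the block size")
    return page_size // block_size
```

With the example sizes, eight logical blocks fill one 4K page; a larger page raises the number of blocks that must be buffered proportionally.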
Typically a flash translation layer (FTL) is provided to emulate a traditional disk storage device that has a block device interface. The FTL manages logical-to-physical device mapping information to provide a block device interface to the host. Logical block addresses are converted by the FTL to logical flash page addresses and further to physical page addresses.
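The two-step translation described above can be illustrated with a toy mapping table. This is a sketch under stated assumptions, not a description of any particular FTL: the class `SimpleFTL`, the 4096-byte logical page size, and the policy of assigning each write the next free physical page are all hypothetical choices made for the example.

```python
LOGICAL_BLOCK_SIZE = 512   # bytes per host logical block (example size)
LOGICAL_PAGE_SIZE = 4096   # assumption: bytes per logical flash page
BLOCKS_PER_LOGICAL_PAGE = LOGICAL_PAGE_SIZE // LOGICAL_BLOCK_SIZE


class SimpleFTL:
    """Toy logical-to-physical mapping; hypothetical, for illustration only."""

    def __init__(self):
        self.l2p = {}              # logical page number -> physical page number
        self.next_free_page = 0    # assumption: pages are allocated sequentially

    def logical_page(self, lba):
        """Step 1: convert a logical block address to a logical page number."""
        return lba // BLOCKS_PER_LOGICAL_PAGE

    def write(self, lba):
        """Map the logical page containing `lba` to a fresh physical page."""
        lpn = self.logical_page(lba)
        ppn = self.next_free_page
        self.next_free_page += 1
        self.l2p[lpn] = ppn        # any previous mapping becomes stale
        return ppn

    def lookup(self, lba):
        """Step 2: resolve `lba` to (physical page, block offset within page)."""
        lpn = self.logical_page(lba)
        return self.l2p.get(lpn), lba % BLOCKS_PER_LOGICAL_PAGE
```

For example, after `ftl.write(9)` a call to `ftl.lookup(10)` resolves to the same physical page at block offset 2, because logical blocks 8 through 15 share one logical page in this sketch.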
Modern FTLs in enterprise SSDs often implement a 4K design, in that the smallest unit of host data that can be localized on the solid state drive is 4 kilobytes (henceforth, such a design is referred to as a 4K FTL, and the 4-kilobyte unit as an FTL slice). Such an FTL slice may represent only a small portion of a total page size. For example, many flash designs implement a page size (where a page is the smallest programmable unit) of 16K or larger (e.g., 32K for a multi-plane write). Thus, to efficiently utilize a flash page, the FTL must buffer data that the host has written in a volatile memory buffer until sufficient data has been aggregated to commit a full flash page.
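The buffering behavior described above can be sketched as follows, assuming the 4K slice and 16K page sizes used as examples in the text. The class `PageAggregator` and its list-based model of flash are hypothetical simplifications; a real FTL would also track mapping updates and handle program failures.

```python
SLICE_SIZE = 4 * 1024     # bytes per FTL slice (4K FTL)
PAGE_SIZE = 16 * 1024     # bytes per flash page (example size from the text)
SLICES_PER_PAGE = PAGE_SIZE // SLICE_SIZE


class PageAggregator:
    """Hypothetical buffer that commits only full flash pages."""

    def __init__(self):
        self.pending = []   # dirty slices held in volatile memory
        self.flash = []     # pages already programmed (non-volatile)

    def write_slice(self, slice_id, data):
        """Buffer one slice; return True if this write completed a page."""
        if len(data) != SLICE_SIZE:
            raise ValueError("data must be exactly one FTL slice")
        self.pending.append((slice_id, bytes(data)))
        if len(self.pending) == SLICES_PER_PAGE:
            # Enough slices have been aggregated: program one full page.
            self.flash.append(list(self.pending))
            self.pending.clear()
            return True
        return False
```

In this sketch every fourth slice triggers a page program, and any slices still in `pending` are dirty data with no non-volatile copy.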
This presents a challenge to the FTL in terms of a tradeoff between performance and data integrity. Consider the situation of so-called dirty data (‘dirty data’ being a term used to refer to data in a cache memory which has been modified but where the modification has not yet been propagated to main memory, which in an SSD would be the non-volatile flash memory array). A sequence of host write commands may be directed to the same cache location. Suppose that a first host write has placed data for an FTL slice in the volatile buffer, and that a second host write command arrives for all or part of the same FTL slice while that data is still being buffered, so that no non-volatile copy of the buffered data exists when the second command arrives. In this situation the host is attempting to overwrite dirty data in the cache.
Two approaches are commonly used in the industry for the FTL to handle an overwrite of dirty data. The first approach is to transfer the new data into the same buffer location that holds the dirty data. This approach has the benefit of being the most efficient in terms of latency, but potentially compromises the integrity of host data, and such a loss of data integrity is deemed unacceptable in many applications. The second approach is to flush the dirty data to media (i.e., to the flash memory) before accepting the new host data. This approach has the benefit of ensuring the integrity of host data, but introduces inconsistent command latencies and is an inefficient use of flash.
Consider the following scenario in which there are three versions of host data “A.” A is the oldest copy that is safely stored in non-volatile storage, A′ is the data written by the host that is dirty and buffered in volatile memory, and A″ is the data written over A′ before A′ has been committed to non-volatile media. There are two common options employed to respond to this scenario, each of which has significant problems. The two options result in a choice between ensuring the integrity of dirty data or maintaining host performance.
First, one option is that the FTL can transfer A″ into the same volatile buffer location where the current dirty data A′ resides. This is the most efficient approach, but runs the risk of corrupting A′ if the transfer of A″ experiences an error. If an error occurs, A′ has now been corrupted and must be discarded. In this situation, the host would expect the SSD to return A′ on a read, but instead would receive A because A′ no longer exists.
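The failure mode of this first option can be illustrated with a small simulation. This is a sketch only: the function `read_after_overwrite`, the `TransferError` exception, and the `fail_after` parameter that injects an error partway through the transfer are all hypothetical, and the data values stand in for A, A′ and A″.

```python
class TransferError(Exception):
    """Hypothetical error raised when a host transfer fails partway."""


def read_after_overwrite(flash_copy, dirty_copy, new_data, fail_after=None):
    """Overwrite the dirty buffer in place and return what a later read sees.

    flash_copy : A, the version safely stored in non-volatile memory
    dirty_copy : A', buffered in volatile memory only
    new_data   : A'', transferred into the same buffer location as A'
    fail_after : byte index at which the transfer fails (None = no failure)
    """
    buffer = bytearray(dirty_copy)          # the buffer currently holds A'
    try:
        for i, byte in enumerate(new_data):
            if fail_after is not None and i == fail_after:
                raise TransferError("transfer of new data failed")
            buffer[i] = byte                # A' is destroyed byte by byte
    except TransferError:
        # The buffer now mixes A' and A'' and must be discarded, so the
        # only remaining intact version is the old flash copy A.
        return bytes(flash_copy)
    return bytes(buffer)                    # transfer succeeded: A''
```

When the transfer fails, the read returns A rather than the A′ the host expects, which is the integrity problem described above.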
Second, another option is that the FTL can flush A′ out to media (i.e., the flash memory) before initiating the transfer of A″. However, the buffer location for the data may correspond to an individual slice, such as a 4K slice in a buffer sized to aggregate a full page of data. If the flush is performed without aggregating an entire page, it results in inefficient operation. Moreover, this is a very inefficient approach because the command for A″ must now wait for a complete page program before the transfer can begin. While this second solution ensures data integrity, it also creates command latency spikes which are unacceptable for enterprise computer system applications. Further, this results in an inefficient use of flash because a full page must be written for potentially only a single FTL slice (e.g., 4K) worth of data.
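The two costs of this second option, added command latency and poor page utilization, can be expressed with simple arithmetic. The function `flush_before_overwrite` below is a hypothetical model, and the timing arguments are placeholders supplied by the caller rather than measured values for any device.

```python
def flush_before_overwrite(slice_size, page_size, program_time_us, transfer_time_us):
    """Model the cost of flushing one dirty slice before accepting new data.

    Returns (page_utilization, command_latency_us), assuming the flushed
    page carries only a single slice of useful data.
    """
    if not 0 < slice_size <= page_size:
        raise ValueError("slice size must be positive and fit within a page")
    # Only one slice of the programmed page holds useful host data.
    page_utilization = slice_size / page_size
    # The new command waits for the page program, then performs its transfer.
    command_latency_us = program_time_us + transfer_time_us
    return page_utilization, command_latency_us
```

For a 4K slice flushed into a 16K page, only a quarter of the programmed page holds host data, and the incoming command's latency grows by the full page program time.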