A solid-state drive (SSD) is a data storage device that uses non-volatile solid-state memory to store persistent digitally encoded data. A solid-state drive can be configured to emulate a hard disk drive, i.e., a device that stores persistent digitally encoded data on the magnetic surfaces of rapidly rotating platters, and can be used to replace a hard disk drive in many applications.
A data-source/data-sink device writes data to, and reads data from, a persistent storage device. Logical block addresses identify the storage locations within the persistent storage device to which data is to be written and/or from which data is to be read. Such logical block address (LBA) based addressing schemes are generic in nature and do not take into account considerations specific to the respective persistent storage device in use. Therefore, a persistent storage device, whether a hard disk drive or a solid-state drive, typically includes an interface that translates received logical block addresses to storage-device-specific addresses, and vice versa.
In the case of a solid-state drive, the smallest addressable unit of memory is called an access unit. For example, a single solid-state drive access unit address may point to a physical location of solid-state memory that can store 4 kilobytes (KB) of data. Therefore, an SSD external interface must translate an LBA-based address received in a write instruction or a read instruction to an SSD access unit based address before the received instruction can be acted upon. Further, in the case of a read instruction, the solid-state drive formats data retrieved from the respective access units in a manner consistent with the LBA-based addressing scheme prior to returning the retrieved data to a requesting device.
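For purposes of illustration only, the address translation described above may be sketched as follows. The sketch assumes a 512-byte logical block, which is not stated above, together with the 4 KB access unit of the example. The names `lba_to_access_unit` and `access_units_for_request` are hypothetical, and the result is a logical access unit index rather than a final physical location.

```python
SECTOR_BYTES = 512        # assumed logical block size (not given in the text)
ACCESS_UNIT_BYTES = 4096  # 4 KB access unit, as in the example above
SECTORS_PER_UNIT = ACCESS_UNIT_BYTES // SECTOR_BYTES  # 8 blocks per access unit


def lba_to_access_unit(lba):
    """Return (access unit index, byte offset within that unit) for one LBA."""
    unit_index, sector_in_unit = divmod(lba, SECTORS_PER_UNIT)
    return unit_index, sector_in_unit * SECTOR_BYTES


def access_units_for_request(start_lba, sector_count):
    """Return the access unit indices touched by an LBA-based request."""
    if sector_count <= 0:
        raise ValueError("sector_count must be positive")
    first = start_lba // SECTORS_PER_UNIT
    last = (start_lba + sector_count - 1) // SECTORS_PER_UNIT
    return list(range(first, last + 1))
```

Under these assumptions, a request for four logical blocks starting at LBA 6 straddles two access units, so both units must be read and the requested bytes extracted before the data is returned in LBA form.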
Solid-state drives do not have a spinning magnetic platter or actuator arm as used in hard disk drives. Therefore, solid-state drives are more rugged than hard disk drives and do not have the same operational delays. For example, unlike a hard disk drive, in a solid-state drive, there is no seek time associated with moving the actuator arm over a cylinder associated with a logical block address. Further, unlike a hard disk drive, in a solid-state drive there is no latency associated with rotating a platter to a physical sector/block of data beneath the actuator arm once the actuator arm has positioned itself over a cylinder associated with a logical block address. Because there are no such mechanical movements or delays, solid-state drives typically enjoy low access time, low latency, and low power consumption.
In addition, a solid-state drive typically includes multiple I/O channels that can be operated concurrently, i.e., in parallel, to write data to, or read data from, the solid-state memory included in the solid-state drive. Each I/O channel is capable of independently writing data to and/or reading data from addressable access units that are assigned to the channel, one addressable access unit at a time. Therefore, a write/read strategy that stores data across access units that are assigned to different I/O channels can reduce the latency of write operations by using multiple I/O channels in parallel to write the data to solid-state memory, and can reduce the latency of read operations by using multiple I/O channels in parallel to read the stored data from the solid-state memory. Using such an approach, data received via a high-speed external interface from an external data-source/data-sink device can be quickly written to the solid-state memory included in the solid-state drive. Further, data requested by an external data-source/data-sink device can be quickly retrieved from the solid-state memory included in the solid-state drive via the parallel I/O channels, and delivered to the external data-source/data-sink device via the high-speed external interface.
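By way of a non-limiting sketch, one simple way to store data across access units assigned to different I/O channels is round-robin striping. The channel count of four, the round-robin assignment, and the names `stripe` and `sequential_steps` are assumptions made for illustration; an actual drive may assign access units to channels differently.

```python
ACCESS_UNIT_BYTES = 4096  # 4 KB access unit (example size)
NUM_CHANNELS = 4          # assumed channel count, for illustration only


def stripe(data, num_channels=NUM_CHANNELS, unit=ACCESS_UNIT_BYTES):
    """Split data into access-unit-sized chunks dealt round-robin to channels."""
    queues = [[] for _ in range(num_channels)]
    for offset in range(0, len(data), unit):
        channel = (offset // unit) % num_channels
        queues[channel].append(data[offset:offset + unit])
    return queues


def sequential_steps(queues):
    """Each channel handles one access unit at a time, so the longest
    per-channel queue sets the number of back-to-back transfer steps."""
    return max((len(queue) for queue in queues), default=0)
```

With these assumed values, eight access units of data occupy two slots on each of the four channels and complete in two steps, whereas the same data sent through a single channel would require eight steps.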
However, one drawback associated with solid-state drives is that once an access unit has been written to, the same access unit cannot be written to again until the access unit has been erased. Further, an SSD erase operation requires more time than is required for an SSD access unit write operation and more time than is required for an SSD access unit read operation. In addition, many solid-state drives cannot erase a single SSD access unit, but must erase a group of multiple SSD access units that form a smallest unit of memory that can be erased by the solid-state drive, referred to as an SSD erasable unit. Therefore, not only does an SSD erase operation introduce significant operational delay, but an SSD erase operation may also require valid data from surrounding SSD access units within the same SSD erasable unit to be moved prior to the erase operation, or else be lost.
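The erase-before-write constraint described above may be modeled, purely as an illustrative sketch, as follows. The class `ErasableUnit`, the function `rewrite_in_place`, and the choice of four access units per erasable unit are hypothetical and are not taken from any particular drive.

```python
class ErasableUnit:
    """Toy model of an SSD erasable unit holding several access units."""

    def __init__(self, units_per_block=4):  # block size is an assumption
        self.units = [None] * units_per_block  # None marks an erased unit

    def write(self, index, data):
        # A written access unit cannot be written again until it is erased.
        if self.units[index] is not None:
            raise RuntimeError("access unit must be erased before rewriting")
        self.units[index] = data

    def erase(self):
        # Erasure applies to the whole erasable unit, never to a single unit.
        self.units = [None] * len(self.units)


def rewrite_in_place(block, index, data):
    """Naive update: save the neighbours, erase everything, write it all back."""
    saved = list(block.units)  # valid data that would otherwise be lost
    saved[index] = data
    block.erase()
    for i, value in enumerate(saved):
        if value is not None:
            block.write(i, value)
```

In this model, changing one access unit costs an erase of the entire erasable unit plus a rewrite of every valid neighbour, which is the delay and data-movement penalty noted above.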
As a result of the heavy delay penalties associated with erasing SSD access units, solid-state drives do not execute overwrite instructions in the same manner as a hard disk drive. For example, in response to an instruction from an external data-source/data-sink device to overwrite a portion of previously stored data, a solid-state drive may, instead of erasing and re-writing data in a set of SSD access units, write the modified data to a new set of one or more SSD access units and may then mark the previously written SSD access units as being invalid. Although such an approach requires the use of additional SSD access units, the approach avoids delays that would otherwise be associated with erasing and then rewriting the previously written SSD access units. Further, once a sufficient number of invalidated SSD access units have accumulated, periodic maintenance may be performed during idle processing cycles to erase and reclaim the invalidated SSD access units for reuse.
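A minimal sketch of the write-elsewhere-and-invalidate approach described above is given below. The class `OutOfPlaceStore` and its methods are hypothetical. As a simplifying assumption, the sketch reclaims invalidated access units individually, whereas an actual drive would erase whole SSD erasable units and would first move any valid data they contain.

```python
class OutOfPlaceStore:
    """Toy model of overwriting by redirecting to a fresh access unit."""

    def __init__(self, total_units):
        self.physical = [None] * total_units  # data held by each physical unit
        self.mapping = {}                     # logical unit -> physical unit
        self.invalid = set()                  # stale units awaiting erase
        self.free = list(range(total_units))  # erased units ready for writing

    def write(self, logical, data):
        if not self.free:
            raise RuntimeError("no erased access units; reclaim is needed")
        new_unit = self.free.pop(0)
        self.physical[new_unit] = data
        old_unit = self.mapping.get(logical)
        if old_unit is not None:
            self.invalid.add(old_unit)  # mark the old copy invalid, no erase yet
        self.mapping[logical] = new_unit

    def read(self, logical):
        return self.physical[self.mapping[logical]]

    def reclaim(self):
        """Idle-time maintenance: erase invalidated units and reuse them."""
        reclaimed = sorted(self.invalid)
        for unit in reclaimed:
            self.physical[unit] = None
        self.free.extend(reclaimed)
        self.invalid.clear()
        return len(reclaimed)
```

Writing the same logical unit twice in this model leaves the first physical copy marked invalid and returns immediately; the erase cost is deferred until `reclaim` is called.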
The approach used by a solid-state drive to perform data overwrite/modify operations can significantly impact the efficiency with which data can be retrieved from the solid-state drive in response to a read operation. A segment of data stored to a sequential series of access units is distributed evenly across the respective I/O channels and, therefore, can be accessed with a large degree of parallelism. Up to the number of I/O channels supported by the SSD, the access units can be accessed concurrently, each via a separate I/O channel. However, after one or more overwrite operations, the data on the SSD can become highly fragmented. A segment of data that was originally distributed evenly across the respective I/O channels is then likely to be less evenly distributed across the respective I/O channels. Therefore, data which could originally have been retrieved via a single read request of the parallel I/O channels must instead be retrieved with multiple sequential read requests, thereby increasing the latency associated with reading the data segment from SSD storage.
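The effect of fragmentation on read latency may be illustrated with the following sketch, which counts how many sequential read requests are needed when each I/O channel can deliver one access unit at a time. The four-channel configuration, the modulo assignment of physical access units to channels, the example unit numbers, and the name `read_rounds` are all assumptions made for illustration.

```python
from collections import Counter

NUM_CHANNELS = 4  # assumed channel count, for illustration only


def read_rounds(physical_units, num_channels=NUM_CHANNELS):
    """Number of sequential parallel reads needed to fetch the given units.

    Assumes physical unit n is served by channel n % num_channels and that a
    channel returns one access unit per read request.
    """
    load = Counter(unit % num_channels for unit in physical_units)
    return max(load.values(), default=0)


# A four-unit segment as first written: one access unit per channel.
original = [0, 1, 2, 3]
# Hypothetical layout after two of its units were overwritten and redirected
# to whichever erased units happened to be available (units 4 and 8).
fragmented = [0, 4, 8, 3]

print(read_rounds(original))    # 1
print(read_rounds(fragmented))  # 3
```

In the fragmented layout three of the four access units share channel 0, so the segment that was once readable in a single parallel request now takes three sequential requests.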
The background description provided herein is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent the work is described in this background section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present disclosure.