Distributed storage systems, such as Ceph, use data replication and/or erasure coding to ensure data availability across drive and storage node failures. Such distributed storage systems can use Solid State Drives (SSDs). SSDs have advantages over more traditional hard disk drives in that data access is faster and does not depend on where the data resides on the drive.
SSDs read and write data in units of a page. That is, to read any data, a whole page is accessed; to write any data, an entire page is written to a free page on the SSD. Existing data is not overwritten in place. Instead, when data is modified, the page holding the old data is marked as invalid and the new data is written to a free page. Thus, pages in SSDs have one of three states: free (available for use), valid (storing current data), and invalid (no longer storing valid data).
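The page states described above can be illustrated with a minimal Python sketch. The class name `TinySSD`, the logical-to-physical `mapping` table, and the first-free-page allocation policy are assumptions made for illustration only; they are not taken from any particular SSD or from the system described here.

```python
from enum import Enum


class PageState(Enum):
    FREE = "free"        # available for use
    VALID = "valid"      # storing current data
    INVALID = "invalid"  # holds stale data, awaiting erase


class TinySSD:
    """Toy model (hypothetical): writes always go to a free page."""

    def __init__(self, num_pages):
        self.states = [PageState.FREE] * num_pages
        self.data = [None] * num_pages
        self.mapping = {}  # logical address -> physical page index

    def write(self, logical_addr, payload):
        # Assumed policy: take the first free page found.
        try:
            new_page = self.states.index(PageState.FREE)
        except ValueError:
            raise RuntimeError("no free pages available") from None
        # The page holding the old copy is invalidated, not overwritten.
        old_page = self.mapping.get(logical_addr)
        if old_page is not None:
            self.states[old_page] = PageState.INVALID
        self.states[new_page] = PageState.VALID
        self.data[new_page] = payload
        self.mapping[logical_addr] = new_page
        return new_page

    def read(self, logical_addr):
        return self.data[self.mapping[logical_addr]]
```

In this sketch, writing the same logical address twice leaves the first physical page invalid and the second valid, which is how invalid pages come to accumulate over time.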
Over time, invalid pages accumulate on the SSD and need to have their states changed to free. But SSDs erase data in units of blocks (each of which includes some number of pages) or superblocks (each of which includes some number of blocks). If the SSD were to wait until all the pages in a block or superblock were invalid before erasing it, the SSD would likely fill up and reach a state wherein no blocks were free and none could be freed. Thus, recovering invalid pages, a process known as garbage collection, can involve moving valid pages from one block to another, so that an entire block (or superblock) can be erased.
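The relocate-then-erase step described above might be sketched in Python as follows. The function names `pick_victim` and `collect_block`, the representation of a block as a list of `(state, data)` tuples, and the greedy choice of the block with the most invalid pages are all assumptions for illustration; real SSD firmware also updates its address mapping and may choose victim blocks differently.

```python
FREE, VALID, INVALID = "free", "valid", "invalid"


def pick_victim(blocks):
    """Assumed greedy policy: choose the block with the most invalid pages."""
    return max(
        range(len(blocks)),
        key=lambda b: sum(state == INVALID for state, _ in blocks[b]),
    )


def collect_block(blocks, victim):
    """Move valid pages out of blocks[victim], then erase the whole block.

    Returns the number of valid pages that had to be relocated.
    """
    moved = 0
    for state, data in blocks[victim]:
        if state != VALID:
            continue
        # Find a free page in any block other than the victim.
        target = next(
            (
                (b, p)
                for b, block in enumerate(blocks)
                if b != victim
                for p, (s, _) in enumerate(block)
                if s == FREE
            ),
            None,
        )
        if target is None:
            raise RuntimeError("no free page outside the victim block")
        b, p = target
        blocks[b][p] = (VALID, data)
        moved += 1
    # Erase operates on the whole block: every page becomes free.
    blocks[victim] = [(FREE, None)] * len(blocks[victim])
    return moved
```

For example, a block holding one valid page and three invalid pages can be reclaimed by copying the single valid page elsewhere and erasing the block, which frees four pages at the cost of one extra page write.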
Erasing blocks or superblocks is time-consuming relative to the time required to perform reads or writes. Further, part or all of the SSD can be unavailable while a block or superblock is being erased. Thus, it can be important to manage when SSDs perform garbage collection. If all SSDs in a distributed storage system were to perform garbage collection at the same time, for example, no data requests could be serviced, rendering the distributed storage system no better (albeit temporarily) than a system with data stored locally and undergoing garbage collection.
A need remains for a way to minimize the impact of garbage collection operations on a distributed storage system.