Contemporary cloud-based storage systems such as Dell EMC® Elastic Cloud Storage (ECS™) service stored data in numerous ways for data protection and efficiency. One commonly practiced maintenance service performs garbage collection to reclaim storage space that was formerly used for storing user data, but is no longer in use.
In ECS™, user data are stored in chunks. There are real chunks comprising actual data stored in the ECS™ storage, and “virtual” chunks comprising user data that are referenced in the ECS™ storage but actually reside in one or more other storage systems. Over time, these virtual chunks are migrated into real chunks in the ECS™ storage system via a technique known as pull migration.
Pull migration runs in parallel with normal data traffic handling, and the migration service therefore needs to be throttled down. As a result, the migration process normally takes many months; for large clusters it may take over a year, or even multiple years, to complete. Until migrated, the data referenced by the virtual chunks are not covered by the ECS™ storage protection schemes, such as replication and erasure coding, which protect user data at the chunk level.
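By way of illustration only, the distinction between real and virtual chunks and the effect of throttled pull migration can be sketched as follows. This is a minimal sketch under stated assumptions and not the ECS™ implementation: the names `Chunk`, `pull_migrate`, `fetch`, `max_chunks_per_pass` and `pause_seconds` are hypothetical, and the throttle is modeled simply as a per-pass cap plus a pause between pulls.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Chunk:
    """Hypothetical chunk record: real if it holds data, virtual if it only references it."""
    chunk_id: str
    data: Optional[bytes] = None            # populated for real chunks
    source_location: Optional[str] = None   # populated for virtual chunks

    @property
    def is_virtual(self) -> bool:
        return self.data is None

    @property
    def is_protected(self) -> bool:
        # Assumption: chunk-level protection (replication, erasure coding)
        # applies only once the data resides in a real chunk.
        return not self.is_virtual


def pull_migrate(
    chunks: Iterable[Chunk],
    fetch: Callable[[str], bytes],
    max_chunks_per_pass: int,
    pause_seconds: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Convert up to max_chunks_per_pass virtual chunks into real chunks.

    The cap and the pause stand in for throttling, so that migration
    yields to normal data traffic. Returns the number of chunks migrated.
    """
    migrated = 0
    for chunk in chunks:
        if migrated >= max_chunks_per_pass:
            break
        if not chunk.is_virtual:
            continue
        # Pull the referenced data from the other storage system.
        chunk.data = fetch(chunk.source_location)
        chunk.source_location = None
        migrated += 1
        sleep(pause_seconds)
    return migrated
```

In this sketch, any virtual chunk left over after a throttled pass remains unprotected, which mirrors the exposure window described above: the tighter the throttle, the longer the referenced data stay outside the chunk-level protection schemes.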