Generally, massive storage systems are used to store large quantities of objects in a network environment. These storage systems are typically designed to handle many billions of objects and tens to hundreds of petabytes of data. These storage systems may be implemented in datacenters, storage pools, or storage clusters. As time passes and storage hardware degrades, the quality of the stored objects may degrade, and the objects may become corrupted.
In order to combat this data corruption, a storage system may store redundant copies of an object in the same or redundant datacenters. When the storage system detects a corrupted object, it may repair the object by, for example, replacing the corrupted object with an uncorrupted copy. As redundancy increases, the data durability that a storage system or datacenter can promise also increases.
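As an illustration only, the repair step described above may be sketched as follows. The sketch assumes that a SHA-256 digest of each object was recorded when the object was written and that the redundant copies are reachable through a simple mapping from location to bytes. The names `digest`, `repair_replicas`, and the location labels are hypothetical and are not drawn from any particular storage system.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest used here as the integrity check."""
    return hashlib.sha256(data).hexdigest()


def repair_replicas(replicas: dict[str, bytes], expected: str) -> list[str]:
    """Overwrite every corrupted copy with an uncorrupted one.

    `replicas` maps a (hypothetical) location name to the bytes stored there.
    `expected` is the digest assumed to have been recorded at write time.
    Returns the locations that were repaired.
    """
    # Find any copy that still matches the recorded digest.
    good = next((d for d in replicas.values() if digest(d) == expected), None)
    if good is None:
        # Every redundant copy is corrupted, so nothing can be repaired.
        raise ValueError("no uncorrupted copy available")

    repaired = []
    for location, data in list(replicas.items()):
        if digest(data) != expected:
            replicas[location] = good  # replace the corrupted copy
            repaired.append(location)
    return repaired


original = b"object payload"
expected = digest(original)
replicas = {"dc-a": original, "dc-b": b"object pay1oad", "dc-c": original}
repaired = repair_replicas(replicas, expected)
```

In this sketch a repair is possible only while at least one copy still matches the recorded digest, which is why additional redundancy improves durability.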
In storage systems, corrupted objects may be detected by reading and validating the objects. Attempting to read a corrupted object may result in an error, such as a read error or a parity, checksum, or signature mismatch, and an error handler associated with the read activity can react as needed. For objects that are frequently accessed (read), the storage system may easily be kept apprised of each object's quality. However, many objects stored in the storage system may go unread for extended periods of time, leaving these unread objects susceptible to silent data corruption. If silent data corruption is left undetected or unattended for extended periods of time, the corruption issues may become too numerous or too extensive to repair. For example, all the redundant copies of an object may become corrupted over a period of time. Additionally, the build-up of undetected silent data corruption cases may cause the storage system to fail a service level agreement (SLA) between a client and a data storage service provider. Therefore, storage systems may include mechanisms to scan, check, and preserve the integrity of stored objects.
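By way of a minimal sketch, a scanning mechanism of the kind described above might visit objects that have not been validated recently and compare each one against a stored checksum. The sketch assumes an in-memory store, a SHA-256 checksum recorded at write time, and a per-object timestamp of the last successful validation. `StoredObject`, `scrub`, and `max_age` are hypothetical names chosen for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class StoredObject:
    data: bytes           # bytes currently on the storage medium
    checksum: str         # digest assumed to be recorded at write time
    last_verified: float  # time of the last successful validation


def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def scrub(store: dict[str, StoredObject], now: float, max_age: float) -> list[str]:
    """Validate objects not verified within `max_age`; return corrupted keys."""
    corrupted = []
    for key, obj in store.items():
        if now - obj.last_verified < max_age:
            continue  # recently read or validated, so skip it in this pass
        if checksum(obj.data) != obj.checksum:
            corrupted.append(key)  # silent corruption detected
        else:
            obj.last_verified = now  # object is intact; record the validation
    return corrupted


payload = b"rarely read object"
store = {
    "hot": StoredObject(payload, checksum(payload), last_verified=950.0),
    "cold_ok": StoredObject(payload, checksum(payload), last_verified=0.0),
    "cold_bad": StoredObject(b"rarely read 0bject", checksum(payload), last_verified=0.0),
}
found = scrub(store, now=1000.0, max_age=100.0)
```

Here the recently validated object is skipped, the intact cold object has its validation time refreshed, and the cold object whose bytes no longer match the stored checksum is reported so that it can be repaired.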