The present invention relates generally to the field of data storage, and more particularly to maintaining data integrity in deduplicated block storage environments.
Data integrity practices help maintain and ensure the accuracy of data in a storage system. These practices include techniques to prevent data corruption. One form of data corruption is data degradation, also known as data decay or data rot, which is the gradual corruption of stored data as the underlying storage media decay over time. Data integrity practices also include ensuring that data is recorded as received and that, upon later retrieval, the data is the same as it was when it was originally recorded. In general, data integrity practices aim to prevent unintentional changes to information.
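By way of illustration only, one common way to confirm that retrieved data matches the data as originally recorded is to store a checksum alongside the data and recompute it on retrieval. The following minimal sketch assumes a SHA-256 digest; the function names `record` and `is_intact` are hypothetical and are not drawn from any particular storage system.

```python
import hashlib


def record(data: bytes) -> tuple:
    """Return the data together with a digest computed at write time."""
    return data, hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, stored_digest: str) -> bool:
    """Recompute the digest at read time and compare it to the stored one."""
    return hashlib.sha256(data).hexdigest() == stored_digest


# Usage: a single flipped bit (simulating data rot) is detected on retrieval.
stored, digest = record(b"payload")
corrupted = bytes([stored[0] ^ 0x01]) + stored[1:]
print(is_intact(stored, digest))
print(is_intact(corrupted, digest))
```

A check of this kind can only detect an unintentional change; recovering the original data requires a redundant copy or an error-correcting code.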
Data deduplication is a data compression technique for eliminating duplicate copies of repeating data. Data deduplication improves storage utilization and therefore lowers the storage capacity required for a given set of data. In the deduplication process, unique blocks of data are identified and stored during a process of analysis. As the analysis continues, other blocks are compared to the stored copies. Whenever a match is identified, the redundant block is replaced with a small reference, e.g., a pointer, to the stored block.
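The deduplication process described above can be sketched, in a non-limiting way, as follows. The sketch assumes fixed-size blocks of 4096 bytes and assumes that blocks are compared by SHA-256 digest rather than byte by byte; the class name `DedupStore` and its methods are hypothetical and serve only as an illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, in bytes


class DedupStore:
    """Toy in-memory block store that keeps one copy of each unique block."""

    def __init__(self):
        # Maps a block's digest to the single stored copy of that block.
        self.blocks = {}

    def write(self, data: bytes) -> list:
        """Split data into blocks and return one reference per block."""
        refs = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            # Store the block only if no identical block is already stored;
            # a redundant block contributes nothing but its reference.
            self.blocks.setdefault(digest, block)
            refs.append(digest)
        return refs

    def read(self, refs: list) -> bytes:
        """Rebuild the original data by following each reference."""
        return b"".join(self.blocks[ref] for ref in refs)


# Usage: three logical blocks are written, but only two are stored.
store = DedupStore()
data = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"A" * BLOCK_SIZE
refs = store.write(data)
print(len(refs), len(store.blocks))
```

Because several references may resolve to one stored block, corruption of that single block affects every piece of data that refers to it, which is why data integrity is of particular concern in deduplicated storage.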