Disaster recovery procedures for data management enable an organization to recover data after catastrophic events such as fires, floods, hurricanes, and other natural disasters that put locally stored data at risk. To improve the reliability of disaster recovery and meet the stringent recovery time objectives imposed by businesses, organizations are increasingly replicating backups to create an offsite copy of critical data. Backup generation and replication can be made more efficient by applying data deduplication and/or compression, which reduce the storage and transmission bandwidth required for offsite backup replication. In deduplicated storage systems, multiple stored files or objects may contain, and thus reference, the same stored chunk of data; a chunk of data within a deduplication storage system may be referred to as a data segment. Fingerprints and other metadata are generated and maintained for stored data segments, and each segment is stored only once. The metadata associated with the segments is used to link the stored segments to files or objects.
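The fingerprint-and-reference scheme described above can be sketched as follows. This is a minimal illustration, not the system's actual implementation; the class and method names are hypothetical, and SHA-1 is assumed as the fingerprint function (the text does not name one).

```python
import hashlib

class DedupStore:
    """Minimal sketch of a deduplicating segment store.

    Each data segment is fingerprinted; the segment bytes are stored
    only once, and each file is recorded as an ordered list of
    fingerprints that link back to the stored segments.
    """

    def __init__(self):
        self.segments = {}   # fingerprint -> segment bytes (stored once)
        self.files = {}      # file name -> ordered list of fingerprints

    def _fingerprint(self, segment: bytes) -> str:
        # Assumed fingerprint function for illustration.
        return hashlib.sha1(segment).hexdigest()

    def write_file(self, name: str, segments: list) -> None:
        refs = []
        for seg in segments:
            fp = self._fingerprint(seg)
            # Store the segment only if it has not been seen before.
            self.segments.setdefault(fp, seg)
            refs.append(fp)
        self.files[name] = refs

    def read_file(self, name: str) -> bytes:
        # Reassemble the file by following its fingerprint references.
        return b"".join(self.segments[fp] for fp in self.files[name])


store = DedupStore()
store.write_file("a.txt", [b"hello ", b"world"])
store.write_file("b.txt", [b"hello ", b"again"])  # "hello " stored only once
assert store.read_file("a.txt") == b"hello world"
assert len(store.segments) == 3  # three unique segments, not four
```

The same metadata that deduplicates writes also keeps replication cheap: only segments whose fingerprints are absent at the offsite copy need to be transmitted.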
As the amount of data managed by a deduplication storage system grows into the petabyte range, there is a chance that the same fingerprint will be generated for two different data segments. Although such collisions are rare, their likelihood increases with the amount of managed data. To mitigate this problem, a deduplication storage device can compute a checksum of the data segment in addition to the fingerprint and use the checksum and fingerprint in combination to identify the segment. However, computing the checksum of a data segment is an expensive, iterative process that requires substantial compute resources.
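The combined fingerprint-plus-checksum identifier might look like the sketch below. The choice of SHA-1 for the fingerprint and CRC-32 for the checksum is an assumption for illustration; the text does not specify either algorithm.

```python
import hashlib
import zlib

def segment_id(segment: bytes) -> tuple:
    """Identify a segment by its fingerprint AND a checksum.

    Two different segments that happen to collide on the fingerprint
    remain distinguishable by the checksum, so the pair is a stronger
    identifier than the fingerprint alone. The cost is the additional
    pass over the segment data needed to compute the checksum.
    """
    fingerprint = hashlib.sha1(segment).hexdigest()
    checksum = zlib.crc32(segment)  # second, independent pass over the data
    return (fingerprint, checksum)

a = b"some segment data"
b = b"other segment data"
assert segment_id(a) == segment_id(a)  # deterministic for identical data
assert segment_id(a) != segment_id(b)
```

The extra pass over every segment is exactly the overhead the passage describes: the checksum computation touches all of the segment's bytes again, which is why it consumes significant compute resources at scale.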