Computing systems generate information. It is known in the art to store such information using a plurality of data storage media. It is resource inefficient, however, to store redundant data.
Data deduplication, sometimes referred to as “intelligent compression” or “single-instance storage,” is a method of reducing storage needs by eliminating redundant data. Only one unique instance of the data is actually retained on storage media, such as disk or tape. Redundant data is replaced with a pointer to the unique data copy. For example, a typical email system might contain one hundred instances of the same one megabyte (MB) file attachment. If the email platform is backed up or archived without deduplication, all one hundred instances are saved, requiring one hundred MB of storage space. With data deduplication, only one instance of the attachment is actually stored; each subsequent instance is stored only as a reference to the one saved copy. In this example, a storage demand of one hundred MB could be reduced to only one MB.
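The email attachment example can be illustrated with a minimal sketch. The sketch assumes that duplicates are identified by a SHA-256 digest of their contents, which is one common approach and is not stated in the description above. The class name `SingleInstanceStore`, its methods, and the helper `demo` are hypothetical names introduced only for this illustration, and one MB is taken to be 1024 × 1024 bytes.

```python
import hashlib


class SingleInstanceStore:
    """Toy in-memory store that retains one copy of each unique byte string.

    Assumption: identical content is detected by comparing SHA-256 digests.
    """

    def __init__(self):
        self.blobs = {}     # digest -> the single retained copy of the data
        self.pointers = []  # one digest per saved item, standing in for the data

    def save(self, data: bytes) -> str:
        """Record one saved item; store its bytes only if they are new."""
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blobs:
            self.blobs[digest] = data  # first instance: keep the actual data
        self.pointers.append(digest)   # every instance: keep only a pointer
        return digest

    def load(self, digest: str) -> bytes:
        """Follow a pointer back to the unique data copy."""
        return self.blobs[digest]

    def logical_bytes(self) -> int:
        """Bytes that would be needed if every saved item were stored in full."""
        return sum(len(self.blobs[d]) for d in self.pointers)

    def stored_bytes(self) -> int:
        """Bytes of unique data actually retained."""
        return sum(len(b) for b in self.blobs.values())


def demo():
    """Save one hundred instances of the same one MB attachment."""
    store = SingleInstanceStore()
    attachment = b"x" * (1024 * 1024)  # stand-in for a one MB file attachment
    for _ in range(100):
        store.save(attachment)
    return store.logical_bytes(), store.stored_bytes()
```

Calling `demo()` returns the logical size (one hundred MB) alongside the size actually retained (one MB). The pointer list in this sketch ignores the small per-reference overhead that a real system would also have to store.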
Data deduplication offers other benefits. Lower storage space requirements reduce disk expenditures. More efficient use of disk space also allows data to be retained on disk for longer periods, which keeps recovery time objectives (RTO) favorable over a longer span and reduces the need for tape backups. Data deduplication also reduces the amount of data that must be sent across a wide area network (WAN) for remote backups, replication, and disaster recovery.