Data deduplication, the process of eliminating redundant data, is becoming an important technology in storage systems. Deduplication reduces the required storage capacity because each unique portion of data is stored only once. In a typical configuration, a disk-based storage system, such as a storage-management server or a VTL (virtual tape library), has the capability to detect redundant data “extents” (also known as “chunks”) and reduce duplication by avoiding the redundant storage of such extents. For example, the deduplicating storage system could divide file A into chunks a-h, detect that chunks b and e are redundant, and store the redundant chunks only once. The redundancy could occur within file A or with other files stored in the storage system.
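The chunk-level behavior described above can be illustrated with a minimal sketch. It is not taken from any particular product: the names `chunk_fixed` and `DedupStore` are hypothetical, and the use of fixed-size chunks identified by SHA-256 digests is an assumption made only for illustration (real systems may use variable-size chunking and other identifiers).

```python
import hashlib


def chunk_fixed(data: bytes, size: int) -> list[bytes]:
    """Split data into fixed-size chunks (assumed chunking scheme)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class DedupStore:
    """Hypothetical store that keeps each unique chunk only once."""

    def __init__(self) -> None:
        # Maps a chunk digest to the chunk bytes; shared by all stored files.
        self.chunks: dict[str, bytes] = {}

    def put(self, data: bytes, chunk_size: int = 4) -> list[str]:
        """Store data and return its recipe: the ordered list of chunk digests."""
        recipe = []
        for chunk in chunk_fixed(data, chunk_size):
            digest = hashlib.sha256(chunk).hexdigest()
            # A chunk already present in the index is not stored again.
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def get(self, recipe: list[str]) -> bytes:
        """Rebuild the original data from its recipe."""
        return b"".join(self.chunks[d] for d in recipe)


# File A as eight 4-byte chunks (a-h); chunks b and e have identical content.
file_a = b"AAAA" + b"BBBB" + b"CCCC" + b"DDDD" + b"BBBB" + b"FFFF" + b"GGGG" + b"HHHH"
store = DedupStore()
recipe_a = store.put(file_a)
```

In this sketch the recipe for file A still lists eight chunks, but the store holds only seven, because the content shared by chunks b and e is kept once. Storing a second file that contains any of the same chunks would add only its new chunks to the store.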
Known techniques exist for deduplicating data objects. However, existing deduplication solutions do not allow sharing of data chunks generated by a deduplication operation, whether that operation executed on the source or on the target. Customers are forced either to deploy an inefficient and incomplete deduplicating appliance or to deploy deduplication on two products that cannot share deduplicated data.