It is known to replicate data from a first data storage system to a second data storage system, the second data storage system being located remotely from the first data storage system. For example, the first data storage system might provide backup, or secondary, storage for one or more host computer systems, and the second data storage system might enable data backed up to the first data storage system to be recovered to a known state in the event that data stored on the first data storage system becomes unavailable. Where the replicated data is not required to be available for immediate online restore purposes, the expense of providing fast (high bandwidth) communications links to remote sites can render replication between the first and second data storage systems over such a link uneconomical compared to known alternative methods, such as the transportation of data on removable storage between sites. Therefore, slower (lower bandwidth), less expensive communications links are sometimes used for replication, whereby the time taken to replicate a specified set of data is longer, and replication is generally planned to be effected within predetermined time limits, for example within eight or twenty-four hours.
Where both the first and second data storage systems employ data deduplication technology, and where, following an initial replication session, subsequent replication sessions contain similar data with relatively few changes, as is often the case with backup data, the subsequent replication can be performed more efficiently: only previously un-replicated data needs to be communicated over the slower link, and previously replicated data can merely be identified over the link using a small-footprint chunk identifier. However, in a situation in which, for example, data on one of the first and second data storage systems becomes unavailable, it can take an undesirably long time to replicate the data from the remaining available data storage system to a replacement data storage system over the slower communications link, because the deduplicated data store of the replacement data storage system will be empty. A similar situation exists when initially replicating to a replication target.
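By way of illustration only, the behaviour described above can be sketched in Python as follows. The sketch is not taken from any particular known system: the fixed chunk size, the use of a SHA-256 digest as the chunk identifier, and the names `DedupStore`, `replicate` and `make_chunk` are all assumptions made for the purpose of the example. It counts the bytes that would cross the link when a chunk identifier is always sent and the chunk data is sent only where the target does not already hold it.

```python
import hashlib

# Assumption: fixed-size chunking. Real deduplicating systems may use
# variable-size, content-defined chunks instead.
CHUNK_SIZE = 4096


class DedupStore:
    """Hypothetical deduplicated data store: chunk identifier -> chunk data."""

    def __init__(self):
        self.chunks = {}

    def has(self, chunk_id):
        return chunk_id in self.chunks

    def put(self, chunk_id, chunk):
        self.chunks[chunk_id] = chunk


def replicate(data, target):
    """Replicate `data` into `target`; return the bytes sent over the link."""
    sent = 0
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        # Assumption: a SHA-256 digest serves as the small-footprint identifier.
        chunk_id = hashlib.sha256(chunk).digest()
        sent += len(chunk_id)  # the identifier always crosses the link
        if not target.has(chunk_id):
            target.put(chunk_id, chunk)
            sent += len(chunk)  # only previously un-replicated data is sent
    return sent


def make_chunk(seed):
    """Build a deterministic, distinct chunk of CHUNK_SIZE bytes for the demo."""
    digest = hashlib.sha256(str(seed).encode()).digest()
    return digest * (CHUNK_SIZE // len(digest))


# A first backup of 100 chunks, and a second backup with one chunk changed.
first_backup = b"".join(make_chunk(i) for i in range(100))
second_backup = b"".join(make_chunk(1000 if i == 5 else i) for i in range(100))

target = DedupStore()
initial_bytes = replicate(first_backup, target)      # empty store: all data sent
subsequent_bytes = replicate(second_backup, target)  # mostly identifiers sent

replacement = DedupStore()                           # empty replacement store
replacement_bytes = replicate(second_backup, replacement)

print(initial_bytes, subsequent_bytes, replacement_bytes)
```

Under these assumptions the subsequent session sends far fewer bytes than the initial session, whereas replication to the empty replacement store costs as much as an initial replication, which is the difficulty noted above.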