In data storage systems, asynchronous replication may be implemented by identifying groups of independent writes submitted from external server systems at a production site, forming batches of these writes, and applying these batches in sequential order at a disaster recovery (DR) site. Writes are considered independent if they are submitted to the storage system within the same time frame, i.e. none of these writes is completed back to the server before another write in the batch is submitted to the storage system. By applying batches serially, data consistency of the volumes at the DR site is maintained at all times.
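The independence rule above can be illustrated with a minimal sketch. It assumes each write carries a submission time and a completion time; the `Write` record and the `form_batches` function are hypothetical names introduced for illustration only, not taken from any particular storage system. Writes are scanned in submission order, and a new batch is started as soon as a write is submitted at or after the earliest completion already in the current batch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    """Hypothetical record of one host write at the production site."""
    name: str
    submitted: float   # time the write reached the storage system
    completed: float   # time completion was returned to the server


def form_batches(writes):
    """Group writes so that no write in a batch completed before
    another write in the same batch was submitted."""
    batches = []
    current = []
    earliest_completion = float("inf")
    for w in sorted(writes, key=lambda w: w.submitted):
        # Some write in the current batch had already completed (or completed
        # at this instant), so w may depend on it: close the batch.
        if current and w.submitted >= earliest_completion:
            batches.append(current)
            current = []
            earliest_completion = float("inf")
        current.append(w)
        earliest_completion = min(earliest_completion, w.completed)
    if current:
        batches.append(current)
    return batches


writes = [
    Write("A", 0, 5), Write("B", 1, 3), Write("C", 2, 6),
    Write("D", 4, 7), Write("E", 8, 9),
]
for batch in form_batches(writes):
    print([w.name for w in batch])
```

In this example, write D is submitted after B has completed, so it cannot share a batch with B. Treating a tie (submission at the exact instant of a completion) as a dependency is a deliberately conservative choice in this sketch.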
For performance reasons, on a distributed storage system the identification of independent writes must be pessimistic, i.e. fewer writes are treated as independent than are in fact independent. Consequently, each batch may contain only a small number of writes.
At the DR site, which may also be a distributed storage system, the writes in a batch are typically applied to many storage volumes across the storage system. Consequently, the serial processing of batches must be coordinated across the entire distributed storage system, through messaging between the different elements of the distributed system to ensure data consistency. The batches may be identified using some form of serial number, as for instance is disclosed in U.S. Pat. No. 8,468,313 B2.
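As a rough illustration of serial-numbered batches applied in order, the following sketch buffers any batch that arrives early and applies batches strictly by ascending serial number. It is a single-process simplification under stated assumptions: the class `SerialBatchApplier`, its fields, and the `(volume, offset, data)` tuple layout are hypothetical, and it does not reproduce the scheme of the cited patent or the inter-node messaging a real distributed system would need.

```python
class SerialBatchApplier:
    """Applies batches in strict serial-number order (illustrative only)."""

    def __init__(self, first_serial=0):
        self.next_serial = first_serial  # serial number of the next batch to apply
        self.pending = {}                # batches received ahead of their turn
        self.volumes = {}                # volume name -> {offset: data}
        self.applied = []                # serial numbers in the order applied

    def receive(self, serial, writes):
        """Accept a batch; apply it and any now-unblocked successors."""
        self.pending[serial] = writes
        while self.next_serial in self.pending:
            for volume, offset, data in self.pending.pop(self.next_serial):
                self.volumes.setdefault(volume, {})[offset] = data
            self.applied.append(self.next_serial)
            self.next_serial += 1


applier = SerialBatchApplier()
applier.receive(1, [("vol1", 0, "new")])   # arrives early, held back
applier.receive(0, [("vol1", 0, "old")])   # unblocks batches 0 and 1
print(applier.applied, applier.volumes)
```

Because batch 1 is held until batch 0 has been applied, the later write to the same offset wins, which is the ordering guarantee that keeps the DR volumes consistent.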
The throughput of such asynchronous replication is therefore limited by the messaging within the DR site: the number of batches per second that can be processed is limited by the number of batch synchronization messages that can be sent and received at the same time. The performance of the asynchronous replication is also negatively affected by any delay to any write the DR site is processing as part of a batch, because the serialization of batches means that the next batch cannot be started until all the writes from the batch in progress are complete. Delays to writes can happen for many reasons. For example, when a multitasking operating system is used within the storage system, writes may be delayed for hundreds of milliseconds by task preemption alone.
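The effect of a single delayed write can be made concrete with a simple timing model. The model is an assumption made for illustration: each batch takes as long as its slowest write plus a fixed synchronization overhead, and batches run strictly one after another. The function name `serial_replication_time_ms` and all latency figures are hypothetical.

```python
def serial_replication_time_ms(batches, sync_overhead_ms):
    """Total time to apply batches one after another.

    batches: list of batches, each a non-empty list of per-write
             latencies in milliseconds.
    sync_overhead_ms: assumed fixed cost of the batch synchronization
                      messages per batch.
    """
    # A batch finishes only when its slowest write finishes, and the next
    # batch cannot start until then.
    return sum(max(latencies) + sync_overhead_ms for latencies in batches)


# 100 batches of 4 writes, each write taking 1 ms, with 0.5 ms of messaging.
normal = [[1.0, 1.0, 1.0, 1.0] for _ in range(100)]

# Same workload, but one write in one batch is preempted for 200 ms.
stalled = [list(b) for b in normal]
stalled[10][2] = 200.0

print(serial_replication_time_ms(normal, 0.5))   # 150.0
print(serial_replication_time_ms(stalled, 0.5))  # 349.0
```

Under these assumed figures, one delayed write out of 400 more than doubles the total time, because every later batch waits behind it.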
Such a design thus limits maximum throughput. Because this leaves little scope to catch up after a stall, such asynchronous replication designs usually become unstable following a delay to any of the writes.
One approach to solving this problem is to group many batches together and apply them atomically. This may be done using an incremental point-in-time copy. Such a process requires additional storage and tight integration between the replication, point-in-time copy, and any caching features. It also requires the triggering of many point-in-time copies to be coordinated across the whole distributed DR storage system. This means that there will be significant periods during which no further replication writes may be processed because the point-in-time copies are starting. These new replication writes then have to be absorbed in additional storage, or else new writes from the external server systems at the production site have to be delayed. This is far from ideal from a performance perspective.