Data replication techniques enable organizations to protect data from loss, implement disaster recovery, or migrate data between locations. An organization can choose among several replication modes, each with its own advantages and disadvantages.
One popular mode of data replication is active/active replication, in which a network of servers and applications concurrently performs input/output (IO) operations across a virtualized storage layer. In active/active replication, storage nodes in two independent arrays can present to servers as a single storage object with two paths, and multiple nodes in a server cluster can write to both arrays concurrently, with updates from either side synchronously replicated to the peer. This type of replication provides advantages such as continuous availability, because replication operations are not interrupted when one system or node in the network goes down.
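The write path described above can be illustrated with a minimal Python sketch. It assumes a toy in-memory model: the `StorageArray` class, its `write`, `apply_replica`, and `read` methods, and the `pair` helper are hypothetical names introduced only for illustration. The sketch omits locking, failure handling, and networking, all of which a real system would need.

```python
class StorageArray:
    """Hypothetical in-memory stand-in for one side of an active/active pair."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}   # logical block address -> data
        self.peer = None   # the other array in the replication session

    def apply_replica(self, lba, data):
        # Apply an update that originated on the peer array.
        self.blocks[lba] = data

    def write(self, lba, data):
        # Host write: store locally, then replicate to the peer before
        # acknowledging, so both sides hold the update when the call returns.
        self.blocks[lba] = data
        self.peer.apply_replica(lba, data)
        return True

    def read(self, lba):
        return self.blocks.get(lba)


def pair(a, b):
    # Link two arrays so each treats the other as its replication peer.
    a.peer, b.peer = b, a


array_a = StorageArray("A")
array_b = StorageArray("B")
pair(array_a, array_b)

# Two hosts write through different paths; either array accepts writes.
array_a.write(0, b"from host 1")
array_b.write(1, b"from host 2")
```

After these writes, both arrays hold the same blocks, so a read of either address returns the same data regardless of which path serves it.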
However, because of communication failures or software/hardware faults, the two sides in the replication session may lose their connection with each other, and the data between them may fall out of sync. Other failure events include a locking conflict that causes a cluster service to stop. It is important to be able to recover from these failure events and bring the two sides back in sync at minimal cost, both during regular runtime and during recovery.