The present invention relates generally to the field of data disaster recovery, and more particularly to replicating data to a primary server when recovering data.
A relationship between a primary server and a secondary server is considered broken when there is a prolonged (and possibly permanent) failure of any component of the replication relationship. The following component failures cause termination of the replication relationship: 1. When a prolonged network outage occurs, updates at the primary can no longer be pushed to the secondary quickly enough to satisfy the Recovery Point Objective (RPO); 2. When the secondary server fails or is unreachable in other ways (besides link failure), the primary detects the loss of connectivity, similar to case 1; 3. When the primary site fails, the failure is recognized by the applications and the administrator. At this point, the relationship is automatically terminated, and manual intervention is required.
Once a relationship is broken, reestablishing it depends on the type of failure experienced. Failover is when the secondary site fails and a new secondary must be set up. A new relationship needs to be set up between the existing primary and the new secondary, and all of the primary data must be populated at the new secondary. Failback is when the primary site fails and the secondary is promoted to (Acting) Primary (read-write) and receives updates. Once the old primary comes back, the relationship needs to be reestablished. This requires replaying the new updates made at the secondary (Acting Primary) back at the primary site. On a network reconnect after an outage, or when the secondary becomes available again after a failure, the changes made at the primary since the last synchronization need to be replayed at the secondary.
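The three reestablishment cases described above can be illustrated with a minimal Python sketch. It is not drawn from any particular replication product: the `Site` class, the `RecoveryCase` names, the sequence-numbered update log, and the `replay` and `reestablish` helpers are all hypothetical and exist only to show which side replays which updates in each case.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class RecoveryCase(Enum):
    """Hypothetical labels for the three cases in the text."""
    FAILOVER = auto()   # secondary lost, a new secondary is set up
    FAILBACK = auto()   # primary lost, secondary served as acting primary
    RESYNC = auto()     # link or secondary restored, primary kept running


@dataclass
class Site:
    """Toy model of a site: an ordered update log plus the resulting data."""
    name: str
    log: list = field(default_factory=list)    # entries are (seq, key, value)
    data: dict = field(default_factory=dict)

    def apply(self, seq, key, value):
        self.log.append((seq, key, value))
        self.data[key] = value

    def write(self, key, value):
        # Assign the next sequence number after the last logged update.
        seq = self.log[-1][0] + 1 if self.log else 1
        self.apply(seq, key, value)
        return seq


def replay(source, target, last_synced_seq):
    """Apply to target every update that source logged after last_synced_seq.

    Returns the new synchronization point.
    """
    for seq, key, value in source.log:
        if seq > last_synced_seq:
            target.apply(seq, key, value)
    return source.log[-1][0] if source.log else last_synced_seq


def reestablish(case, primary, secondary, last_synced_seq):
    """Return the (primary, secondary) pair after the relationship is rebuilt."""
    if case is RecoveryCase.FAILOVER:
        # A new secondary starts empty, so all primary data is populated.
        new_secondary = Site("new-secondary")
        replay(primary, new_secondary, 0)
        return primary, new_secondary
    if case is RecoveryCase.FAILBACK:
        # Updates taken by the acting primary are replayed at the old primary.
        replay(secondary, primary, last_synced_seq)
        return primary, secondary
    # RESYNC: only changes made at the primary since the last sync are replayed.
    replay(primary, secondary, last_synced_seq)
    return primary, secondary
```

In this sketch, failover replays from sequence 0 (a full copy), while failback and resynchronization replay only the updates after the last synchronization point, in opposite directions.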
In a disaster recovery setup where a primary server has failed and has come back online or been recovered, snapshot data created on the secondary server (acting primary) in the absence of the actual primary needs to be synced to the actual primary server. The time required to move this snapshot data can be significant because the transfer time depends on the storage tier on which the data currently resides.
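As a rough illustration of why the tier matters, the following Python sketch estimates the total transfer time as the sum, over snapshots, of snapshot size divided by the read throughput of the tier holding it. The function name, the tier names, and the throughput figures are all assumptions chosen for the example; they are not measurements and do not come from the description above.

```python
def estimate_transfer_seconds(snapshots, read_mb_per_s):
    """Estimate the time to move snapshots off their current tiers.

    snapshots: iterable of (size_mb, tier_name) pairs.
    read_mb_per_s: mapping from tier_name to assumed read throughput in MB/s.
    """
    return sum(size_mb / read_mb_per_s[tier] for size_mb, tier in snapshots)


# Placeholder values for illustration only, not real device figures.
example_rates = {"ssd": 500.0, "hdd": 100.0, "tape": 20.0}
example_snapshots = [(1000.0, "ssd"), (1000.0, "tape")]

total = estimate_transfer_seconds(example_snapshots, example_rates)
```

With these placeholder numbers, the two snapshots are the same size, yet the one on the slowest tier accounts for nearly all of the estimated time.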