The present application relates generally to an improved data processing apparatus and method and more specifically to mechanisms for server based disaster recovery by making use of dual write responses.
Replication works in two ways: synchronous and asynchronous. Block level replication happens between two storage controllers or storage virtualization appliances. Generally, a data center has a primary site and a secondary site, which is the disaster recovery site, for storing data. The primary site hosts the live data used by the servers and applications. A replication solution is implemented between the primary site and the secondary site. The replication enables failing over the access path to the replicated storage at the secondary site in response to failure of the primary storage.
Asynchronous replication does not guarantee availability of the most recent data at the secondary site. Consequently, some data loss is generally encountered when failing over to the secondary site. The Recovery Point Objective (RPO) specifies how much recently written data may be missing from the secondary site at the time of a failover, and is commonly expressed as a span of time. The lower the RPO, the less data is lost at the time of a failover.
One solution for reducing RPO is to increase the frequency of the data copy from the primary site to the secondary site. The disadvantage of this solution is that the primary site spends more of its time copying data, and communication between the host and the primary site suffers.
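The trade-off between copy frequency and load on the primary site may be illustrated with a minimal sketch. The model below is an assumption made for illustration only: writes arrive at a constant rate, each copy cycle transfers the data written since the previous cycle, and each cycle carries a fixed setup overhead. The function names and parameters are hypothetical and are not taken from any particular replication product.

```python
def worst_case_loss_mb(write_rate_mb_s: float, copy_interval_s: float) -> float:
    """Data written since the last completed copy, which would be lost
    if the primary site failed just before the next copy cycle."""
    return write_rate_mb_s * copy_interval_s


def primary_busy_fraction(
    write_rate_mb_s: float,
    copy_interval_s: float,
    link_mb_s: float,
    cycle_overhead_s: float,
) -> float:
    """Fraction of time the primary site spends copying data.

    Assumed model: each cycle costs a fixed overhead plus the time needed
    to push one interval's worth of writes over the inter-site link.
    """
    transfer_s = (write_rate_mb_s * copy_interval_s) / link_mb_s
    return (cycle_overhead_s + transfer_s) / copy_interval_s


# Shortening the interval lowers the worst-case loss but raises the
# share of time the primary site spends on replication.
for interval_s in (300.0, 60.0):
    loss = worst_case_loss_mb(10.0, interval_s)
    busy = primary_busy_fraction(10.0, interval_s, 100.0, 5.0)
    print(f"interval={interval_s:.0f}s loss={loss:.0f}MB busy={busy:.3f}")
```

Under these assumed figures, moving from a 300 second interval to a 60 second interval reduces the worst-case loss by a factor of five, while the fixed per-cycle overhead makes the primary site busier.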
Another solution for reducing RPO is to procure high end storage at the primary site, similar high end storage at the secondary site, high performing switches, and high bandwidth links between the primary site and the secondary site. Together, these require significant investment, so this solution is not economical.
Yet another solution for reducing RPO is to change the replication method from asynchronous to synchronous. This introduces multiple new requirements, such as a higher bandwidth link between the primary site and the secondary site, and adds latency to server input/output (IO), because the server must wait for an acknowledgement (ACK) from both the primary site and the secondary site for each write.
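The added latency may be illustrated with a minimal sketch. It assumes, for illustration only, that the primary site commits the write locally and then forwards it to the secondary site, returning the ACK to the server only after the secondary site has acknowledged. Implementations that write to both sites in parallel would behave differently. The function names and the example figures are hypothetical.

```python
def async_write_latency_ms(local_write_ms: float) -> float:
    """Asynchronous replication: the server is acknowledged as soon as
    the primary site has committed the write."""
    return local_write_ms


def sync_write_latency_ms(
    local_write_ms: float,
    inter_site_rtt_ms: float,
    remote_write_ms: float,
) -> float:
    """Synchronous replication under the assumed serial model: local
    commit, one round trip to the secondary site, and the remote commit
    must all complete before the server receives its ACK."""
    return local_write_ms + inter_site_rtt_ms + remote_write_ms


# Example figures (assumed): 0.5 ms commit at each site, 2 ms round trip.
print(async_write_latency_ms(0.5))
print(sync_write_latency_ms(0.5, 2.0, 0.5))
```

In this model the inter-site round trip is paid on every write, which is why synchronous replication is sensitive to the distance and link quality between the two sites.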
Another solution for reducing RPO is to configure the volume management software on the server to create a mirror for the logical unit (LUN) on two different enclosures. The advantage in this case is high availability in case of storage enclosure breakdown. However, as a full copy is maintained on two enclosures, space efficiency is significantly lower.