Data replication involves copying data located at a source (e.g., source virtualization environment) to a destination (e.g., destination virtualization environment). This may be performed for the purpose of disaster recovery, where data replicated from the source to the destination may later be recovered at the destination if the source fails.
Two modes of data replication currently exist: asynchronous data replication and synchronous data replication. Asynchronous data replication occurs where a write operation for a piece of data at a source is committed as soon as the source acknowledges the completion of the write operation. Replication of the data at the destination may occur at a later time after the write operation at the source has been committed. Synchronous data replication occurs where a write operation for a piece of data at a source is committed only after the destination has replicated the data and acknowledged completion of the write operation. Thus, in a synchronous data replication mode, a committed write operation for data at the source is guaranteed to have a copy at the destination.
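The difference between the two commit rules can be illustrated with a minimal in-memory sketch. The class and method names below (`Source`, `Destination`, `write_async`, `write_sync`, `drain`) are hypothetical and chosen only for illustration; the sketch assumes a destination that always acknowledges and ignores networking, failures, and persistence.

```python
from collections import deque


class Destination:
    """Hypothetical replica target that stores data and acknowledges each write."""

    def __init__(self):
        self.store = {}

    def replicate(self, key, value):
        self.store[key] = value
        return True  # acknowledgment returned to the source


class Source:
    """Hypothetical source that can commit a write under either mode."""

    def __init__(self, destination):
        self.destination = destination
        self.store = {}
        self.pending = deque()  # committed locally, not yet replicated

    def write_async(self, key, value):
        # Asynchronous: commit as soon as the local write completes,
        # and queue the data for replication at a later time.
        self.store[key] = value
        self.pending.append((key, value))
        return True

    def write_sync(self, key, value):
        # Synchronous: commit only after the destination has replicated
        # the data and acknowledged completion.
        self.store[key] = value
        return self.destination.replicate(key, value)

    def drain(self):
        # Later replication of asynchronously committed writes.
        while self.pending:
            key, value = self.pending.popleft()
            self.destination.replicate(key, value)
```

In this sketch, a write committed through `write_async` is absent from the destination until `drain` runs, whereas a write committed through `write_sync` is already present at the destination when the call returns.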
Asynchronous data replication is advantageous in certain situations because it may be performed more efficiently: a write operation for data at the source can be committed without waiting for the destination to replicate the data and acknowledge completion of the write operation. However, asynchronous data replication may result in data loss if the source fails before the data has been replicated at the destination.
Synchronous data replication guarantees that committed data will not be lost when the source fails, because the write operation for data is not committed until the destination has verified that it too has a copy of the data. However, having to wait for data to be written at both the source and the destination before committing a write operation may lead to increased latency as well as strain on system resources (e.g., CPU usage, memory usage, network traffic, etc.).
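The latency cost can be made concrete with a rough model. The function below is a hypothetical sketch, not a measurement: it assumes the local write, the network round trip, and the remote write happen one after another, and the parameter names and example figures are illustrative only.

```python
def commit_latency_ms(mode, local_write_ms, round_trip_ms, remote_write_ms):
    """Estimate time until a write is committed, under a simple serial model."""
    if mode == "async":
        # Committed as soon as the source completes its local write.
        return local_write_ms
    if mode == "sync":
        # Committed only after the destination writes and acknowledges.
        return local_write_ms + round_trip_ms + remote_write_ms
    raise ValueError(f"unknown replication mode: {mode!r}")
```

With assumed values of 1 ms for each write and 10 ms for the round trip, this model gives 1 ms to commit asynchronously and 12 ms to commit synchronously, so under these assumptions the commit latency of the synchronous mode is dominated by the link to the destination.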
Conventionally, data replication involves setting a fixed data replication policy (either synchronous or asynchronous). With a fixed data replication policy, the manner in which data replication occurs remains static regardless of the changing nature of the system (e.g., source networked virtualization environment or destination networked virtualization environment). System parameters such as the amount of data being replicated or the amount of resources being consumed by the source or destination may vary over time. Thus, fixing the data replication policy for a system fails to account for the dynamic fluctuation in system parameters and may lead to inefficiencies where the system parameters change substantially or frequently over the course of system operation.
Setting a fixed data replication policy may be efficient where the source and destination operate at a steady resource consumption rate and the amount of data to be replicated remains steady. However, where the rate of resource consumption or the amount of data to be replicated is volatile, the fixed data replication policy may underutilize resources when additional resources become available or when the amount of data to be replicated decreases significantly. Similarly, the fixed data replication policy may overcommit resources when fewer resources are available or when the amount of data to be replicated increases significantly.
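A small simulation can illustrate both kinds of inefficiency. The sketch below is hypothetical: it assumes a fixed asynchronous policy with a constant replication capacity per interval, and the function name, units, and workload figures are invented for illustration.

```python
def simulate_fixed_policy(write_rates, replication_capacity):
    """Track backlog and idle capacity per interval under a fixed policy.

    write_rates: units of data written at the source in each interval.
    replication_capacity: units the fixed policy replicates per interval.
    Returns a list of (backlog, idle_capacity) tuples, one per interval.
    """
    backlog = 0  # data committed at the source but not yet replicated
    history = []
    for written in write_rates:
        backlog += written
        sent = min(backlog, replication_capacity)
        backlog -= sent
        idle = replication_capacity - sent  # capacity left unused
        history.append((backlog, idle))
    return history


# Assumed workload: a quiet interval, a burst, then a quiet tail.
result = simulate_fixed_policy([5, 20, 20, 1, 1, 1], replication_capacity=10)
print(result)  # [(0, 5), (10, 0), (20, 0), (11, 0), (2, 0), (0, 7)]
```

In the quiet intervals part of the fixed capacity sits idle, while during the burst the backlog of unreplicated data grows to 20 units. Both effects follow from the policy being unable to follow the workload.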