Host-based replication, or “HBR,” is a technology that enables the efficient copying of virtual machine (VM) data from, e.g., a computing deployment at a first site (referred to as the “primary site”) to another computing deployment at a second site (referred to as the “secondary site”). When a VM is replicated using HBR, the VM can be quickly restored from its replica copy at the secondary site in the case of an event (either planned or unplanned) that causes the original VM instance at the primary site to become unavailable.
FIG. 1 depicts a diagram illustrating a conventional HBR workflow 100. In this example, the VM being replicated (i.e., VM 102) runs on a host system 104 via a hypervisor 106 at a primary site 108, and the persistent data for VM 102 is stored in a virtual disk file (VMDK) 110 maintained in a storage tier 112 at site 108. The replication target for VM 102 is another VMDK 114 maintained in a storage tier 116 at a secondary site 118. Secondary site 118 is connected to primary site 108 via a wide-area network (WAN) 120.
At steps (1) and (2) of workflow 100 (reference numerals 150 and 152), during runtime of VM 102, an HBR filter 122 executing within hypervisor 106 intercepts, from VM 102, I/O writes destined for VMDK 110 and keeps track of the unique file blocks that are modified by the writes. HBR filter 122 performs this tracking for a period of time that is configured for VM 102, referred to as the VM's recovery point objective (RPO).
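The block tracking described in steps (1) and (2) can be illustrated with a minimal sketch. The class name `DirtyBlockTracker`, its methods, and the 4096-byte block size are hypothetical and chosen only for illustration; the sketch does not reflect how HBR filter 122 is actually implemented.

```python
class DirtyBlockTracker:
    """Hypothetical stand-in for the write tracking performed by an HBR filter."""

    def __init__(self, block_size=4096):
        # Assumed block size; the source does not specify one.
        self.block_size = block_size
        self.dirty = set()  # a set keeps each modified block index only once

    def record_write(self, offset, length):
        """Mark every block touched by a write of `length` bytes at `offset`."""
        if length <= 0:
            return
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        self.dirty.update(range(first, last + 1))

    def drain(self):
        """Return the sorted dirty block indices and reset for the next interval."""
        blocks = sorted(self.dirty)
        self.dirty = set()
        return blocks
```

Because a set is used, repeated writes to the same block within one RPO interval yield a single entry, which matches the notion of tracking unique modified blocks rather than individual writes.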
At steps (3) and (4) (reference numerals 154 and 156), once the time interval corresponding to the RPO is about to elapse, HBR filter 122 retrieves all of the modified file blocks from VMDK 110 and transmits the blocks, over WAN 120, to an HBR server 124 running on top of a hypervisor 126 of a host system 128 at secondary site 118. Upon receiving the modified file blocks, HBR server 124 identifies another host system at secondary site 118 (i.e., host system 130) that is capable of writing the file data to storage (step (5), reference numeral 158). HBR server 124 then copies, via network file copy (NFC), the modified file blocks to an NFC server 132 running within a hypervisor 134 of identified host system 130 (step (6), reference numeral 160).
Finally, at step (7) (reference numeral 162), NFC server 132 receives the modified file blocks from HBR server 124 and commits the blocks to VMDK 114 on storage tier 116, thereby bringing this replica copy up to date with original VMDK 110 at primary site 108. It should be noted that while steps (5)-(7) are occurring at secondary site 118, HBR filter 122 will begin executing steps (1)-(4) again for the next RPO time period, and the entire workflow will repeat. In this way, changes to VMDK 110 will be tracked and replicated to secondary site 118 on an ongoing basis.
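One pass of the workflow, from retrieving the modified blocks at the primary site to committing them to the replica, can be sketched as follows. This is a simplified, in-memory illustration: the two virtual disks are modeled as `bytearray` objects, the network transfer and the HBR server/NFC server hand-off are collapsed into a plain function call, and all function names and the block size are hypothetical.

```python
BLOCK_SIZE = 4096  # assumed block size, for illustration only


def read_modified_blocks(disk, dirty_blocks):
    """Steps (3)-(4): collect the contents of each modified block from the source disk."""
    return {
        index: bytes(disk[index * BLOCK_SIZE:(index + 1) * BLOCK_SIZE])
        for index in sorted(dirty_blocks)
    }


def commit_blocks(replica, blocks):
    """Step (7): write each received block into the replica at the same offset."""
    for index, data in blocks.items():
        start = index * BLOCK_SIZE
        replica[start:start + len(data)] = data


def replicate_once(primary, replica, dirty_blocks):
    """Run one replication pass and return the number of blocks transferred."""
    blocks = read_modified_blocks(primary, dirty_blocks)
    commit_blocks(replica, blocks)  # stands in for the WAN transfer and NFC copy
    return len(blocks)


# Usage: modify one block on the primary, then bring the replica up to date.
primary = bytearray(4 * BLOCK_SIZE)
replica = bytearray(primary)
primary[5000:5004] = b"abcd"  # this write falls inside block 1
transferred = replicate_once(primary, replica, {1})
```

Only the modified blocks are moved, so the cost of each pass scales with the amount of changed data during the RPO interval rather than with the size of the disk.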
While the conventional HBR workflow of FIG. 1 is functional, one inefficiency is that the workflow does not compress any of the data sent over the wire between primary site 108 and secondary site 118. This lack of compression is suboptimal since the majority of data transferred via HBR (i.e., VMDK updates) is highly compressible. It is possible to implement dedicated network devices, such as WAN accelerators, between sites 108 and 118 that are configured to compress disk data at the point it leaves primary site 108 and then decompress the data before it is received at secondary site 118. However, there are several scenarios where the use of such WAN accelerators may not be possible or desirable.
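To illustrate the general idea of compressing disk data on the sending side and decompressing it on the receiving side, the following sketch uses Python's standard `zlib` module. The choice of `zlib` is an assumption made purely for illustration; the text does not state which algorithm a WAN accelerator uses, and the helper names are hypothetical.

```python
import zlib


def compress_blocks(blocks):
    """Compress each block payload before it leaves the sending site."""
    return {index: zlib.compress(data) for index, data in blocks.items()}


def decompress_blocks(compressed):
    """Restore the original block payloads on the receiving side."""
    return {index: zlib.decompress(data) for index, data in compressed.items()}


# Usage: a block filled with repeated bytes stands in for highly compressible disk data.
original = {0: b"\x00" * 4096, 1: b"log entry\n" * 400}
on_the_wire = compress_blocks(original)
restored = decompress_blocks(on_the_wire)

bytes_before = sum(len(d) for d in original.values())
bytes_after = sum(len(d) for d in on_the_wire.values())
```

Both sides must agree on the algorithm for the round trip to succeed, which is the compatibility concern that arises when the compressing and decompressing components come from different sources.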
For example, if primary site 108 and secondary site 118 are part of the same local area network (e.g., located within the same building or campus), typical WAN accelerators cannot be used because there is no WAN separating the sites. As another example, if the organization managing sites 108 and 118 is cost-sensitive (or needs to manage a large number of such sites), the organization may not want to incur the operational and maintenance costs associated with WAN accelerators or other similar network devices. As yet another example, if primary site 108 and secondary site 118 are managed by two different organizations, it may be difficult to ensure that the WAN accelerator operating at the egress point of the primary site (and compressing outgoing data) is compatible with the WAN accelerator operating at the ingress point of the secondary site (and decompressing incoming data). For instance, if the two WAN accelerators are sourced from different vendors, they may be configured to perform their respective compression and decompression routines using incompatible algorithms.