Conventional approaches to replicating virtual machine images are typically resource-intensive. Organizations replicate virtual machine images for a variety of reasons, one notable reason being disaster recovery. Virtual machine-based computing systems in one geographic region, such as New York City, can be susceptible to data loss or an inability to access data due to, for example, a severe hurricane or other types of disasters. On such occasions, transferring data from the affected region to another virtual machine-based computing system in another geographic region enables an organization to keep its internal processes (e.g., of a business) up and running.
However, transferring data to replicate a virtual machine-based computing system can involve transferring gigabytes or terabytes of data via a variety of networks, including the Internet. Creating a replica of a virtual machine requires reading the source virtual machine image block by block and copying each block to the replicated virtual machine image. This is a relatively time-consuming operation since, given the data sizes of virtual machine images, replication can take many hours to complete.
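By way of illustration only, the block-by-block copy described above may be sketched as follows. The function name replicate_image, the 4096-byte block size, and the use of in-memory byte streams in place of actual virtual machine images are assumptions made for this sketch and do not reflect any particular conventional implementation.

```python
import io

BLOCK_SIZE = 4096  # assumed block size in bytes; real systems vary


def replicate_image(source, replica, block_size=BLOCK_SIZE):
    """Copy `source` to `replica` one block at a time.

    Both arguments are binary file-like objects (hypothetical stand-ins
    for a source image and its replica). Returns the number of blocks
    copied.
    """
    blocks_copied = 0
    while True:
        block = source.read(block_size)
        if not block:  # empty read signals end of the source image
            break
        replica.write(block)
        blocks_copied += 1
    return blocks_copied


# Usage: a 10,000-byte stand-in "image" held in memory.
source_image = io.BytesIO(b"\x00" * 10000)
replica_image = io.BytesIO()
blocks = replicate_image(source_image, replica_image)
```

Because every block of the source is read and written regardless of its content, the work performed in such a sketch grows in direct proportion to the size of the image, which is consistent with the long replication times noted above.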
Moreover, a rapidly growing demand for virtualized systems and machines means hundreds of thousands of virtual machines may need to be deployed at different locations. Conventional solutions for replicating hundreds or thousands of virtual machines are cost-prohibitive and time-consuming, and do not scale effectively with the relatively large number of virtual machines required for deployment, even if the underlying file system of the virtual machines is deduplicated.
For example, synchronous replication techniques require the copying of data over a variety of networks to maintain up-to-date copies of the data. Generally, synchronous replication requires data to be written to different locations contemporaneously, whereby latency is introduced due to replicating to a remote location. In particular, the latency slows operation of the principal virtual machines as data is written to remote virtual machines and/or storage.
Thus, what is needed is a solution for improving the cost and efficiency of replicating images of virtual machines without the limitations of conventional techniques.