In a virtualization environment, host software (i.e., a hypervisor) running on one or more software or hardware infrastructures (i.e., a host machine) may emulate, or virtualize, the host machine for one or more instances of guest software. In other words, a hypervisor may implement one or more virtual machines (VMs).
The hypervisor implements the VMs by loading data utilized to implement the VMs (i.e., images of the VMs) from a data storage system into a memory of the host machine. If the images of multiple VMs are derived from the same image (i.e., a master image), the images may have a significant amount of data in common. In existing implementations, the hypervisor retrieves and loads each of the images independently into a separate region of the memory without identifying the data that is common to the multiple VM images. Thus, the hypervisor wastes resources (e.g., processing power, memory space, and storage and network bandwidth) by repeatedly retrieving and loading the common data.
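The cost of independent loading can be illustrated with a minimal sketch. It is not taken from any particular hypervisor: the 4096-byte block size, the function names `split_blocks` and `redundant_bytes`, and the toy in-memory "images" are all assumptions made for illustration. The sketch counts how many bytes are loaded in total when each image is loaded on its own, and how many of those bytes duplicate content that was already loaded.

```python
from __future__ import annotations

import hashlib

BLOCK_SIZE = 4096  # hypothetical block size in bytes (assumption)


def split_blocks(image: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Split an image into fixed-size blocks (the last block may be shorter)."""
    return [image[i:i + block_size] for i in range(0, len(image), block_size)]


def redundant_bytes(images: list[bytes],
                    block_size: int = BLOCK_SIZE) -> tuple[int, int]:
    """Return (total_bytes, repeated_bytes) for images loaded independently.

    repeated_bytes counts every block whose content matches a block that
    was already loaded earlier, i.e. data retrieved and stored again.
    """
    seen: set[bytes] = set()  # digests of block contents loaded so far
    total = 0
    repeated = 0
    for image in images:
        for block in split_blocks(image, block_size):
            total += len(block)
            digest = hashlib.sha256(block).digest()
            if digest in seen:
                repeated += len(block)
            else:
                seen.add(digest)
    return total, repeated


# Toy master image of 8 distinct blocks; each VM image differs from the
# master only in its last block.
master = b"".join(bytes([i]) * BLOCK_SIZE for i in range(8))
vm_a = master[:-BLOCK_SIZE] + b"A" * BLOCK_SIZE
vm_b = master[:-BLOCK_SIZE] + b"B" * BLOCK_SIZE

total, repeated = redundant_bytes([vm_a, vm_b])
print(f"loaded {total} bytes, of which {repeated} were already in memory")
```

In this toy case the two images share seven of their eight blocks, so seven blocks of the second image are retrieved and loaded even though identical data is already resident in the memory.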