Virtual computing environments allow a single physical computing node to be shared by multiple users. A hypervisor operates on each physical node and provides multiple instances of a virtual machine (VM), each with a dedicated operating system (OS), file structure and applications, such that a user of the VM perceives the VM as a machine dedicated to them. The actual files associated with each VM tend to form a complex arrangement that may be spread across multiple physical disks. Performing a backup involves identifying each of the files, and the file blocks included in them, used by each VM, which presents a formidable computing task.
Backup vendors have attempted numerous innovations to improve these backups. Such conventional approaches have relied on file timestamps or source deduplication technologies to reduce the amount of data that needs to be backed up. Likewise, efforts have involved file systems that use various deduplication technologies to discard redundant data in backup data, and ongoing reorganization of backup data in an attempt to improve the efficiency of the backup.