Consider a computer running one or more virtual machines, using virtualization software such as VMware. The base computer runs an operating system (the baseOS), and the virtualization software (e.g., VMware) runs as an application on the baseOS. An operating system runs in the virtualization environment (a guestOS) in order to form a virtual machine. The applications running in the virtual environment on the guestOS have their data organized in the form of files in the file system of the guestOS. There is a single file in the baseOS file system that houses all the data for the applications running in the guestOS, and the guestOS itself. Any baseOS user or application looking at the file data on the baseOS will not be able to read virtual machine data, as the guestOS file system does not necessarily write file data sequentially within the baseOS file representing a virtual machine. Instead, the data extents of the files in a guestOS are present in a seemingly random sequence in the single file on the baseOS. Therefore, an application on the baseOS cannot reorganize the file data and reconstruct the files in the guestOS environment without the help of the file system on the guestOS.
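The scattering of guest file extents inside the single baseOS file can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: the block size, the file names, and the `extent_map` structure are hypothetical and do not describe how VMware or any particular guestOS file system lays out data. It shows that the container bytes, read sequentially from the baseOS side, do not contain any guest file intact, while a reader holding the guest's extent map can reassemble each file.

```python
# Toy model: the baseOS sees one opaque container file; only the guestOS
# file system's extent map says which blocks belong to which guest file.
BLOCK_SIZE = 4  # hypothetical, tiny block size chosen for readability

# Hypothetical guest files and their contents.
guest_files = {
    "a.txt": b"AAAABBBBCCCC",
    "b.txt": b"111122223333",
}

# Hypothetical extent map: guest file name -> ordered list of container
# block indices. The order is deliberately non-sequential.
extent_map = {
    "a.txt": [5, 0, 3],
    "b.txt": [2, 4, 1],
}


def build_container(files, extents, block_size=BLOCK_SIZE):
    """Write each guest file's blocks into the container at the mapped positions."""
    total_blocks = sum(len(blocks) for blocks in extents.values())
    container = bytearray(total_blocks * block_size)
    for name, data in files.items():
        for i, block in enumerate(extents[name]):
            chunk = data[i * block_size:(i + 1) * block_size]
            start = block * block_size
            container[start:start + len(chunk)] = chunk
    return bytes(container)


def read_guest_file(container, extents, name, block_size=BLOCK_SIZE):
    """Reassemble one guest file by following its extent map."""
    return b"".join(
        container[b * block_size:(b + 1) * block_size] for b in extents[name]
    )


container = build_container(guest_files, extent_map)

# Sequential view from the baseOS side: the guest file is not present intact.
print(guest_files["a.txt"] in container)

# View with the guest extent map: the file can be reconstructed.
print(read_guest_file(container, extent_map, "a.txt"))
```

Without `extent_map`, a baseOS application has only the interleaved bytes of `container` and no way to tell which blocks belong to which guest file or in what order.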
In order to back up such a system in its entirety, a backup operation is run on the baseOS and on each virtual machine guestOS. This is necessary because individual files in a virtual machine file system can only be restored if the virtual machine is backed up as a separate entity. A full backup of the base computer does back up each baseOS file representing a virtual machine. However, because each virtual machine file system stores blocks non-contiguously, such that the baseOS file system does not recognize the different blocks as comprising contiguous data, the backup of the baseOS file system cannot be used to restore individual files to a virtual machine file system.
The same issues arise with incremental backups of base computers running virtual machines. Incremental backup is typically used to minimize the total backup time, thus providing greater efficiency and decreasing resource costs. However, in the case of a base computer running virtual machines, incremental backups have to be done at two levels. When a change occurs on a guestOS, an incremental backup of the base computer backs up the file that represents the virtual machine. This backup can be used to restore the guestOS as a whole. When individual files in a guestOS file system change, an incremental backup at the guestOS level backs up those individual files. This allows a restore of the files to the guestOS file system. Thus, incrementally backing up a base computer running virtual machines involves backing up the same data at both a virtual machine level and a base computer level.
Separately backing up virtual machine file system data at both a virtual machine level and a base computer level results in a lot of work duplication and performance overhead. The data for each virtual machine gets backed up twice: once as a part of the backup of the virtual machine itself when the backup job runs on the guestOS, and again when the file representing the virtual machine is backed up on the baseOS. Additionally, both the baseOS and the guestOS file systems get populated with backup data. More memory, media, and processing resources are also required to complete the backup, consequently raising the cost of data protection management. The problem only gets worse as the number of virtual machines running on a base computer increases.
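The duplication described above can be made concrete with a rough back-of-the-envelope sketch. It assumes, purely for illustration, that the base-level incremental backup works at whole-file granularity (so any change inside a guest causes the entire virtual machine file to be copied) and that the guest-level backup copies only the changed guest files. The `VirtualMachine` class, its field names, and all sizes are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VirtualMachine:
    image_size: int           # size of the single baseOS file, in bytes (hypothetical)
    changed_guest_bytes: int  # bytes in guest files changed since the last backup (hypothetical)


def two_level_incremental_bytes(vms):
    """Return (guest_level, base_level) bytes copied by one incremental pass.

    Assumption: the base-level pass copies a VM's whole image file whenever
    anything inside that guest has changed.
    """
    guest_level = sum(vm.changed_guest_bytes for vm in vms)
    base_level = sum(vm.image_size for vm in vms if vm.changed_guest_bytes > 0)
    return guest_level, base_level


vms = [
    VirtualMachine(image_size=20_000, changed_guest_bytes=300),
    VirtualMachine(image_size=50_000, changed_guest_bytes=0),
    VirtualMachine(image_size=10_000, changed_guest_bytes=1_200),
]

guest_level, base_level = two_level_incremental_bytes(vms)
print(guest_level, base_level)
```

Under these assumptions, the changed guest data is captured twice, once directly and once inside the copied image files, and the base-level volume grows with every additional virtual machine that has any change at all.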
What is needed are methods, computer readable media and computer systems for backing up virtual machines from a baseOS level, for example as part of the backup of the base computer.