Computer systems process large volumes of data which may be changed or updated on a recurring basis. In order to track these changes, files located on a data storage device are usually copied, and a system data backup based on these files is created. This enables a user to access previous versions of files, as well as to protect data from possible system failure. Snapshots of the data storage device may be taken to provide for backup of file systems. In the case of a physical computing machine, these snapshots may be generated by an Operating System (OS) running on the physical computing machine.
The current industry trend of virtualization and distribution of computer system resources makes the task of generating backups more complex. For example, a plurality of virtual machines (VMs) may each be configured to provide a software emulation of a single physical computing machine. Virtualization allows running a number of VMs on the same physical computing machine or processor. Each VM instance executes its own OS kernel. Support of VMs is implemented using a VM Monitor and/or a Hypervisor. Due to the existence of multiple VMs on the physical computing machine, scheduling and controlling efficient backups of data among the numerous VMs becomes challenging. Furthermore, each of the VMs has a configuration that can be changed by a user. Thus, it may be necessary to save snapshots of all previous states of a particular VM into a backup.
Typically, data backups are performed by system administrators according to a predetermined backup schedule. In many situations, there are customers who keep tens of snapshots of volumes for tens of thousands of user file systems. Any time a change is made to a file system account, an enormous amount of redundant data is stored just to ensure that a stable backup of all account information is retained.
FIG. 2 is a data flow diagram illustrating a generation of data snapshots in accordance with a prior art approach. A first series of snapshots is taken for a first virtual machine 201, a second series of snapshots is taken for a second virtual machine 202, and an Nth series of snapshots is taken for an Nth virtual machine 203, where N is a positive integer greater than two. The first series of snapshots comprises a first virtual machine first snapshot 211, a first virtual machine second snapshot 212, and a first virtual machine Mth snapshot 213, where M is a positive integer greater than two. Likewise, the second series of snapshots comprises a second virtual machine first snapshot 221, a second virtual machine second snapshot 222, and a second virtual machine Mth snapshot 223. Similarly, the Nth series of snapshots comprises an Nth virtual machine first snapshot 231, an Nth virtual machine second snapshot 232, and an Nth virtual machine Mth snapshot 233. The result is a separate set of snapshots for each of the virtual machines 201, 202, and 203. These snapshots require substantial storage space and may include redundant information.
Instead of taking daily snapshots, incremental flash copies of the volume of data storage space can be taken. This procedure copies only the data that has changed between a source volume and a target volume.
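By way of illustration only, the incremental copy described above can be sketched as follows. The sketch assumes that each volume is modeled as a list of fixed-size blocks and that changed blocks are detected by direct comparison; the function name incremental_copy and the sample block contents are hypothetical, and an actual storage system may instead track changes with a bitmap maintained at write time.

```python
def incremental_copy(source: list[bytes], target: list[bytes]) -> list[int]:
    """Copy only the blocks that differ from source to target.

    Returns the indices of the blocks that were copied.
    """
    # Resize the target so both volumes hold the same number of blocks.
    if len(target) < len(source):
        target.extend([b""] * (len(source) - len(target)))
    else:
        del target[len(source):]

    copied = []
    for index, block in enumerate(source):
        # Unchanged blocks are skipped, so no data is transferred for them.
        if target[index] != block:
            target[index] = block
            copied.append(index)
    return copied


# Hypothetical volumes: the target starts as a full copy of the source.
source_volume = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
target_volume = list(source_volume)

# A single block changes on the source after the initial copy.
source_volume[2] = b"cccc"

changed = incremental_copy(source_volume, target_volume)
print(changed)
```

In this sketch only the one modified block is transferred, whereas a full snapshot would copy all four blocks again.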
“Version control” is a component of software configuration management. More specifically, version control is the management of changes to documents, computer programs, web sites, and other information collections. Typically, changes are identified by a number or letter code (called the revision number, the revision level, or simply the revision). Each revision is generally associated with a timestamp and an entity making the change from one revision to the next. Revisions can be compared, restored, and (at least with some types of files) merged. Version control systems (VCS) typically operate as stand-alone applications, but version control is sometimes embedded in various types of software such as word processors and in various content management systems (for example, page history for a wiki page). Version control provides the ability to revert a document to a previous revision, which helps editors track each other's edits, correct mistakes, and defend against malicious actions.
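By way of illustration only, the revision bookkeeping described above can be sketched as follows. The sketch assumes an in-memory store that keeps the full content of every revision; the class names Revision and VersionedDocument, the author names, and the sample text are hypothetical and do not correspond to any particular version control system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Revision:
    number: int          # revision number, starting at 1
    author: str          # entity that made the change
    timestamp: datetime  # when the revision was recorded
    content: str         # full document content at this revision


@dataclass
class VersionedDocument:
    revisions: list[Revision] = field(default_factory=list)

    def commit(self, author: str, content: str) -> Revision:
        """Record new content as the next revision."""
        revision = Revision(
            number=len(self.revisions) + 1,
            author=author,
            timestamp=datetime.now(timezone.utc),
            content=content,
        )
        self.revisions.append(revision)
        return revision

    def get(self, number: int) -> Revision:
        """Look up a revision by its number."""
        if not 1 <= number <= len(self.revisions):
            raise KeyError(f"no such revision: {number}")
        return self.revisions[number - 1]

    def revert(self, number: int, author: str) -> Revision:
        """Restore an earlier revision by committing its content again."""
        return self.commit(author, self.get(number).content)


doc = VersionedDocument()
doc.commit("alice", "first draft")
doc.commit("bob", "first draft, vandalized")

# Reverting does not erase history; it appends a third revision.
restored = doc.revert(1, "alice")
print(restored.number, restored.content)
```

Recording a revert as a new revision, rather than deleting later revisions, keeps the complete history available for comparison and auditing.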