1. Field of the Invention
This invention relates to the field of computer processing and, more particularly, to avoiding duplicate backups of data in a volume backup image.
2. Description of the Related Art
As computer memory storage and data bandwidth increase, so do the amount and complexity of data that businesses manage. In order to protect such data, the contents of information servers and end-user systems may be backed up to a data storage subsystem. In many cases a backup agent on each client is configured to convey data files to a backup server according to a variety of schedules and policies. The data storage subsystem may include a backup system configured by an information technology (IT) administrator.
In addition to desiring an efficient data backup system, a company or workgroup may require high availability of services provided by the computing system. In order to increase availability, clusters of multi-processor nodes may be coupled via a network. With cluster computing, a redundant node or other node in the cluster provides service when a first node fails. However, one issue that arises with clustered computing that utilizes shared storage is that nodes, and virtual machines (VMs) within nodes, are dependent on disk resources. In the event a physical disk resource is moved from one node to another, fast live migration of applications and/or VMs may not be possible. At the end of the migration the volume must be un-mounted and then mounted again, both of which are time-consuming tasks. Without fast migration, one of the benefits of clustered computing is reduced. Another issue that may arise is that a VM and all dependent resources corresponding to the same logical unit number (LUN) form a dependent group that can only be moved or failed over as a complete unit. When moving or performing a failover for one VM or LUN, a move or failover is performed for all of the VMs and resources, such as the volume, in the group.
One solution to the issues mentioned above is a cluster shared volume that is simultaneously visible and accessible to all cluster nodes. Such a volume may be a standard cluster disk containing an NTFS volume that is made directly accessible for read and write operations by all nodes within the cluster. This gives a VM mobility throughout the cluster, as any node can be an owner. Fast migration may be possible with such a volume. However, during a volume-level backup operation, some data may be backed up twice. For example, customers may initially perform a volume-level backup to obtain full volume flat file backups. At a later time, customers may perform an agent-based backup operation of particular data used by a given node. The agent-based backup temporarily disables applications, such as database applications, on the corresponding node and backs up individual files corresponding to the applications.
Because a node performing a volume-level backup has no knowledge of which portions of a volume are used by particular applications on another node, the volume-level backup will simply back up the entire volume. Subsequently, when the agent-level backup is performed, it will back up data that was already backed up during the volume-level backup.
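The duplication described above may be illustrated with a minimal sketch. The file names, the node layout, and the functions volume_level_backup, agent_level_backup, and duplicated are hypothetical and are introduced only for illustration; they do not correspond to any particular backup product or interface.

```python
# Illustrative sketch only: all file names and helpers below are hypothetical.

# Everything stored on the shared volume, regardless of which node uses it.
volume_contents = {
    "vm1/disk.vhd",
    "vm2/disk.vhd",
    "node2/db/data.mdf",
    "node2/db/log.ldf",
}

# Files belonging to a database application running on another node.
agent_files = {"node2/db/data.mdf", "node2/db/log.ldf"}


def volume_level_backup(volume):
    """Back up the entire volume; the node has no knowledge of file ownership."""
    return set(volume)


def agent_level_backup(app_files):
    """Back up only the individual files of a particular application."""
    return set(app_files)


def duplicated(volume_image, agent_image):
    """Return the files that appear in both backup images."""
    return volume_image & agent_image


volume_image = volume_level_backup(volume_contents)
agent_image = agent_level_backup(agent_files)
duplicates = sorted(duplicated(volume_image, agent_image))
print(duplicates)
```

In this sketch every file handled by the agent-level backup is already present in the volume-level image, so the application data is stored twice.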
In view of the above, methods and mechanisms for avoiding duplicate backups of data in a volume backup image are desired.