The ability of a business to readily store, process, and transmit data is a facet of operations that the business relies upon to conduct its day-to-day activities. For a business that increasingly depends upon data for its operations, an inability to store, process, or transmit data can hurt the business's reputation and bottom line. Businesses are therefore taking measures to improve their ability to store, process, transmit, and restore data, and to share more efficiently the resources that enable these operations.
The ever-increasing reliance on data and the computing systems that produce, process, distribute, and maintain data in its myriad forms continues to put great demands on techniques for data protection and disaster recovery. Simple systems providing periodic backups of data have given way to more complex and sophisticated data protection schemes. Such schemes can take into consideration a variety of factors, including a wide variety of computing devices and platforms, memory storage systems, numerous different types of data to be protected, speed with which data protection operations must be executed, and flexibility demanded by today's users.
In many cases, disaster recovery involves restoring data to a point-in-time when the desired data was in a known and valid state. Backup schemes to ensure recoverability at times in the past are varied. Such schemes traditionally include periodic full backups followed by a series of differential backups performed at intervals between the full backups. In such a scheme, a data set can be restored at least to a point-in-time of a differential backup. Such an approach can be resource intensive, as permanent records of the full and differential backups must be kept in order to ensure that one can restore a data set to a state at a particular point-in-time, especially to a point in the distant past. Further, the process of restoring data from a full backup and a series of differential backups can consume considerable time and resources, leading to delays in making the data available to the users. In addition, gaps in coverage can occur because of the time that elapses between differential backups.
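The coverage gap described above can be illustrated with a minimal sketch. The schedule, the hour-based timestamps, and the function name `latest_restorable_point` are hypothetical and chosen only for illustration; the sketch assumes a data set can be restored only to the time of a backup, not to a time between backups.

```python
from bisect import bisect_right

def latest_restorable_point(backup_times, target):
    """Return the most recent backup time at or before target, or None.

    backup_times must be sorted in ascending order.
    """
    i = bisect_right(backup_times, target)
    return backup_times[i - 1] if i else None

# Hypothetical schedule: full backup at hour 0, differentials every 6 hours.
backup_times = [0, 6, 12, 18]

target = 16  # the state one would like to recover (hour 16)
restored = latest_restorable_point(backup_times, target)

# Changes made during this window are not covered by any backup.
gap = target - restored
```

In this hypothetical schedule, a request for the state at hour 16 can only be satisfied with the state at hour 12, so four hours of changes fall into the gap.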
A solution to some of the issues presented by data backup and restore from full and differential backups is to create a “snapshot” of data residing in a storage object. Typically, a snapshot involves capturing the data from a primary storage object to another storage object, real or virtual, at a particular instant without causing significant data access downtime. If desired, the resulting snapshot can then be backed up to permanent media, such as tape or optical media, without affecting the performance or availability of the primary storage object. One example of a snapshot backup is a mirror image broken off of a primary data volume.
A mirror image is a complete data copy stored on a separate storage object, virtual or real, physically independent of a primary data volume. Every change or write to data on the primary data volume is also made to the mirror. A mirror can be broken off from an associated primary data volume, meaning that changes after the split will be made to the primary but not to the broken-off mirror. Usually, the broken-off mirror is presented to applications as an independent storage object, often as another volume. While broken off, this mirror can be backed up or otherwise manipulated. If the mirror will be used again, it must be brought up-to-date with the primary volume or “resynchronized.” Since a mirror image provides a completely separate copy of data on the primary volume, mirror images can provide much faster restores in the event of primary volume unavailability, as well as faster backups to permanent media (e.g., tapes, optical media), but each mirror image requires an amount of disk space equal to that of its primary data volume.
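The mirror life cycle described above (mirrored writes, break-off, resynchronization) can be sketched as follows. This is an illustrative model only, not any particular volume manager's implementation. The class name `MirroredVolume` and its methods are hypothetical, and tracking changed blocks in a `dirty` set is an assumption made here to keep resynchronization short; copying the entire primary volume would serve equally well.

```python
class MirroredVolume:
    """Toy model of a primary volume with one mirror (hypothetical names)."""

    def __init__(self, num_blocks):
        self.primary = [b""] * num_blocks
        self.mirror = [b""] * num_blocks   # same size as the primary
        self.attached = True
        self.dirty = set()  # assumption: blocks changed while mirror is split

    def write(self, block, data):
        self.primary[block] = data
        if self.attached:
            self.mirror[block] = data      # every write is also mirrored
        else:
            self.dirty.add(block)          # remember what the mirror missed

    def break_off(self):
        """Split the mirror; it no longer receives writes."""
        self.attached = False
        return self.mirror

    def resynchronize(self):
        """Bring the mirror up to date with the primary and reattach it."""
        for block in self.dirty:
            self.mirror[block] = self.primary[block]
        self.dirty.clear()
        self.attached = True


vol = MirroredVolume(4)
vol.write(0, b"a")
snap = vol.break_off()      # frozen image of the volume at this instant
vol.write(0, b"b")          # reaches the primary only
vol.write(1, b"c")
frozen = list(snap)         # still reflects the state at break-off
vol.resynchronize()
```

Note that the mirror occupies as many blocks as the primary, which is the disk space cost mentioned above.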
A snapshot may not need to be backed up to permanent media (e.g., tape), but instead can be used as a persistent frozen image (PFI). A PFI backup image will allow for a very fast restore of data in the event of problems occurring with a primary volume. But a primary drawback of this approach is that a PFI can take up a significant amount of disk space, whether virtual or real. Thus, it is impractical to retain a series of PFI snapshots on disk for long-term storage. Further, in order to be accessed, each PFI snapshot requires instantiation and storage of information related to the snapshot volume in, for example, a volume manager. Such instantiation also consumes resources.
As stated above, a typical backup scheme involves periodic full backups of data coupled with intermediate scheduled differential backups, along with, in many instances, recording a continuing log of transactions that occur to the primary data volume. Snapshot image backups can be incorporated into such a scheme. Restoring data in such a scheme involves going back to the last full backup or snapshot before the event necessitating a restore, restoring the full backup and then restoring each subsequent differential backup, and finally bringing the data up to a particular point in time through the use of a transaction log. Such a scheme can take a very long time to restore data.
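The restore sequence described above can be sketched as follows. The data layout is hypothetical: a backup is modeled as a dictionary of block contents, each differential is assumed to hold the blocks changed since the previous backup (consistent with restoring each subsequent differential in turn), and the transaction log is a time-ordered list of individual writes. The function name `restore` and all sample values are illustrative only.

```python
def restore(full, differentials, log, target_time):
    """Rebuild a data set as of target_time.

    full:          (time, {block: value}) for the last full backup
    differentials: [(time, {block: value}), ...] sorted by time; each is
                   assumed to hold changes since the previous backup
    log:           [(time, block, value), ...] sorted by time
    Assumes the full backup was taken at or before target_time.
    """
    last_time, image = full
    data = dict(image)                 # step 1: restore the full backup

    for t, changes in differentials:   # step 2: apply each differential
        if t > target_time:
            break
        data.update(changes)
        last_time = t

    for t, block, value in log:        # step 3: replay the transaction log
        if last_time < t <= target_time:
            data[block] = value
    return data


full = (0, {"a": 1, "b": 1})
differentials = [(10, {"a": 2}), (20, {"b": 2}), (30, {"a": 3})]
log = [(5, "a", 9), (12, "b", 7), (22, "a", 8), (25, "b", 9)]

state = restore(full, differentials, log, target_time=23)
```

Every differential between the full backup and the target time must be read and applied before the log is replayed, which is why the duration of such a restore grows with the length of the backup chain.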
Information technology departments are faced with data demands that require few, if any, gaps in protection of data, along with as little unavailability of data as possible in the event of a data volume failure. Such continuous data protection demands can be met, in part, through the use of multiple PFI snapshots of the data, but such a protection scheme is resource intensive both at the storage volume level and in the management of those storage volumes as they are presented to the computer systems that use the data. What is therefore desired is a method of maintaining PFI snapshot images, or their equivalent, in a manner that minimizes resource consumption at both the disk level and the volume manager level.