1. Field of the Invention
This invention relates to the field of data storage devices, and more particularly relates to a method and system for incremental backup.
2. Description of the Related Art
Information drives business. A disaster affecting a data center can cause days or even weeks of unplanned downtime and data loss that could threaten an organization's productivity. For businesses that increasingly depend on data and information for their day-to-day operations, this unplanned downtime can also hurt their reputations and bottom lines. Businesses are becoming increasingly aware of these costs and are taking measures to plan for and recover from disasters.
Two areas of concern when a failure occurs, as well as during the subsequent recovery, are preventing data loss and maintaining data consistency between primary and secondary storage areas. One simple strategy includes backing up data onto a storage medium such as a tape, with copies stored in an offsite vault. Duplicate copies of backup tapes may be stored onsite and offsite. More complex solutions include replicating data from local computer systems to backup local computer systems and/or to computer systems at remote sites.
Not only can the loss of data be critical, the failure of hardware and/or software can cause substantial disruption. In many situations, disaster recovery requires the ability to move a software application and associated data to an alternate site for an extended period, or even permanently, as a result of an event, such as a fire, that destroys a site. For these more complicated situations, strategies and products to reduce or eliminate the threat of data loss and minimize downtime in the face of a site-wide disaster are becoming increasingly available.
For example, replication facilities exist that replicate data in real time to a disaster-safe location. Data are continuously replicated from a primary node, which may correspond to a computer system in control of a storage device, to a secondary node. The nodes to which data are copied may reside in local backup clusters or in remote “failover” sites, which can take over when another site fails. Replication allows persistent availability of data at all sites.
The terms “primary node” and “secondary node” are used in the context of a particular software application, such that a primary node for one application may serve as a secondary node for another application. Similarly, a secondary node for one application may serve as a primary node for another application.
The term “application group” is used to describe both an application and the corresponding data. If a primary application group on one cluster becomes unavailable for any reason, replication enables both the application and the data to be immediately available using the secondary application group in another cluster or site.
To accommodate the variety of business needs, some replication facilities provide remote mirroring of data and replication of data over a wide area or distributed network such as the Internet. However, different types of storage typically require different replication methods. Replication facilities are available for a variety of storage solutions, such as database replication products and file system replication products, although typically a different replication facility is required for each type of storage solution.
Replication facilities provide such functionality as enabling a primary and secondary node to reverse roles when both are functioning properly. Reversing roles involves such replication operations as stopping the application controlling the replicated data, demoting the primary node to a secondary node, promoting the original secondary node to a primary node, and restarting the application at the new primary node. Another example of functionality of a replication facility involves determining when a primary node is down, promoting the secondary node to a primary node, enabling transaction logging, and starting the application that controls the replicated data on the new primary node. In addition, when the former primary node recovers from failure, the replication facility can prevent the application from starting at the former primary node, since the application group is already running at the newly promoted node (the former secondary node). The transaction log can be used to synchronize data at the former and new primary nodes.
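By way of illustration only, the role-reversal and failover operations described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the behavior of any particular replication facility: the names Node, Role, reverse_roles, fail_over and may_start_application are hypothetical, and node state is reduced to a few in-memory flags.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


@dataclass
class Node:
    """Hypothetical in-memory stand-in for a node in a replication pair."""
    name: str
    role: Role
    app_running: bool = False
    up: bool = True
    logging_enabled: bool = False


def reverse_roles(primary: Node, secondary: Node) -> None:
    """Planned role reversal; only valid when both nodes are functioning."""
    if not (primary.up and secondary.up):
        raise RuntimeError("role reversal requires both nodes to be up")
    primary.app_running = False      # stop the application
    primary.role = Role.SECONDARY    # demote the primary
    secondary.role = Role.PRIMARY    # promote the original secondary
    secondary.app_running = True     # restart at the new primary


def fail_over(failed_primary: Node, secondary: Node) -> None:
    """Unplanned takeover after the primary is determined to be down."""
    failed_primary.up = False
    failed_primary.app_running = False
    secondary.role = Role.PRIMARY        # promote the secondary
    secondary.logging_enabled = True     # enable transaction logging
    secondary.app_running = True         # start the application


def may_start_application(node: Node, group: list) -> bool:
    """A recovered node may not start the application if another node
    in the group is already running it as primary."""
    return not any(
        other is not node
        and other.role is Role.PRIMARY
        and other.app_running
        for other in group
    )
```

In this sketch, a former primary that comes back up after fail_over would receive False from may_start_application and would remain idle until the transaction log kept at the new primary has been used to synchronize its data.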
It is also important to be able to back up replicated data, as is the case with any data. Conceptually, the backup process is relatively simple, but system administrators face several difficulties. One challenge is the impact on resources. A backup should complete as quickly as possible, but finishing faster requires copying data faster, which places greater demands on disks and input/output (I/O) channel bandwidth. Disks and channels that are busy with backup requests cannot process transactions. Simply put, the more I/O resources a backup methodology uses, the slower online operations become.
Another challenge is the timing of such backup operations. In order to represent a consistent point-in-time image, backups need to be started at a time when no other activity is occurring. Thus, backups are constrained to start at times when the business impact of stopping accesses to the data is lowest. The larger an enterprise, the more data requiring backup it is likely to possess. Because backup is very resource intensive, as noted, large enterprises invariably wish to minimize its impact on operations.
Given the foregoing, the conflicting information technology imperatives of protecting enterprise data against failures of all kinds and of maintaining continuous operation and availability make backup operations (e.g., database backup) a difficult problem for administrators. On the one hand, frequent, consistent backups need to be maintained in case data recovery is necessary. On the other hand, taking data out of service for backup is not a realistic option for many installations. Even if such data did not have to be online continuously, the I/O resource impact tends to make frequent full backups impractical. What is needed is a technique that enables backup of such data without the impact caused by a full backup operation. Moreover, such a technique should preferably take advantage of the infrastructure provided by existing recovery technologies.