The present invention relates to computer systems, and particularly to a specific configuration of a computer system and to a backup management method based on this configuration.
Information systems are generally configured to back up data so that the data can be recovered if it is lost due to storage device failures, data corruption caused by virus attacks, user errors, or the like.
Data backup and recovery techniques using journaling have been proposed to solve the problem of data loss (see, e.g., U.S. Patent Application Publication No. 2005/0015416). This document discloses a technique in which a snapshot (i.e., a full backup image or a logical image including a differential backup image) of a logical group (hereinafter called “journal group”) containing at least one data volume is obtained at a specific point in time; data subsequently written to the data volume belonging to the journal group is maintained as a journal (called “After journal” or “post journal”); and a series of After journals is applied to the obtained snapshot in the order in which the data was written, so that data at a specific later point in time can be recovered. The technique disclosed in this document is an example of a technique generally called “continuous data protection (CDP)”.
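The journal application described above can be illustrated with a minimal sketch. It is not the implementation of the cited publication; the names `AfterJournal` and `recover`, the dictionary model of a volume, and the assumption that write times increase with the write sequence number are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AfterJournal:
    """One write recorded after the snapshot was taken (hypothetical layout)."""
    seq: int      # position in the original write order
    time: float   # time at which the write occurred
    block: int    # block address that was written
    data: bytes   # data that was written


def recover(snapshot, journals, target_time):
    """Return the volume image as of target_time.

    The snapshot is modeled as a dict mapping block address to data.
    After journals are applied in write order, stopping at the first
    journal written after target_time. Assumes time increases with seq.
    """
    volume = dict(snapshot)  # work on a copy; the snapshot stays intact
    for journal in sorted(journals, key=lambda j: j.seq):
        if journal.time > target_time:
            break
        volume[journal.block] = journal.data
    return volume
```

For example, with a snapshot `{0: b"a", 1: b"b"}` and three journals written at times 1, 2, and 3, calling `recover` with `target_time=2` applies only the first two journals.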
The time required for data recovery by the above-described journal application (hereinafter called “recovery time”) depends on the amount of journal data applied. Therefore, the above-described U.S. Patent Application Publication No. 2005/0015416 further discloses a technique in which, for the recovery of data at a specific point in time, multiple generations of snapshots are taken and a series of journals is applied to a snapshot taken in close proximity to this specific point in time, and thus the recovery time can be reduced. The worst value of the time required for data recovery is called “recovery time objective (RTO)”.
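The effect of keeping multiple snapshot generations can be sketched as follows. The sketch assumes that only After journals are available, so the base snapshot is the latest one taken at or before the target time; the function names and the `(time, size)` representation of journals are hypothetical.

```python
from bisect import bisect_right


def pick_base_snapshot(snapshot_times, target_time):
    """Return the time of the latest snapshot taken at or before target_time.

    snapshot_times must be sorted in ascending order.
    """
    index = bisect_right(snapshot_times, target_time)
    if index == 0:
        raise ValueError("no snapshot exists at or before the target time")
    return snapshot_times[index - 1]


def journal_bytes_to_apply(journals, base_time, target_time):
    """Sum the sizes of journals written after base_time, up to target_time.

    journals is an iterable of (write_time, size_in_bytes) pairs.
    """
    return sum(size for t, size in journals if base_time < t <= target_time)
```

With snapshots at times 0, 60, and 120 and a target time of 130, the snapshot at time 120 is selected, so only the journals written between 120 and 130 need to be applied rather than every journal since time 0.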
Another technique proposed is one in which, if data recovered by the application of After journals has already been corrupted, this application of After journals is cancelled (see, e.g., U.S. Patent Application Publication No. 2005/0015416). In the technique disclosed in this document, a portion of data to be overwritten by the application of After journals is saved. Then, if the application of After journals needs to be cancelled, the saved data is written onto a snapshot to which the After journals were applied (i.e., the saved data is restored to its original location). Thus, the snapshot before the application of the After journals can be restored in a short period of time. The saved data described above is called “Before journal”. The term “journal” is used to refer collectively to both Before and After journals.
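The relationship between After journals and Before journals can be sketched as below. This is an illustration under simplifying assumptions, not the method of the cited publication: a volume is modeled as a dictionary from block address to data, a journal is a `(block, data)` pair, and the names `apply_with_undo` and `cancel` are hypothetical.

```python
def apply_with_undo(volume, after_journals):
    """Apply After journals in place and return the Before journals.

    Before each block is overwritten, its current content is saved
    (None if the block did not exist yet), so that the application
    can be cancelled later.
    """
    before_journals = []
    for block, data in after_journals:
        before_journals.append((block, volume.get(block)))
        volume[block] = data
    return before_journals


def cancel(volume, before_journals):
    """Undo an earlier application by writing saved data back.

    Before journals are replayed in reverse order so that a block
    overwritten several times ends up with its original content.
    """
    for block, old_data in reversed(before_journals):
        if old_data is None:
            volume.pop(block, None)  # block did not exist before the apply
        else:
            volume[block] = old_data
```

Cancelling touches only the blocks that were overwritten, which is why the snapshot before the application can be restored quickly compared with rebuilding it from an older image.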
Hereinafter, a point designated by an administrator for the purpose of recovering data at a specific point in time will be referred to as “recovery point”. When data at a designated recovery point is to be recovered by the application of a series of journals to a snapshot, the relationship between the last applied journal and the designated recovery point will be expressed as “the journal has the recovery point.” If data can be recovered without applying a journal to a snapshot, the relationship between this snapshot and a designated recovery point will be expressed as “the snapshot has the recovery point.” Moreover, when a specific journal is applied to a snapshot, the journal that is created by saving a portion of the data to be overwritten, so that the application of the specific journal can be cancelled, is called “corresponding journal” or “saved journal”.
In case of data corruption, an increase in recovery time generally increases both operational downtime and the resulting damage. Therefore, it is important to reduce recovery time and to recover data at any point in time within the time period requested by an administrator (hereinafter expressed as “to satisfy a requested RTO”). In general, a snapshot is taken periodically (e.g., every hour) and multiple generations of snapshots are maintained to reduce recovery time. However, recovery time depends on the amount of journal data to be applied to the snapshot, and the amount of journal data to be applied depends on the actual volume and frequency of writes. A problem here is that if snapshots are taken only at fixed intervals, there can be recovery points for which a requested RTO cannot be satisfied.
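The problem can be illustrated with a simple model. The sketch below assumes that recovery time is proportional to the amount of journal data applied, that journals are applied at a constant rate, and that the worst-case recovery point in each snapshot interval lies just before the next snapshot. The function name, the parameters, and the figures in the example are hypothetical.

```python
def intervals_violating_rto(journal_bytes_per_interval, apply_rate, rto_seconds):
    """Return the indexes of snapshot intervals whose worst case exceeds the RTO.

    journal_bytes_per_interval: journal data written in each interval
        between two consecutive periodic snapshots.
    apply_rate: journal application throughput, in bytes per second
        (assumed constant).
    rto_seconds: requested recovery time objective, in seconds.
    """
    violating = []
    for index, journal_bytes in enumerate(journal_bytes_per_interval):
        # Worst case: every journal of the interval must be applied.
        worst_case_seconds = journal_bytes / apply_rate
        if worst_case_seconds > rto_seconds:
            violating.append(index)
    return violating
```

For instance, with hourly journal volumes of 100, 1000, and 50 bytes, an application rate of 10 bytes per second, and a requested RTO of 60 seconds, only the second interval contains recovery points that cannot satisfy the RTO, even though snapshots were taken at the same fixed period throughout.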