The present invention relates to a backup control apparatus and method for backing up a large amount of processed data onto a medium such as a magnetic tape and, more particularly, to a backup control method and apparatus which combine a backup of all data with subsequent backups of only the updating data as differences, and which use those backup data.
Recently, in data processing systems using computer networks, such as those for banking and insurance operations, the number of installed direct access storage devices (DASD) such as magnetic disk units has been increasing, and the backup time has been increasing in association with the rapid increase in data amount. Generally, the backup is executed at night, when the work load is small. However, as the data amount increases, the backup time becomes so long that it obstructs the on-line work of the next day.
According to a conventional general backup method, each time a backup process is executed, all data on a magnetic disk unit as a backup target is copied onto a magnetic tape by using, for example, a magnetic tape unit. In the following explanation, a magnetic disk of the magnetic disk unit in which data as a backup target has been stored is called a backup target medium. A magnetic tape which is obtained by copying data of a magnetic disk unit by a magnetic tape unit is called a backup destination medium.
However, according to the conventional backup, since it is necessary to copy all of the target data at every backup, the backup time increases with an increase in data amount. As a backup method to reduce the backup time, there is what is called a differential backup method, which backs up only the updated data without backing up the data which was not updated. According to the differential backup method, it is necessary to combine and use a backup of all data (hereinafter referred to as a "whole backup") and a backup of only the updating data which was updated after the whole backup (hereinafter referred to as a "differential backup"). Further, there are the following three types of differential backup:
I. Accumulating type
II. Non-accumulating type
III. Common using type of both of the above types

ID number of the backup target medium, identifier indicative of a format of backup, and ID number of the backup destination medium have been stored. As an identifier indicative of the backup format, an identifier indicative of the whole backup, non-accumulating type differential backup, or an accumulating type differential backup is used.
The backup method differs depending on which backup is used as a reference in order to discriminate the updated positions. That is, according to the accumulating type, the updating data is discriminated by using the previous whole backup as a reference. According to the non-accumulating type, the updating data is discriminated by using the immediately preceding backup, whether whole or differential, as a reference. For example, in the case of repeating the backup in a cycle of Monday to Saturday, according to the differential backup of the accumulating type, a whole backup tape is first obtained on the first Monday. On Tuesday, a differential backup tape of only the data which was updated is obtained by using the whole backup of Monday as a reference. On Wednesday, a differential backup tape of the data which was updated is similarly obtained by using the whole backup of Monday as a reference. The above processes are repeated until Saturday. As a result, a differential backup in which the updating data is accumulated day by day through the week is executed. On the other hand, according to the differential backup of the non-accumulating type, a whole backup tape is obtained on the first Monday, and on each of the remaining days from Tuesday to Saturday a differential backup tape of only the data which was updated is obtained by using the backup of the previous day as a reference. The recovery process using such a differential backup method requires, for the accumulating type, the whole backup tape and the newest differential backup tape. For the non-accumulating type, the whole backup tape and all of the differential backup tapes obtained up to the recovery time point are used to reconstruct the data.
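The Monday-to-Saturday example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `DAYS`, `differential_contents`, and `tapes_needed`, the block numbers, and the model of a medium as a set of block numbers are all hypothetical and do not appear in the text.

```python
# Hypothetical model: the whole backup is taken on Monday, and each later
# day's updates are given as a set of updated block numbers.
DAYS = ["Tue", "Wed", "Thu", "Fri", "Sat"]

def differential_contents(updates, accumulating):
    """Return, per day, the blocks copied to that day's differential tape."""
    contents = {}
    since_whole = set()  # everything updated since Monday's whole backup
    for day in DAYS:
        todays = updates.get(day, set())
        since_whole |= todays
        # Accumulating: reference is the whole backup.
        # Non-accumulating: reference is the previous day's backup.
        contents[day] = set(since_whole) if accumulating else set(todays)
    return contents

def tapes_needed(recovery_day, accumulating):
    """Return the tapes required to recover the state as of recovery_day."""
    idx = DAYS.index(recovery_day)
    if accumulating:
        return ["Mon(whole)", recovery_day]   # whole + newest differential
    return ["Mon(whole)"] + DAYS[: idx + 1]   # whole + every differential

updates = {"Tue": {1, 2}, "Wed": {2, 3}, "Thu": {7}}
print(differential_contents(updates, accumulating=True)["Wed"])
print(differential_contents(updates, accumulating=False)["Wed"])
print(tapes_needed("Thu", accumulating=True))
print(tapes_needed("Thu", accumulating=False))
```

With the accumulating type, Wednesday's tape also holds Tuesday's updates, so recovery needs only two tapes. With the non-accumulating type, each tape holds one day's updates, so recovery needs every tape up to the recovery point.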
The accumulating type process and the non-accumulating type process have the following advantages and drawbacks. According to the accumulating type process, although the data amount and the processing time at the backup time increase gradually as the differential backup is repeated, the management of the backup media and the recovery process are easy. On the other hand, according to the non-accumulating type process, since only the newest updating data is processed, the data amount and the processing time at the backup time do not increase; however, the number of backup destination media gradually increases as the differential backup is repeated, and the medium management and the recovery process are complicated.
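The trade-off described above can be made concrete with a small calculation. The following is a minimal sketch, assuming that each backup occupies exactly one destination medium and that backup cost is proportional to the number of blocks copied; the function name `daily_costs` and the sample block numbers are hypothetical.

```python
def daily_costs(updates_per_day, accumulating):
    """For each differential backup after the whole backup, return a tuple
    (blocks_copied, media_needed_for_recovery).

    Assumption (not from the text): one destination medium per backup.
    """
    costs = []
    since_whole = set()  # blocks updated since the whole backup
    for n, updated in enumerate(updates_per_day, start=1):
        since_whole |= updated
        if accumulating:
            copied = len(since_whole)  # grows as updates accumulate
            media = 2                  # whole + newest differential
        else:
            copied = len(updated)      # only the newest updates
            media = 1 + n              # whole + all n differentials
        costs.append((copied, media))
    return costs

updates = [{1, 2}, {3}, {4, 5}, {6}]
print(daily_costs(updates, accumulating=True))
print(daily_costs(updates, accumulating=False))
```

In the accumulating case the first element of each tuple grows while the second stays at two. In the non-accumulating case the first element stays small while the second grows by one per backup.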
However, in the backup method using the conventional differential backup process, a work manager or the like who executes the backup manually judges and determines each time whether the whole backup or the differential backup is to be executed. Such a judgment inevitably depends on his experience, so that it is difficult to correctly select the optimum backup. In the conventional differential backup process, since one or more differential backup destination media are needed in addition to the whole backup destination medium at the recovery time, the recovery takes a longer time than the ordinary recovery process in which only the whole backup destination medium is used. Particularly, since the work is interrupted during the recovery, it is necessary to reduce this time.
Further, in the conventional differential backup process, since the information as to which portion of the backup target medium was backed up as updating data is recorded only in the backup destination medium, whether the data is necessary can be judged only after the backup destination medium has been read out at the recovery time. There is, consequently, a problem such that in the case of an unnecessary backup destination medium, time and effort are wasted.
Further, in the recovery process using the conventional differential backup destination medium, the data of the whole backup destination medium is first written and, after that, the updating data of the differential backup destination medium is written. Due to this, in the case where most of the data covered by the whole backup was updated and backed up again by the differential backup, assuming that the number of backup destination media as an input of the recovery is set to (n), a recovery time of about (n) times that of the recovery using only the whole backup destination medium is needed.
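The estimate above can be written out as a rough relation. The symbols are introduced here only for illustration and are not in the original text: T_rec is the total recovery time, T_whole is the time to write back the whole backup destination medium, and T_diff,i is the time to write back the i-th differential backup destination medium, with (n) media in total including the whole backup.

```latex
T_{\mathrm{rec}} = T_{\mathrm{whole}} + \sum_{i=1}^{n-1} T_{\mathrm{diff},i},
\qquad
T_{\mathrm{diff},i} \approx T_{\mathrm{whole}}
\;\Rightarrow\;
T_{\mathrm{rec}} \approx n \, T_{\mathrm{whole}}
```

The right-hand case corresponds to the situation where nearly all of the data was updated, so that each differential medium holds about as much data as the whole backup.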
Further, in the case where the effective region for updating the data of the backup target medium changes or decreases during the period from the whole backup process to the differential backup process, the data which was once written back into the region before the change or reduction must be erased at the time of recovery in order to protect the confidentiality of the changed or decreased portion, so that the recovery takes a long time.
In addition, in the recovery process using the conventional differential backup, since the recovery is executed by sequentially reading out the whole backup destination medium and the differential backup destination media one by one, the data to be written out at the recovery time is scattered, so that, for example, even if there are a plurality of input/output apparatuses which can be used at the recovery time, the maximum performance of the hardware cannot be obtained.