1. Technical Field
The present invention relates in general to a data processing system and, in particular, to a method and system for fast backup and transmission of data. Still more particularly, the present invention relates to a method and system for fast backup and transmission of data to ensure that the data cannot be altered.
2. Description of the Related Art
To protect data holdings from being lost, there is a need for a regular process in which the data is saved or backed up on a data storage medium. This regular process of saving data is often referred to as performing a "backup".
Because of the increasing volumes of data being stored and the more insistent demands for data security, the amount of data being backed up and the frequency of scheduled backups continue to increase in many systems.
Conventionally, backups are performed at times other than the "on-line times", such as at night, and the backed-up data is stored on magnetic tape, magnetic disk, or other media at secure locations. In the event of data being lost, a user can retrieve data from the most recent backup in order to keep the loss as small as possible.
In particular, backups are typically performed during the night shift because the data must not change during a backup. A change in data during a backup typically causes synchronization problems. However, as the volumes of data being backed up are becoming larger and larger, the night is often not long enough to allow all the necessary data to be backed up.
Given the demand for faster backup facilities, the International Business Machines (IBM) RAMAC (Random Access Memory) Virtual Array (IBM RVA) magnetic disk storage system was developed, which has an "IXFP (IBM Extended Facilities Product)/SnapShot™" function, referred to in what follows as "SnapShot".
This SnapShot function allows entire disks to be copied in a very short time, e.g. from seconds to minutes. Once the disks have been copied, then on-line operation can start and all the backups can be made from the copies at any other time.
Details of the SnapShot function can be found in the technical specification relating to SnapShot. For the purposes of the present invention, it is desirable that copies of disks can be made in a very short time by a function such as SnapShot. Since the present invention is not dependent on the SnapShot function as the only method of quickly copying disks, the term "copied disks" will be used in the following description to refer to SnapShot copies and to copies made by other conceivable methods of copying disks.
Copied disks as a basis for backups are disadvantageous in that after a copy is performed by SnapShot it may not be possible for various databases and file systems which utilize particular methods of access and identification, e.g. which rely on the source disk name, to be further processed or saved. For this reason the fast copy performed by SnapShot cannot readily be used as a basis for backups by most operating systems.
Certain operating systems that are enabled to utilize copied disks as a basis for backups change the disk (i.e. volume) identification (referred to below as "VOLID"), thus avoiding "duplicate names". This is the method utilized by, for example, the IBM OS/390 operating system. However, this method of changing the disk VOLID only works because in the OS/390 operating system the files are described or denoted by simple catalog structures.
In addition, while some operating systems, such as OS/390, provide for making changes to the VOLIDs during backups, there are certain crucial drawbacks to making changes to the copied disks as a basis for backups. First, it is only possible to make changes to the copied disks if there are no catalog structures denoting the files or if the catalog structures which do denote them are simple. If the structures are complicated, errors may be caused by making changes to copied disks and the process of making changes is time-consuming. Second, making changes to the copied disks goes against the normal backup philosophy by changing a "frozen", or unchanged, data holding at a later stage. Third, the data holdings on the amended disks could be changed by ordinary applications. Finally, the amended data cannot be copied straight back to the original source disk for disaster recovery. Therefore, it is generally safer for the copied disks to be left in the frozen state.
In particular, with reference to the figures, FIGS. 1A-1C depict illustrative block diagrams of a conventional method for backing up data. First, FIG. 1A depicts copying the disks VOLID1 and VOLID2 by the fastest possible method (e.g. SnapShot). For simplicity's sake, the disk name (VOLID) is changed at the same time as the copy.
Next, as illustrated in FIG. 1B, the descriptors or identifiers of the files are adapted to the amended disk names in the appropriate database and file catalogs, as is currently performed by the IBM OS/390 operating system. The process of adapting the descriptors or identifiers of the files to the amended disk names can be very complicated, utilizes a substantial amount of time, and can lead to errors on the copied disks.
Thereafter, as depicted in FIG. 1C, the backup program can read the copied disks and produce data backups. However, the method of the prior art is disadvantageous in that any applications can read from and write to the copied disks, which is risky and undesirable.
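The conventional sequence of FIGS. 1A-1C can be sketched roughly as follows. This is only an illustrative model, not the OS/390 implementation: the names `CatalogEntry`, `copy_with_rename`, and `adapt_catalog`, the target names COPY01 and COPY02, and the file names are all hypothetical, and a real catalog holds far more than a file name and a VOLID.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    file_name: str
    volid: str  # disk on which the catalog says the file resides


def copy_with_rename(disks, rename):
    """FIG. 1A: copy each disk and give the copy a new VOLID."""
    return {rename[volid]: list(files) for volid, files in disks.items()}


def adapt_catalog(catalog, rename):
    """FIG. 1B: rewrite every entry to the amended VOLID; return the count."""
    changed = 0
    for entry in catalog:
        if entry.volid in rename:
            entry.volid = rename[entry.volid]
            changed += 1
    return changed


# Hypothetical example data.
disks = {"VOLID1": ["PAYROLL", "ORDERS"], "VOLID2": ["STOCK"]}
rename = {"VOLID1": "COPY01", "VOLID2": "COPY02"}

copies = copy_with_rename(disks, rename)
# Catalog as copied along with the disks: it still names the source VOLIDs.
copied_catalog = [CatalogEntry(f, v) for v, files in disks.items() for f in files]
changes = adapt_catalog(copied_catalog, rename)
```

In this model every catalog entry that refers to a renamed disk has to be rewritten before the backup program of FIG. 1C can open the files, so the work grows with the number of cataloged objects and the copy no longer matches the source.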
An example of a Virtual Storage Extended/Enterprise Systems Architecture (VSE/ESA) installation operated by a fairly large user (a typical user of the SnapShot backup method) may contain 20 catalogs each covering 500 files (including database systems) and 200 alternative indices, distributed over some 50 to 100 disks. The average is therefore 75 disks utilized for a backup. Also, paths to the alternative indices and non-VSAM files are recorded. Therefore, the total number of disk changes involved in each backup is 12,640; giving a very clear picture of the amount of time involved in a backup and the risk of data being lost.
Another disadvantage of the prior art method, as depicted in FIG. 1C, is that the amended disks cannot be copied back in the event of a loss from the original (source) disks without first being changed back to the original disk identification.
Therefore, in view of the foregoing, it would be advantageous to provide a method and system for backing up and transmitting data which ensures that the interruptions required for backing up or transmitting data are as short as possible and that the copied data cannot be altered.
In view of the foregoing, it is therefore an object of the present invention to provide an improved data processing system.
It is another object of the present invention to provide a method and system for fast backup and transmission of data.
It is yet another object of the present invention to provide a method and system for fast backup and transmission of data to ensure that the data cannot be altered.
In accordance with the present invention, the fastest possible one-to-one copying of data files which are to be backed up is performed from a source storage medium to a target storage medium. The data files include usable data and management data. After the copying, data held on the source storage medium is accessible to users during the backup. To perform the backup, a record table or synonym list provides access to the files which have been copied. From the record table, management data is temporarily replaced in order to meet access requirements for opening the copied data files at the target storage medium. The copied management data and usable data on the target storage medium remain unchanged. Advantageously, copied data files on the target storage medium can no longer be changed after copying and, apart from the authorized backup program, no application can read the copied data files from the target storage medium.
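One way to picture the synonym-list idea is the minimal sketch below. It rests on several assumptions that are not taken from the specification: disks are modeled as dictionaries, the synonym list maps a source VOLID to a hypothetical device address (DEV100, DEV200), and the temporary replacement is applied to an in-memory open request. The names `one_to_one_copy`, `redirected`, and `backup` are illustrative only, and the restriction of read access to the authorized backup program is not modeled.

```python
from contextlib import contextmanager
from types import MappingProxyType


def one_to_one_copy(disk):
    """Return a read-only copy; management and usable data are copied unchanged."""
    return MappingProxyType({
        "management": MappingProxyType(dict(disk["management"])),
        "files": MappingProxyType(dict(disk["files"])),
    })


@contextmanager
def redirected(request, synonyms):
    """Temporarily point an in-memory open request at the copied disk."""
    original_device = request["device"]
    request["device"] = synonyms[request["volid"]]
    try:
        yield request
    finally:
        request["device"] = original_device  # the substitution is undone


def backup(request, devices, synonyms):
    """Read every file of the copied disk named by the request."""
    with redirected(request, synonyms) as req:
        target = devices[req["device"]]
        # The copy still carries the source VOLID, so the open check passes.
        assert target["management"]["volid"] == req["volid"]
        return dict(target["files"])


# Hypothetical example data.
source = {"management": {"volid": "VOLID1"}, "files": {"PAYROLL": b"v1"}}
devices = {"DEV100": source, "DEV200": one_to_one_copy(source)}
synonyms = {"VOLID1": "DEV200"}  # record table / synonym list
request = {"volid": "VOLID1", "device": "DEV100"}

source["files"]["PAYROLL"] = b"v2"  # on-line work continues on the source
saved = backup(request, devices, synonyms)
```

Under these assumptions the backup sees the state frozen at copy time (`b"v1"`) even though the source has since changed, the request points at the source device again once the backup finishes, and nothing stored on the copy is rewritten at any point.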