The invention relates generally to computer systems, and deals more particularly with a computer system for managing access to files in a primary repository and backing-up the files to a backup repository.
Many computer systems include a file manager program and a backup program or utility. The file manager program controls storage of files (or other data) in a primary repository and manages requests by application programs to access the stored files. In a shared file system, the file manager permits multiple application programs to access the same file. The primary repository may take the form of a set of direct access storage device (DASD) disks, and data for a single file may reside on one or more of these DASD disks. The backup utility is responsible for backing up, i.e. copying, the files from the primary repository to the backup repository to safeguard the files from logical or physical damage. The backup repository may take the form of a magnetic tape.
The backup procedure often requires a substantial amount of time, minutes or even hours, to copy the data to the backup repository, depending on the amount of data to be copied and the operating speed of the storage device. A previously known file manager activates a lock on the files during the backup period to prevent any application program from updating the files. This is important to ensure that the backed-up files are "consistent", i.e. represent a "snapshot" of a set of files at the point in time when the backup began. After the backup is complete, the lock is deactivated. While this approach provides consistency, it delays any application program that requires write access to the files during the backup period. Another previously known file manager permits any application program to update the files during the backup period. This approach creates no delays for the application programs but risks inconsistency in the backup copy.
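The trade-off between the two previously known approaches can be illustrated with a small simulation. The following is a minimal sketch only, not the code of any actual file manager: the function `run_backup`, the file names `index` and `ledger`, and the assumption that a single update request arrives after the first file has been copied are all hypothetical and chosen purely for illustration.

```python
def run_backup(primary, pending_update, lock_during_backup):
    """Copy files one at a time to a backup copy.

    Hypothetical scenario: one update request arrives after the first
    file has been copied. With the lock active the update is deferred
    until the backup completes; without the lock it is applied at once.
    """
    backup_copy = {}
    deferred = []
    for index, name in enumerate(sorted(primary)):
        backup_copy[name] = primary[name]
        if index == 0:
            if lock_during_backup:
                deferred.append(pending_update)  # writer must wait
            else:
                pending_update(primary)          # writer proceeds immediately
    for update in deferred:
        update(primary)                          # lock deactivated, apply updates
    return backup_copy


def update_both(files):
    """An application update that changes two related files together."""
    files["index"] = "v2"
    files["ledger"] = "v2"


locked_primary = {"index": "v1", "ledger": "v1"}
locked_backup = run_backup(locked_primary, update_both, lock_during_backup=True)

unlocked_primary = {"index": "v1", "ledger": "v1"}
unlocked_backup = run_backup(unlocked_primary, update_both, lock_during_backup=False)

print(locked_backup)    # both files from the same point in time
print(unlocked_backup)  # a mix of versions
```

In this sketch the locked backup holds both files at version "v1", at the cost of delaying the writer, whereas the unlocked backup holds `index` at "v1" and `ledger` at "v2", a combination that never existed in the primary repository at any single point in time.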
A previously known Unix Plan 9 file manager operates as follows to back up a file directory. All the files are stored on disk and initially referenced by a first directory. At a predetermined time, such as five o'clock PM every day, all directories (but not the files) are backed-up, i.e. a second directory is defined which points to the same files as the first directory. This is the extent of the backup procedure.
The Unix Plan 9 file manager also maintains a historical copy of each file in the following manner. Whenever a request is made to update a file in the first directory, the file is opened and a copy or shadow of the file is made without the update to serve as a historical copy. This requires time and DASD storage. The first directory continues to point to the original file and the second directory is made to point to the "historical" shadow file. Then, the update is made to the file corresponding to the first directory, and the updated file is closed. During the update period (which is short because the update is made to another location on the same disk and not to tape), any application program can access the historical shadow copy of the file via the second directory.
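The directory backup and shadow-copy steps described above can be sketched as follows. This is a hedged illustration under stated assumptions, not the Plan 9 implementation: the `Store` class and the functions `snapshot_directory` and `update_file` are hypothetical names, and the check that makes the shadow copy only once per backed-up directory is an assumption added so that a later update does not overwrite the historical copy.

```python
class Store:
    """Hypothetical disk: maps a location number to file contents."""

    def __init__(self):
        self.blocks = {}
        self.next_location = 0

    def allocate(self, data):
        location = self.next_location
        self.next_location += 1
        self.blocks[location] = data
        return location


def snapshot_directory(first_dir):
    """Back up the directory only: the new directory points to the same files."""
    return dict(first_dir)


def update_file(store, first_dir, second_dir, name, new_data):
    """Make a historical shadow copy, then update the original file in place."""
    original = first_dir[name]
    # Assumption: the shadow is made only while both directories still
    # share the file, i.e. on the first update after the directory backup.
    if second_dir.get(name) == original:
        second_dir[name] = store.allocate(store.blocks[original])
    store.blocks[original] = new_data


store = Store()
first_dir = {"report": store.allocate("old contents")}
second_dir = snapshot_directory(first_dir)   # no file data is copied here

update_file(store, first_dir, second_dir, "report", "new contents")

print(store.blocks[first_dir["report"]])    # current file via the first directory
print(store.blocks[second_dir["report"]])   # historical copy via the second directory
```

The sketch makes the cost visible: the directory backup itself copies only pointers, but every first update to a file copies the whole file to a new disk location.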
Another previously known file manager permits application programs to update files while the files are being backed-up and ensures a consistent backup copy. This file manager operates as follows. When a backup is initiated, the file manager writes a copy of all the files to be backed-up to tape. During the backup period, any application program can update the files in the primary repository. After the backup is complete, the file manager scans the primary file repository directory to determine if any updates were made to the files that were backed up. If so, the file manager again backs-up the updated files to tape. This process is repeated a finite number of times or until a scan reveals no new updates. This technique is inefficient for two reasons. First, it always requires a complete copy of the files to be backed-up. Second, if updates occur during the backup procedure, it may require repeated reading of the primary file repository and multiple file backups to the backup repository.
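The rescan procedure described above can be sketched as a loop. The following is a minimal sketch under stated assumptions rather than the actual file manager: the function `backup_with_rescan`, the per-file version counter used to detect updates, the `updates_per_pass` schedule that simulates application activity, and the default limit of five passes are all hypothetical.

```python
def backup_with_rescan(primary, tape, updates_per_pass, max_passes=5):
    """Back up every file, then re-copy files updated during each pass.

    primary:          name -> (version, data); version is a hypothetical
                      counter that the scan uses to detect updates.
    tape:             list receiving (name, version, data) records.
    updates_per_pass: simulated updates arriving during pass 0, 1, ...
    Returns the number of passes written to tape.
    """
    backed_up_version = {}
    pending = sorted(primary)          # the first pass copies everything
    passes = 0
    while pending and passes < max_passes:
        for name in pending:
            version, data = primary[name]
            tape.append((name, version, data))
            backed_up_version[name] = version
        # Simulation only: updates that arrived while this pass was running.
        if passes < len(updates_per_pass):
            for name, data in updates_per_pass[passes]:
                primary[name] = (primary[name][0] + 1, data)
        passes += 1
        # Scan the primary directory for files changed since their last copy.
        pending = [n for n in sorted(primary)
                   if primary[n][0] != backed_up_version.get(n)]
    return passes


primary = {"a": (1, "a1"), "b": (1, "b1"), "c": (1, "c1")}
tape = []
passes = backup_with_rescan(primary, tape, updates_per_pass=[[("b", "b2")]])
print(passes, len(tape))
```

In this run a single update to file `b` during the first pass forces a second pass, so the tape receives four records for three files, and each pass ends with a full scan of the primary directory.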
A previously known IBM VM/SP 6 operating system and associated file manager operate as follows to provide a consistent view of data objects within a file stored in DASD and permit other application programs to update the data objects while the file is being read. When each file is opened for reading, the file manager makes a copy of all pointers from the file to all data objects within the file. Then, the reader (application program) proceeds to read the data objects. If another application program requests an update to one of the data objects during the reading process, then the file manager copies the data object for which the update is requested into RAM, and this other application program makes the updates to the copy in RAM and requests that the updates be committed. Then, the file manager writes the updated copy to a new location in DASD, and one set of pointers on DASD is changed to point to the new location. While this technique is effective in providing consistent reading and updating by other application programs, it requires copying of each pointer to each data object within the file which is opened for reading, and there can be thousands of data objects and respective pointers in each file. Also, this technique is limited to providing consistency at the file level only.
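The pointer-copy technique described above can be illustrated as follows. This is a simplified sketch under stated assumptions, not the VM/SP file manager: the `FileManager` class, its method names, and the representation of DASD as a dictionary of numbered locations are hypothetical.

```python
class FileManager:
    """Hypothetical file manager that snapshots pointers when a file is opened."""

    def __init__(self):
        self.dasd = {}         # location -> contents of one data object
        self.pointers = {}     # file name -> current list of locations
        self.next_location = 0

    def _write(self, data):
        location = self.next_location
        self.next_location += 1
        self.dasd[location] = data
        return location

    def create(self, name, objects):
        self.pointers[name] = [self._write(obj) for obj in objects]

    def open_for_read(self, name):
        # Every pointer is copied, however many data objects the file has.
        return list(self.pointers[name])

    def read(self, reader_pointers):
        return [self.dasd[location] for location in reader_pointers]

    def update(self, name, index, change):
        ram_copy = self.dasd[self.pointers[name][index]]  # copy object into RAM
        updated = change(ram_copy)                        # updater edits the copy
        # Commit: write to a new location and redirect the file's current
        # pointer; the old location is left intact for existing readers.
        self.pointers[name][index] = self._write(updated)


manager = FileManager()
manager.create("data", ["x", "y"])

reader_view = manager.open_for_read("data")   # reader's private pointer copy
manager.update("data", 1, str.upper)          # concurrent update to one object

print(manager.read(reader_view))                     # reader's consistent view
print(manager.read(manager.open_for_read("data")))   # view after the update
```

The reader continues to see the data objects as they were when the file was opened, but the cost of `open_for_read` grows with the number of pointers in the file even when no update occurs.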
A general object of the present invention is to provide a file management and backup system which permits application programs to update a file while the file is being backed-up, yet provides a consistent backup copy and minimizes overhead associated with the backup and the storage required in memory (RAM) and the primary repository.
Another object of the present invention is to provide a file management and backup system of the foregoing type which minimizes overhead and memory burden when only one file of a set or one data object in a file is updated during backup of the entire file set or entire file, respectively.