Computer systems often perform data backups on computer files to enable recovery of lost data. To maintain the integrity of the backed-up data, a backup process must accurately back up either all files or all files modified since the most recent backup. A backup program copies each file that is identified as a candidate for backup from an on-line storage device to a secondary storage device. On-line storage devices are configured from one or more disks into logical units of storage space referred to herein as "containers". Containers are created and maintained by a software entity called the "container manager". Each type of container on the system has an associated driver which processes system requests on that type of container. After a complete backup operation, the backup program verifies the backed-up files to make sure that the files on the secondary storage device (usually a tape) were correctly backed up. One problem with the backup process is that files may change during the backup operation.
To avoid backing up files modified during the backup process and to enable applications to access files during the backup operation, the container manager periodically (e.g., once a day) performs a procedure that takes a "snapshot" or copy of each read-write container, whereby the container manager creates a read-only container which looks like a copy of the data in the read-write container at a particular instant in time. Thereafter, the container manager performs a "copy-on-write" procedure in which an unmodified copy of data in the read-write container is copied to a read-only backup container whenever there is a request to modify data in the read-write container that has not been modified since the snapshot. The container manager uses the copy-on-write method to maintain the snapshot and to enable backup processes to access and back up an unchanging, read-only copy of the on-line data as it existed at the instant the snapshot was created.
During the backup procedure, the container manager creates a "snapshot" container, a "snapshotted" container and a "backing store" container. After the container manager takes the snapshot, the snapshotted container driver processes all input/output (I/O) requests to store data in or retrieve data from a read-write container. The snapshotted container driver processes all I/O requests to retrieve data from the read-write container by forwarding them directly to the read-write container driver. However, for all I/O requests to modify data in a read-write container, the container manager first determines whether the requested block of data has been modified since the time of the snapshot. If the block has not been modified, the container manager copies the data to the backing store container and then sets the associated flag in a modified-bit-map table. The modified-bit-map table contains a bit map with each bit representing one block of data in the read-write container. After the modified-bit-map flag has been set, the snapshotted container driver forwards the I/O storage request to the read-write container driver.
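The write path described above can be illustrated with a minimal sketch. It is not the container manager's actual implementation: the class name SnapshottedDriver, its methods, and the use of Python lists to stand in for containers are all assumptions made for illustration. The backing store is given the same number of blocks as the read-write container, so that block i always maps to slot i.

```python
class SnapshottedDriver:
    """Hypothetical model of the snapshotted container driver (illustration only)."""

    def __init__(self, read_write):
        self.read_write = read_write          # list of blocks: the read-write container
        n = len(read_write)
        self.backing_store = [None] * n       # same size: block i maps to slot i
        self.modified = [False] * n           # modified-bit-map: one flag per block

    def read(self, block):
        # Retrieval requests are forwarded directly to the read-write container.
        return self.read_write[block]

    def write(self, block, data):
        if not self.modified[block]:
            # First modification since the snapshot: preserve the old contents,
            # then set the flag so later writes skip the copy.
            self.backing_store[block] = self.read_write[block]
            self.modified[block] = True
        # Forward the storage request to the read-write container.
        self.read_write[block] = data


driver = SnapshottedDriver(["a0", "b0", "c0", "d0"])
driver.write(1, "b1")
driver.write(1, "b2")  # second write to block 1: no further copy is made
print(driver.backing_store)  # [None, 'b0', None, None]
```

Only the first write to a block triggers a copy, so the backing store always holds the block's contents as of the snapshot rather than any intermediate version.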
When the backup process begins execution, it issues I/O retrieval requests to the snapshot container. A file system, which is a component of the operating system, translates the file-oriented I/O request into a logical address and forwards the request to a snapshot container driver. The snapshot container driver checks the associated bit in the modified-bit-map table for the requested block of data. If the bit is set, the snapshot container driver forwards the request to the backing store container driver to retrieve the unmodified copy of that block from the backing store container. The backing store container driver then processes the backup process retrieval request. If the bit is not set, the block has not been modified since the snapshot was created, and the snapshot container driver forwards the request to the read-write container driver to retrieve a copy of that block of data from the read-write container. Upon retrieving the file from the backing store container or the read-write container, the backup process backs it up. After a complete backup operation, the container manager deletes the snapshotted container, the snapshot container, the backing store container, and the modified-bit-map table, and thereafter forwards all I/O requests directly to the read-write container driver.
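The read path taken by the snapshot container driver can be sketched in the same spirit. The function name snapshot_read and the list-based containers are hypothetical, chosen only to show how the bit for a block selects between the backing store container and the read-write container.

```python
def snapshot_read(block, modified, backing_store, read_write):
    """Return the contents of `block` as they were when the snapshot was taken.

    Hypothetical helper: `modified` plays the role of the modified-bit-map,
    and the two lists stand in for the backing store and read-write containers.
    """
    if modified[block]:
        # Block changed after the snapshot: the preserved copy is in the backing store.
        return backing_store[block]
    # Block untouched since the snapshot: the read-write container still holds it.
    return read_write[block]


# Example state: block 1 has been overwritten once since the snapshot.
read_write = ["a0", "b1", "c0"]
backing_store = [None, "b0", None]
modified = [False, True, False]

snapshot_view = [
    snapshot_read(i, modified, backing_store, read_write)
    for i in range(len(read_write))
]
print(snapshot_view)  # ['a0', 'b0', 'c0']
```

The backup process therefore sees the data exactly as it stood at the snapshot, even though block 1 of the read-write container now holds newer contents.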
The problem with the current copy-on-write process is that the read-write container and the backing store container must be the same size to maintain a fixed mapping between the read-write container blocks and the copied backing store container blocks. Because usually only a small amount of the on-line data is modified between backup operations, the present copy-on-write process utilizes storage space inefficiently. Therefore, it is an object of the present invention to provide a system that allows copy-on-write procedures to be performed on a backing store container that is smaller than the read-write container while ensuring that the read-write container blocks are accurately mapped to the copied backing store container blocks.
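The storage inefficiency of the fixed, same-size mapping can be made concrete with a short calculation. The function name and the block counts below are purely illustrative assumptions, not figures from any actual system.

```python
def backing_store_utilization(total_blocks, modified_blocks):
    """Fraction of a same-size backing store that actually holds copied blocks."""
    if total_blocks <= 0:
        raise ValueError("total_blocks must be positive")
    if not 0 <= modified_blocks <= total_blocks:
        raise ValueError("modified_blocks must be between 0 and total_blocks")
    return modified_blocks / total_blocks


# Hypothetical numbers: a 1,000,000-block read-write container in which
# 20,000 blocks are modified between two backup operations.
total = 1_000_000
changed = 20_000
used = backing_store_utilization(total, changed)
print(f"{used:.1%} used, {total - changed} blocks reserved but idle")
# 2.0% used, 980000 blocks reserved but idle
```

Under these assumed numbers, nearly all of the backing store container is allocated solely to preserve the one-to-one block mapping.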