There is a need in virtually all database systems to periodically produce a backup copy, or dump, of the contents of the stored information in the system. This is because the stored information in databases typically changes over time. A backup copy of the contents of a database at some point in time assures that the database can be restored to the state existing at that point in time in case of a malfunction that corrupts the database.
The classic method of producing a database backup is to establish a system state in which modifications to the database are disallowed and then to copy the entire database information directly onto a backup disk or tape. Unfortunately, in many types of databases this method incurs a large expense in terms of system unavailability to users. For example, according to an article by Kaunitz and Ekert appearing in the Australian Computer Journal, Vol. 13, No. 4, November 1981, entitled "Database Backup-The Problem of Very Large Databases," the time and resources required to back up databases larger than one billion bytes are viewed as critical problems by the database users.
A similar problem can occur in a small distributed database system if a backup of the distributed database is to be produced on a single resource, such as a disk. If the communication channels between the sites of the distributed database are slow or congested, the real time required to take a full backup of every database site may approach that required for a large centralized database. Thus, the disadvantages inherent in the classic method of dumping may be present even in a small distributed system.
The above article by Kaunitz and Ekert surveys several techniques that have been proposed to alleviate problems of database backup. According to the article, some systems provide an elaborate dynamic backup mechanism whereby database backups are made entirely in the presence of normal user processing, including ongoing modification of the database information. In these techniques, since a backup completed at any given time usually does not represent a consistent state of the database, other means must be used to place a database reloaded from a backup file in a logically consistent state. A typical method of doing this is to maintain an audit trail of modifications to the database as the modifications occur. After the database has been reloaded from a backup file, the audit trail must be reprocessed from the earliest point in time reflected by the backup file to the time that the database was corrupted. Then the effects of modifications that were in progress at the time of failure must be removed by some process. Obviously, this is an expensive and complicated procedure.
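The recovery sequence just described (reload, reprocess the audit trail, then remove the effects of in-progress modifications) can be illustrated with a minimal Python sketch. It is not taken from the Kaunitz and Ekert article. The record layout (`AuditRecord`), the function name `recover`, and the use of before and after images are assumptions made for illustration only. The sketch further assumes that `backup_start_seq` is no later than the first audit record of any transaction that was still active when the dump began.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class AuditRecord:
    """One hypothetical audit trail entry (layout assumed for illustration)."""
    seq: int                      # position in the audit trail
    txn: int                      # transaction that produced the record
    kind: str                     # "write" or "commit"
    key: Optional[str] = None     # item modified (writes only)
    before: Optional[str] = None  # value before the write; None if the item was absent
    after: Optional[str] = None   # value after the write


def recover(backup: Dict[str, str],
            trail: List[AuditRecord],
            backup_start_seq: int) -> Dict[str, str]:
    """Rebuild a logically consistent database from a dump taken during updates."""
    db = dict(backup)  # step 1: reload the (possibly inconsistent) backup

    # Only records from the earliest point reflected by the backup are needed.
    replay = [r for r in trail if r.seq >= backup_start_seq]
    committed = {r.txn for r in replay if r.kind == "commit"}

    # Step 2: reprocess the audit trail in order. Writing after images is
    # idempotent, so it does not matter which writes the dump already saw.
    for r in replay:
        if r.kind == "write":
            db[r.key] = r.after

    # Step 3: remove the effects of transactions still in progress at the
    # time of failure, newest write first, by restoring before images.
    for r in reversed(replay):
        if r.kind == "write" and r.txn not in committed:
            if r.before is None:
                db.pop(r.key, None)
            else:
                db[r.key] = r.before
    return db


if __name__ == "__main__":
    trail = [
        AuditRecord(1, 1, "write", "a", "0", "1"),
        AuditRecord(2, 1, "write", "b", "2", "3"),
        AuditRecord(3, 1, "commit"),
        AuditRecord(4, 2, "write", "a", "1", "9"),  # transaction 2 never commits
    ]
    # The dump captured "a" after record 1 but "b" before record 2.
    fuzzy_backup = {"a": "1", "b": "2"}
    print(recover(fuzzy_backup, trail, backup_start_seq=1))
```

Even in this reduced form, recovery needs the complete audit trail from the start of the dump onward and makes two passes over it, which suggests why the approach is regarded as expensive.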
Kaunitz and Ekert also mention a differential backup technique as an alternative to the classic backup technique. The database is viewed as consisting of a number of pages of fixed size. A full backup is initially taken of the database. Thereafter, as modifications are made to the database, bits are set in a bit map to mark the pages that are affected. Eventually a differential backup file is created by saving only those pages that have been modified, as indicated by the bit map. Periodically, this differential backup file is updated with the pages that have changed since the last differential backup. At some point, the differential backup file is either merged with the full backup file or a new full backup is taken.
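The bookkeeping behind the differential technique can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not the implementation surveyed in the article: the class `PagedDatabase`, the function `restore`, the in-memory page list, and the tiny `PAGE_SIZE` are all hypothetical, and a list of booleans stands in for the bit map.

```python
from typing import Dict, List

PAGE_SIZE = 4  # bytes per page; deliberately tiny for illustration


class PagedDatabase:
    """Hypothetical database modeled as fixed-size pages plus a bit map."""

    def __init__(self, num_pages: int) -> None:
        self.pages: List[bytes] = [bytes(PAGE_SIZE) for _ in range(num_pages)]
        self.dirty: List[bool] = [False] * num_pages  # the "bit map"

    def write_page(self, page_no: int, data: bytes) -> None:
        if len(data) != PAGE_SIZE:
            raise ValueError("a page must be exactly PAGE_SIZE bytes")
        self.pages[page_no] = data
        self.dirty[page_no] = True  # mark the affected page

    def full_backup(self) -> List[bytes]:
        """Copy every page and clear the bit map."""
        self.dirty = [False] * len(self.pages)
        return list(self.pages)

    def differential_backup(self, differential: Dict[int, bytes]) -> Dict[int, bytes]:
        """Add the pages changed since the last differential backup to the file."""
        for page_no, is_dirty in enumerate(self.dirty):
            if is_dirty:
                differential[page_no] = self.pages[page_no]
                self.dirty[page_no] = False
        return differential


def restore(full: List[bytes], differential: Dict[int, bytes]) -> List[bytes]:
    """Reload the full backup, then merge in the differential file."""
    pages = list(full)
    for page_no, data in differential.items():
        pages[page_no] = data
    return pages


if __name__ == "__main__":
    db = PagedDatabase(4)
    full = db.full_backup()
    db.write_page(1, b"AAAA")
    diff = db.differential_backup({})    # first differential: page 1 only
    db.write_page(3, b"BBBB")
    diff = db.differential_backup(diff)  # updated differential: pages 1 and 3
    assert restore(full, diff) == db.pages
```

In this model the differential file holds only the pages written since the full backup, and rebuilding the database requires both the full backup and the differential file.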
The differential method above also has certain disadvantages. First, a separate differential backup file must be maintained in addition to a full backup file. In the event that the database must be rebuilt, the full backup must be reloaded and then merged with the differential file. Second, the method does not solve the problem of creating backup files simply and cheaply while also reducing or eliminating the time that a system is unavailable for updates during a backup process.