1. Field of the Invention
The present invention relates generally to database backups, and in particular to a method and system for performing incremental backups of a SQL server database.
2. Description of the Related Art
Database systems, such as a Microsoft® structured query language (SQL) server database system, contain vast amounts of information generated and used by a variety of software applications. Because of the importance of the information stored in databases, a database system provides mechanisms to back up and restore the databases managed by that system. A backup mechanism may create a complete copy of the database, and the restore mechanism may overwrite the database with the copy. A database system may also be capable of playing back transactions to restore a database. Such a database system may log all transactions and use those transactions to restore a database to a particular state by starting at a known prior state and applying transactions that occurred after that state.
A database system may also provide a snapshot mechanism that allows the state of a database to be preserved in a “snapshot.” Typically, taking a snapshot is a precursor to performing a full backup. Performing a snapshot is a way to minimize the disruption to the SQL server, as compared to performing a streaming backup. Microsoft SQL server supports snapshot backup and restore technologies. A snapshot backup is a specialized backup that is created almost instantaneously. A snapshot may be created by various well-known techniques, including copy-only backup, split-mirror, specialized hardware that creates a copy of a storage device, and other methods. Snapshot backups may minimize or eliminate the use of the resources of the SQL server to accomplish the backup. This may allow the availability of the SQL server to be minimally impacted by performing a backup operation.
Database systems utilize snapshots for a variety of reasons. One typical use of snapshots is to copy a database without disabling access to the database for a long period of time. After performing the snapshot, the database system can then copy the database by leveraging the snapshot of the database. Thus, the database system performs a full backup of the primary database when the primary database is active. In general, a snapshot records the state of the database at a certain point in time. That is, the snapshot may be used to provide a point-in-time image of a live database. Additional operations can then be performed using the snapshot copy without affecting the performance of the live database.
Another goal, besides reducing the downtime of an active SQL server database, is to reduce backup storage utilization. To achieve this goal, differential backups are often performed instead of full backups. Performing differential backups takes advantage of a common characteristic of SQL server databases—typically, only a small percentage of the data in the database changes in between consecutive backups. With only part of the data actually changing between backups, it increases the efficiency of the backups to perform a differential backup instead of a full backup. The time necessary to complete the backup may be reduced, limiting the downtime of the database, and the amount of data stored in the backup may also be reduced, decreasing the amount of storage required to store the backup.
SQL server maintains a bitmap with information on which extents have changed since the last full backup. This bitmap is called the differential change map (DCM), and SQL server uses the DCM to perform differential backups. The DCM tracks the extents that have changed since the last full database backup. Extents are a collection of eight physically contiguous pages and may be used to efficiently manage pages. The DCM is a bitmap where each bit represents a single extent. The bitmap is organized such that if the bit for an extent is 1, then the extent has been modified since the last full backup and if the bit for an extent is 0, then the extent has not been modified.
SQL server also maintains allocation maps to record the allocation of extents to objects in the database. One of the allocation maps is the index allocation map (IAM). The IAM contains information about the extents that a table or index uses. Another of the allocation maps is the global allocation map (GAM). The GAM contains information about which extents have been allocated. Another of the allocation maps is the shared global allocation map (SGAM). The SGAM tracks mixed extents that have at least one unused page. There is also a way to track free space, called the page free space (PFS) pages. The PFS pages record the allocation status of each page, such as whether an individual page has been allocated and the amount of free space on each page.
A common technique used for administering a SQL database is to preallocate extra space to the database to give it room to expand. When a database file is mounted by a SQL server, and when the size of the database file needs to increase beyond the size allocated to it to accommodate new transactions, it is a time-consuming process to increase the size of the database. Therefore, it is customary for the size of the SQL database to be much larger than the amount of data it currently stores. However, when performing a full backup, all of the space allotted to the database, even if it is not being used, will be backed up, thus increasing the size of backups. Also, if a table or other object has been deleted from the database, a full backup will still back up all the space that the deleted table or object took up.
Differential backups may be used by SQL server to reduce the size of the backups, but differential backups have some drawbacks. For example, a differential backup must be restored to a live instance of the database, which will result in downtime of the database. SQL server also support log backups to backup a list of transactions that have occurred; like differential backups, log backups need to be played back to a live instance of the database during restoration. It would be preferable to prepare a full copy of a point-in-time database file for restoration without requiring any assistance from the SQL server.
Another way to reduce the size of database backups, other than performing differential or log backups, is to perform incremental backups. However, SQL server currently does not support incremental backups of the database. Therefore, what is needed is a way to perform an incremental backup making use of the snapshot and data tracking mechanisms maintained by SQL server, while also preserving the ability to do a fast, efficient restoration from the incremental backup.
In view of the above, methods and mechanisms for performing incremental backups of a SQL database are desired.