1. Field of the Invention
The present invention relates generally to backing up a data set, and in particular to a method and system for performing incremental backups of a data set to facilitate efficient restoration of data.
2. Description of the Related Art
Companies and organizations must manage and store ever-growing amounts of data. As the amount of data generated and stored within an organization escalates, the time and space needed to back up all of that data also increase. Consequently, organizations are looking for backup methods that take less time and use less storage space. One way to perform backups more efficiently is to use incremental backups.
A typical backup process will start with a full backup of the data. After the full backup, the company can perform incremental backups to reduce the amount of data being stored and the backup processing time. An incremental backup involves storing only the data that is new or has changed in a data set since the last full or incremental backup. Many incremental backups can be performed in succession between full backups. Using incremental backups can reduce the amount of data that is stored, but it may complicate the restoration process.
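The selection step of an incremental backup can be illustrated with a minimal sketch. It assumes the data set is modeled as a mapping from item names to byte contents and that changes are detected by comparing SHA-256 digests. The names `fingerprint`, `incremental_backup`, and `last_seen` are hypothetical and chosen only for illustration; deleted items are not tracked here.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 digest used to detect whether an item has changed."""
    return hashlib.sha256(content).hexdigest()


def incremental_backup(data_set: dict[str, bytes],
                       last_seen: dict[str, str]) -> dict[str, bytes]:
    """Return only the items that are new or changed since the previous backup.

    `last_seen` maps item names to the digest recorded at the previous full or
    incremental backup, and is updated in place (an illustrative assumption).
    """
    changed = {}
    for name, content in data_set.items():
        digest = fingerprint(content)
        if last_seen.get(name) != digest:
            changed[name] = content      # new or modified item
            last_seen[name] = digest     # remember state for the next backup
    return changed
```

Called with an empty `last_seen`, the function returns every item, which corresponds to the initial full backup. Each later call returns only the items whose contents differ from the recorded state.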
Another method of performing backups more efficiently is to employ a deduplication process to reduce the storage of duplicate data. Data deduplication is often utilized in backup storage applications, which generally benefit the most from it because they must repeatedly back up an existing file system. Typically, most of the files within the file system will not change between consecutive backups, and therefore do not need to be stored again.
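As a rough illustration of why recurrent backups benefit from deduplication, the following sketch stores each distinct content only once, keyed by its SHA-256 digest. The class `DedupStore` and its methods are hypothetical; whole-item deduplication is assumed for simplicity, whereas practical systems often deduplicate at the level of smaller segments.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical contents are kept only once."""

    def __init__(self) -> None:
        self.chunks: dict[str, bytes] = {}  # digest -> stored content

    def put(self, content: bytes) -> str:
        """Store content if it is not already present and return its digest."""
        digest = hashlib.sha256(content).hexdigest()
        self.chunks.setdefault(digest, content)  # no-op for duplicate data
        return digest

    def backup(self, data_set: dict[str, bytes]) -> dict[str, str]:
        """Back up a data set and return a catalog of item name -> digest."""
        return {name: self.put(content) for name, content in data_set.items()}
```

Backing up a two-item data set twice, with one item changed in between, leaves three stored contents instead of four, because the unchanged item is referenced by both catalogs and stored once.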
While creating incremental backups in a deduplication-based storage system may decrease storage utilization, it may also make restoration operations more complex and less efficient. An incremental backup does not contain all of the data items that exist within a data set, and so restoring data following an incremental backup may involve processing multiple incremental backups. This is true whether or not the stored data is deduplicated. Restoration operations tend to be most efficient when performed following a full backup, in which all of the data items from a data set are backed up in a single process to a single image and are identified in a single catalog. What is needed in the art is a system for performing incremental backups in a deduplication system with minimal data movement while also enabling fast restore operations as if restoring from a full backup.
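The restoration cost described above can be made concrete with a small sketch. It assumes each backup is modeled as a mapping from item names to contents, and the function name `restore_from_chain` is hypothetical. Restoring after a series of incremental backups requires reading the full backup and then every incremental backup in order, whereas restoring from a full backup reads a single image. Deleted items are ignored in this simplified model.

```python
def restore_from_chain(full_backup: dict[str, bytes],
                       incrementals: list[dict[str, bytes]]) -> dict[str, bytes]:
    """Rebuild the latest state of a data set from a full backup plus a chain
    of incremental backups, applied oldest first."""
    restored = dict(full_backup)       # start from the full image
    for increment in incrementals:     # one pass per incremental backup
        restored.update(increment)     # newer versions overwrite older ones
    return restored
```

The number of backups that must be processed grows with the length of the chain, which is the inefficiency that restoring from a single full image avoids.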
In view of the above, improved methods and mechanisms for performing incremental backups to allow efficient restoration operations are desired.