Businesses and other entities often store data (e.g., customer lists, financial transactions, and business documents) in files of a file system. File systems, in turn, are usually stored on one or more logical data volumes. The present invention will be described with reference to a file system that is stored on a single data volume, it being understood that the present invention should not be limited thereto.
File memory space is divided into extents for storing data of respective files of a file system. Each extent is defined by a starting address in the file memory space and an extent length. File extents are typically divided into extent blocks of equal size. Each extent stores or is able to store a portion of data of an associated file. In the configuration where the file system is stored on a data volume, each file extent block is mapped to a respective block of the data volume. While it is said that each extent block of a file stores data, the data is stored in a respectively allocated block of the data volume.
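The relationship between extents, extent blocks, and volume blocks can be illustrated with a minimal sketch. The block size, the `Extent` class, the `map_extent` helper, and the volume block numbers below are hypothetical names and values chosen for illustration; they are not taken from any particular file system.

```python
from dataclasses import dataclass

BLOCK_SIZE = 4096  # assumed extent block size in bytes (illustrative only)


@dataclass(frozen=True)
class Extent:
    start: int   # starting address in the file memory space, in bytes
    length: int  # extent length in bytes, a multiple of BLOCK_SIZE here

    def block_addresses(self):
        """Return the starting address of each equal-size block in the extent."""
        return list(range(self.start, self.start + self.length, BLOCK_SIZE))


def map_extent(extent, free_volume_blocks):
    """Map each extent block address to a distinct volume block number."""
    addresses = extent.block_addresses()
    if len(addresses) > len(free_volume_blocks):
        raise ValueError("not enough free volume blocks")
    return dict(zip(addresses, free_volume_blocks))


# A three-block extent whose blocks are backed by volume blocks 17, 42, and 43.
extent = Extent(start=8192, length=3 * BLOCK_SIZE)
mapping = map_extent(extent, free_volume_blocks=[17, 42, 43])
```

In this sketch the extent block at address 8192 is said to store file data, but the data itself would reside in volume block 17.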
Data, including data stored in files, is susceptible to corruption. Programming errors in a poorly developed software application may inadvertently corrupt data in the files upon which the application operates. Further, users often unwittingly delete or overwrite important data. Data corruption can be devastating to a business, particularly one that relies heavily on electronic commerce. Recognizing the importance of maintaining reliable data, businesses and other entities typically employ backup and restore systems to protect themselves against unexpected data corruption. The present invention will be described with reference to restoring a file system using backup copies of a data volume upon which the file system is stored, it being understood that the present invention should not be limited thereto.
Backup systems can create a copy (i.e., backup copy) of a data volume at a point-in-time. One method of creating a backup copy is to copy the data from the volume to one or more magnetic tapes. If data is subsequently corrupted, the volume containing the corrupted data can be replaced in its entirety with the most recently created backup copy or copies. Once the volume is replaced, everything that happened to it since creation of the backup copy (including the event that caused the data corruption) is discarded, and the state of operations (as reflected in the data) is restored to that point-in-time.
Backup operations create backup copies that may be either full or incremental. A full backup copy usually means all data within the volume is copied at the time of backup. An incremental backup usually means that only those blocks of the volume that have changed since some previous event (e.g., a prior full backup or incremental backup) are copied. The backup window (i.e., the time allotted) to complete a full backup, however, tends to be much longer than the backup window for an incremental backup. When backup windows are required to be small, incremental backup is preferable since, in most cases, the number of blocks of the data volume that change between backups is very small compared to the number of blocks in the entire data volume. If backups are done daily or even more frequently, it is not uncommon for less than 1% of blocks of a volume to change between backups. An incremental backup operation in this case copies less than 1% of the data that a full backup operation would copy and uses a correspondingly small fraction of the input/output (IO) resources needed between the data volume and the backup tapes.
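The difference in copied data can be sketched as follows. This is a simplified model under stated assumptions: the volume is a Python list of blocks, changed block numbers are tracked in a set, and `full_backup` and `incremental_backup` are hypothetical helper names rather than functions of any real backup product.

```python
def full_backup(volume):
    """Copy every block of the volume, keyed by block number."""
    return dict(enumerate(volume))


def incremental_backup(volume, changed_blocks):
    """Copy only the blocks recorded as changed since the previous backup."""
    return {n: volume[n] for n in sorted(changed_blocks)}


# A hypothetical 1000-block volume.
volume = [b"blk%d" % n for n in range(1000)]
full = full_backup(volume)

# Three blocks are written after the full backup; their numbers are tracked.
changed = set()
for n in (3, 250, 999):
    volume[n] = b"new%d" % n
    changed.add(n)

incr = incremental_backup(volume, changed)
fraction = len(incr) / len(full)  # share of blocks the incremental copies
```

Here the incremental backup holds 3 of 1000 blocks, which is under the 1% figure mentioned above.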
Incremental backup appears to be the preferred mode of protecting important data. And so it is, until a full restore of all blocks of a data volume is needed. A full restore from incremental backups first requires restoring the corrupted data volume from the newest full backup copy, followed by restores from each incremental backup created after that full backup, applied in order from oldest to newest. That can require a lot of magnetic tape handling performed by, for example, an automated robotic handler. In contrast, a restore using just a full backup copy is generally simpler and more reliable than a restore from a combination of full and incremental backup copies.
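A minimal sketch of such a restore, assuming each backup copy is represented as a mapping from block number to block data and that incremental copies are applied oldest first. The `restore` function and the sample data are hypothetical.

```python
def restore(full, incrementals):
    """Rebuild a volume from a full backup plus later incremental backups.

    `incrementals` must be ordered oldest first so that newer block
    contents overwrite older ones.
    """
    volume = dict(full)          # start from the newest full backup copy
    for incr in incrementals:    # then apply each incremental in order
        volume.update(incr)
    return volume


full = {0: b"a0", 1: b"b0", 2: b"c0"}
incrementals = [
    {1: b"b1"},               # first incremental after the full backup
    {1: b"b2", 2: b"c2"},     # second, newer incremental
]

restored = restore(full, incrementals)

# Applying the same copies in the wrong order leaves stale data in block 1.
misordered = restore(full, list(reversed(incrementals)))
```

Every incremental copy has to be read, and in the right order, which is why this path involves more tape handling and more opportunities for error than a restore from a single full copy.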
While backup and restore systems are useful, they present a number of disadvantages. As noted, backup copies are typically created during backup windows. During backup windows, read and write access to the data volume is denied while the volume is being backed up to one or more magnetic tapes. Additionally, even if a backup copy is created at the top of every hour, a data corruption event that occurs at 12:59 would require the data volume to be restored to the backup copy created at 12:00, and all valid modifications of the data volume entered between 12:00 and 12:59 may be lost.
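The size of the exposure window in the hourly example can be worked through in a short sketch. It assumes backups are created at exact multiples of the backup interval counted from midnight; the calendar date and the `last_backup_before` helper are arbitrary and hypothetical.

```python
from datetime import datetime, timedelta


def last_backup_before(event, interval):
    """Return the most recent backup time at or before `event`,
    assuming backups run at multiples of `interval` since midnight."""
    midnight = event.replace(hour=0, minute=0, second=0, microsecond=0)
    completed = (event - midnight) // interval  # whole intervals elapsed
    return midnight + completed * interval


corruption = datetime(2024, 1, 1, 12, 59)   # arbitrary date, 12:59
restore_point = last_backup_before(corruption, timedelta(hours=1))
lost_window = corruption - restore_point    # modifications at risk
```

With an hourly schedule the restore point is 12:00, so up to 59 minutes of valid modifications fall inside the lost window.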