1. Field of the Invention
This invention relates to backup and recovery of data.
2. Related Art
Computer systems typically rely on mass storage systems to store and retrieve data generated or used by the computer system. File servers ("filers") are one such set of computer systems that offer the ability to store and retrieve relatively large amounts of data, and to make the data highly available to clients and client devices that wish to access that data. Generally, filers can use magnetic, magneto-optical, or optical mass storage, so as to provide relatively rapid access to data. These types of storage are relatively fast and reliable, particularly when used with fault-tolerant techniques, such as a RAID (redundant array of independent disks) configuration.
Filers, while relatively reliable, are occasionally subject to corruption or loss of data. Known techniques for addressing this problem include maintaining copies of the data in a separate filesystem, one that hopefully will not lose data simultaneously with the original filesystem. These copies can be maintained either (a) in a second filesystem recorded similarly to the first, that is, on magnetic mass storage; or (b) on a different type of mass storage medium, such as magnetic tape. The first of these known techniques is sometimes called "mirroring"; the second of these known techniques is sometimes called "dump" (or "dump to tape"). The invention described in this application primarily relates to dump operations, but those of ordinary skill in the art will recognize, after perusal of this application, that the principles of the invention can be applied to other techniques for data backup, including mirroring and related techniques.
One problem in the known art is that the amount of storage recorded at filers has increased dramatically over time. It is presently not uncommon for storage in these devices to be measured in trillions of bytes (terabytes), and for a dump operation to take many hours, and to use dozens of tapes for storage of an entire filesystem. A dump operation sometimes fails before completion due to errors or interruptions, such as errors in writing to tape, power failures, filesystem errors at the filer, user interruption (accidental or otherwise), and the like. If there is such an error in the middle of the dump operation (whether due to a problem at the filer or at the tape drive), the entire dump operation is aborted and restarted at the beginning. Restarting at the beginning wastes the effort already expended during the aborted dump operation, and wastes more the further the dump operation had progressed. Moreover, an error during the dump operation becomes more likely as the amount of information being dumped to tape grows and the dump operation takes longer. Thus, the average time it takes to complete a successful dump (after restarting due to errors, and finally running to completion) increases more than linearly with the size of the filesystem.
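The more-than-linear growth described above can be illustrated with a simple model. The following is a minimal sketch under assumptions that are not part of this application: failures are taken to arrive at random at a constant rate (a memoryless model), restarting is taken to cost no extra time, and the uninterrupted dump time is taken to be proportional to the size of the filesystem. The function names and parameters are hypothetical and chosen only for illustration.

```python
import math


def expected_hours_full_restart(dump_hours, failures_per_hour):
    """Expected wall-clock hours to finish a dump that restarts from the
    beginning after every failure.

    Assumes memoryless failures at a constant rate and zero restart
    overhead, which gives the closed form (e^(rate * T) - 1) / rate.
    """
    if failures_per_hour == 0:
        return dump_hours
    return math.expm1(failures_per_hour * dump_hours) / failures_per_hour


def expected_hours_with_commits(dump_hours, failures_per_hour, commit_interval_hours):
    """Expected wall-clock hours when progress is committed at a fixed
    interval, so a failure only repeats the current interval.
    """
    intervals = dump_hours / commit_interval_hours
    return intervals * expected_hours_full_restart(
        commit_interval_hours, failures_per_hour
    )


if __name__ == "__main__":
    rate = 0.05  # assumed: one failure per 20 hours on average
    for hours in (10, 20, 40):
        full = expected_hours_full_restart(hours, rate)
        committed = expected_hours_with_commits(hours, rate, 1.0)
        print(f"{hours:>3} h dump: full restart {full:7.1f} h, "
              f"hourly commits {committed:6.1f} h")
```

Under this model, doubling the uninterrupted dump time more than doubles the expected completion time when every failure forces a restart from the beginning, while the expected time with frequent commit points stays close to proportional to the dump time.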
Accordingly, it would be advantageous to provide a method for the backup of data that is not subject to the limitations of the known art. This advantage is achieved in an embodiment of the invention that allows partial results of the dump to be preserved, and the dump to be restarted at a point relatively close to where the failure occurred.
The invention provides a method and system for performing a dump operation that preserves partial results of an aborted or interrupted dump, and allows restarting the dump from near where it was stopped. Thus, tapes from the original dump, plus tapes from the restarted dump, can be combined to provide a consistent subset of a filesystem. In a preferred embodiment, the dump operation is performed on a consistent recorded snapshot of the filesystem, so that the subset of the filesystem recorded on the tapes is itself consistent. As an emergent consequence, the dump operation is freely interruptible and restartable, and provides a set of tapes that maintain a consistent subset of the filesystem that is transparent to tape-restore operations and other operations to be performed on the filesystem as it was recorded on tape.
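The general idea of preserving committed partial results and resuming near the point of failure can be sketched as follows. This is an illustrative toy, not the mechanism of the preferred embodiment: the snapshot is modeled as an ordered, unchanging list of records, the tape as a Python list, and the commit point as a record count. The names `dump`, `TapeError`, `commit_every`, and `fail_at` are hypothetical.

```python
class TapeError(Exception):
    """Stands in for any failure that interrupts the dump."""


def dump(snapshot, tape, checkpoint, commit_every=2, fail_at=None):
    """Write snapshot records to tape, resuming from the last commit point.

    snapshot:   ordered list of (path, data) records that does not change
                while the dump runs (assumed consistent snapshot).
    tape:       list standing in for the records written so far.
    checkpoint: dict whose "committed" entry counts records known to be
                safely on tape.
    fail_at:    record index at which to simulate a failure (testing only).
    """
    # Anything written after the last commit point is not trusted; drop it.
    del tape[checkpoint["committed"]:]
    for index in range(checkpoint["committed"], len(snapshot)):
        if fail_at is not None and index == fail_at:
            raise TapeError(f"simulated failure at record {index}")
        tape.append(snapshot[index])
        if len(tape) % commit_every == 0:
            checkpoint["committed"] = len(tape)
    # The final partial batch is committed once the dump completes.
    checkpoint["committed"] = len(tape)
    return tape


if __name__ == "__main__":
    snapshot = [("a", b"1"), ("b", b"2"), ("c", b"3"), ("d", b"4"), ("e", b"5")]
    tape, checkpoint = [], {"committed": 0}
    try:
        dump(snapshot, tape, checkpoint, fail_at=3)
    except TapeError as err:
        print("interrupted:", err, "| committed records:", checkpoint["committed"])
    dump(snapshot, tape, checkpoint)  # resume; only uncommitted work repeats
    print("tape matches snapshot:", tape == snapshot)
```

In this sketch, only the records written after the last commit point are repeated on resume, and the combined result is identical to a dump that ran to completion the first time, because both runs read the same unchanging snapshot.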
Those of ordinary skill in the art will recognize, after perusal of this application, the many advantages provided by restartable dump. These include, but are not limited to, the following:
The invention increases the likelihood that a dump is completed in a scheduled time window. For example, if the dump operation is scheduled to begin and end within a selected time window, the ability to continue from relatively near where a failure occurred increases the likelihood that the entire dump will be completed before the time window is over.
The invention increases the effective throughput of the dump operation. If a dump stops in the middle, less time is wasted when the dump is restarted; the dump is able to restart from relatively near where it was aborted or suspended.
The invention reduces the penalty of a failed dump. As noted herein, the penalty for a dump that is aborted or suspended is limited, with restartable dump, to only a small portion of the dump that was not "committed" for restartability.
The invention provides a consistent filesystem recorded on tape. As noted herein, the set of tapes used prior to the dump being aborted or suspended, plus the set of tapes used when the dump is completed, can be combined to form a consistent filesystem. As an emergent consequence, knowledge of whether the dump was ever suspended and resumed, or simply ran to completion the first time, need not be considered by other elements of the filer.
In a preferred embodiment, a restartable dump allows users to manage backup operations; users can freely suspend and resume dump operations when they deem appropriate. The following are some examples of additional capabilities available to users with restartable dump:
Users can suspend and resume dump operations to avoid time windows of relatively heavy traffic for filer requests.
Users can suspend and resume dump operations to avoid contention for other relatively limited filer resources (such as tape drives).
Users can suspend and resume dump operations to avoid contention with other, higher priority, dump operations.
Designers can build other capabilities into the filer that make use of restartable dump, such as making those operations that depend on dump operations also restartable. Examples include dump copy operations and logical replication operations.
The invention has general applicability to recording of structured information (such as a filesystem) on an alternative medium (such as a tape drive) or in an alternative form (such as BSD dump format), not limited specifically to dump operations in a filer. Moreover, techniques used by a preferred embodiment of the invention for recording of structured information (such as a filesystem) can be used in contexts other than the specific applications disclosed herein.