Conventionally, files and directories in a storage subsystem can be backed up with file-level operations. File-level backups build individual files and directories on backup storage (e.g., tapes) by going through a file system, which typically employs hierarchical storage structures. File-level backup techniques back up data on a file-by-file basis, because a file is the smallest addressable unit of data that the backup software can handle. File-level backup techniques and protocols generally have limited backup performance due to various file system overheads. For example, a backup operation for small files, dense directories, or fragmented file locations generally involves small reads and random disk access, which in turn incur a significant file system overhead.
Further, with file-level backup techniques, the files often have to be backed up in a certain order, such as inode-based ordering or directory-tree-based ordering. For each file, file-level backup techniques have to back up the data from the beginning to the end. The constraints imposed by this ordering limit performance. For example, the dump format of Berkeley Software Distribution (BSD) further imposes strict ordering constraints among files, as well as among the data blocks of a file. A “block”, in this context, is the smallest amount of contiguous data that can be addressed by a file system.
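The file-by-file, ordered traversal described above can be illustrated with a minimal sketch. It is not the BSD dump format or any particular product's behavior: the function name `file_level_backup`, the record header layout, and the 64 KiB read size are hypothetical and chosen only for illustration.

```python
import os


def file_level_backup(root, backup):
    """Copy every file under `root` into the writable binary stream `backup`.

    Files are visited in directory-tree order, and each file is read
    through the file system from its beginning to its end.
    """
    files_copied = 0
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # fix the order in which subdirectories are visited
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root).encode("utf-8")
            size = os.path.getsize(path)
            # Hypothetical record header: path length, path, file size.
            backup.write(len(rel).to_bytes(4, "big") + rel + size.to_bytes(8, "big"))
            with open(path, "rb") as f:
                # One file at a time, start to end, in small reads.
                while chunk := f.read(65536):
                    backup.write(chunk)
            files_copied += 1
    return files_copied
```

Every file costs at least one open, one metadata lookup, and one or more reads, which is where the per-file overhead for small files and dense directories comes from.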
Additionally, file-level backup techniques are often unable to provide a sufficient data input rate to a tape drive, which causes a shoe-shining effect to occur. The shoe-shining effect occurs during tape reads or writes, when the data transfer rate falls below the minimum rate at which the tape drive heads are designed to transfer data to a running tape. When the shoe-shining effect occurs, the tape drive stops, rewinds the tape, accelerates again to the proper speed, and continues writing from the same position. The shoe-shining effect significantly reduces backup performance.
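The cost of the shoe-shining effect can be made concrete with a toy calculation. The model below is an assumption made for illustration, not a description of any real drive: data is written in bursts of `burst_bytes`, and whenever the input rate is below the minimum streaming rate, each burst is followed by a stop, rewind, and restart penalty of `reposition_seconds` during which no data is transferred. All parameter names are hypothetical.

```python
def effective_tape_rate(input_rate, min_stream_rate, burst_bytes, reposition_seconds):
    """Return the effective write rate (bytes per second) under a toy model.

    Assumes the drive streams continuously when input_rate is at or above
    min_stream_rate, and otherwise pays one repositioning penalty per burst.
    """
    if input_rate <= 0 or burst_bytes <= 0:
        raise ValueError("input_rate and burst_bytes must be positive")
    if input_rate >= min_stream_rate:
        # The tape keeps running, so throughput follows the input rate.
        return input_rate
    # Time for one burst of data to arrive from the backup source.
    arrival_seconds = burst_bytes / input_rate
    # Each burst is followed by a stop/rewind/accelerate cycle.
    return burst_bytes / (arrival_seconds + reposition_seconds)
```

Under these assumptions, an input rate of 50 units per second against a 100 units per second minimum, with 100-unit bursts and a 2-second repositioning penalty, gives an effective rate of 25 units per second, half of what the source actually supplies.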
Other problems with file-level backups also exist. For example, file-level backups do not preserve metadata used by the storage system. Although a restore operation will restore user data, it cannot restore the metadata in the original volume. Loss of the metadata may result in loss of the functionality that users may have on the original volume.
Another type of backup technique is block-level backup, also called image-based backup. Block-level backup techniques generally allow for better performance than file-level backups. A block-level backup creates a backup image in a backup storage facility by using blocks, rather than files, as the smallest addressable unit of the backup software (a file typically includes numerous blocks). An example of a product which can perform block-level backup and restore is the SNAPMIRROR® TO TAPE™ software made by NETAPP®, Inc. of Sunnyvale, Calif. In general, block-level backup and restore can be performed faster than file-level backup and restore, because a block-based backup operation does not need to go through a file system in order to create or restore a backup. Further, reads at the block level are performed sequentially in terms of physical blocks on disk, which reduces latency.
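For contrast with file-level backup, a block-level copy can be sketched as a single sequential pass over the volume. This is a minimal illustration only and does not represent the product named above: the function name `block_level_backup` and the 4096-byte block size are assumptions, and the volume is modeled as any readable binary stream.

```python
import io

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def block_level_backup(volume, backup, block_size=BLOCK_SIZE):
    """Copy `volume` to `backup` block by block, in physical order.

    The copy never interprets file or directory structure; it only
    reads fixed-size blocks sequentially and appends them to the image.
    """
    blocks_copied = 0
    while True:
        block = volume.read(block_size)
        if not block:
            break  # end of the volume
        backup.write(block)
        blocks_copied += 1
    return blocks_copied


# Usage: an in-memory stand-in for a 16 KiB volume.
volume = io.BytesIO(bytes(range(256)) * 64)
image = io.BytesIO()
copied = block_level_backup(volume, image)
```

Because the loop has no notion of files, the resulting image is an exact copy of the volume's blocks, but nothing in it identifies which blocks belong to which file.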
A disadvantage of known block-level backup techniques, however, is that they do not provide the ability to restore only a single selected file or selected files from a backup image. This is because the backup software is not aware of the file structure of the data in the backup image. Consequently, with known block-level backup techniques it is necessary to restore the entire backup image (e.g., an entire volume), including all files contained in it, even if the user only wants to restore a single file from that image. This is a very resource-intensive process and, depending on the size of the backup image, it can take a long time to complete (hours or even days). In addition, known block-level backup techniques do not provide the ability to create and restore from an incremental backup.
Further, known block-level backup techniques, such as those associated with network file system (NFS) or common Internet file system (CIFS), are client-side-only (local) backup techniques. On the other hand, client-server backup protocols such as network data management protocol (NDMP) are designed to support file-level backup only and thus do not have the ability to perform block-level backups or restores.