1. Field of the Invention
This invention relates to computer systems and, more particularly, to backup and restoration of data within computer systems.
2. Description of the Related Art
Many business organizations and governmental entities rely upon mission-critical applications that access large amounts of data, often exceeding many terabytes. Numerous different types of storage devices, potentially from multiple storage vendors, with varying functionality, performance and availability characteristics, may be employed in such environments.
Any one of a variety of factors, such as system crashes, hardware storage device failures, software defects, or user errors (e.g., an inadvertent deletion of a file) may potentially lead to data corruption or to a loss of critical data in such environments. In order to recover from such failures, various kinds of backup techniques may be employed. For example, in some storage environments, file-level replication may be employed, where a complete copy of the set of files in one or more file systems at a primary host may be created at a secondary host. Along with the files, copies of file attributes or metadata (e.g., file size, creation time, etc.) may also be stored in the replica. If the primary host fails, or if the file system at the primary host becomes corrupted or unavailable, the files and their attribute values may be recovered or restored by copying from the replica.
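As an illustration only, file-level replication of the kind described above may be sketched as follows. The function names `replicate_tree` and `restore_file` and the directory layout are hypothetical and are not drawn from any particular replication product. The sketch relies on Python's `shutil.copy2`, which copies file contents and attempts to preserve metadata such as modification time; which attributes are actually preserved may vary by platform.

```python
import os
import shutil


def replicate_tree(primary_root, replica_root):
    """Copy every file under primary_root to replica_root, keeping basic attributes."""
    copied = []
    for dirpath, _dirnames, filenames in os.walk(primary_root):
        rel_dir = os.path.relpath(dirpath, primary_root)
        target_dir = os.path.normpath(os.path.join(replica_root, rel_dir))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            # copy2 copies the contents and attempts to preserve metadata
            # (for example, the modification time) where the platform allows.
            shutil.copy2(src, dst)
            copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return sorted(copied)


def restore_file(replica_root, primary_root, rel_path):
    """Recover a single file and its attributes by copying from the replica."""
    dst = os.path.join(primary_root, rel_path)
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    shutil.copy2(os.path.join(replica_root, rel_path), dst)
    return dst
```

In such a sketch, a file lost at the primary host may be recovered by calling `restore_file` with the replica root, the primary root, and the file's relative path.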
Some modern file systems may implement extensibility features that support enhanced functionality (such as the ability to mount volumes, or to transparently use hierarchical storage for seldom-used files as described below) for certain files or directories, beyond the functionality typically provided for ordinary files and directories. Special file system metadata in the form of extensibility records or attributes may be used to identify the files and directories for which the enhanced functionality is supported, and to store configuration information for the extended functionality. Such extensibility records may traditionally not be handled appropriately (or may be ignored) by backup systems for a variety of reasons.
For example, in some versions of file systems (such as NTFS) supported under Microsoft's Windows™ operating systems, a feature called “reparse points” is provided, which may permit file system redirection or special data interpretation. A number of different types of reparse points may be supported natively by the file system, and it may also be possible for applications to generate new types of reparse points to support application-specific features. Two common uses for reparse points in traditional systems include mount points for volumes and migration tags for files. For example, the file system may indicate that a volume is mounted at a particular location (e.g., a directory path) by associating a reparse point with a directory. When an access is attempted to the contents of the directory, the file system may retrieve the reparse point and determine the physical location of the mounted volume so that I/O to the volume may be performed. In addition, in environments that employ a hierarchical storage management (HSM) system, files that have not been accessed for a long time may be moved to archival storage, and a reparse point may be associated with the file name. If an access to the file is then attempted, the file system may examine the reparse point to look up the actual location of the file within the hierarchical file system, and retrieve the file contents from that location. Typical end-users may be unaware of the existence of reparse points, and the attributes or data structures used by the file system to implement the reparse points may not be visible to end-users using traditional file system navigation tools. Special kernel-level entities such as file system filter drivers may be configured to recognize the existence of the reparse points and to take the appropriate actions (such as loading file data from a hierarchical storage management system's archival storage when the file is accessed) for different applications. 
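The redirection behavior described above may be modeled, purely for illustration, with a small in-memory sketch. Everything in it is an assumption made for the example: the `ToyFileSystem`, `Entry`, and `ReparsePoint` classes, the tag string `"HSM"`, and the `archive` dictionary are hypothetical, and the sketch does not reproduce the NTFS on-disk format or any Windows API. A registered handler stands in for a filter driver that recognizes one type of reparse point.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class ReparsePoint:
    tag: str   # hypothetical type label, e.g. "MOUNT_POINT" or "HSM"
    data: str  # type-specific payload, e.g. a volume name or archive location


@dataclass
class Entry:
    content: Optional[bytes] = None
    reparse: Optional[ReparsePoint] = None


class ToyFileSystem:
    def __init__(self):
        self.entries: Dict[str, Entry] = {}
        self.handlers: Dict[str, Callable[[ReparsePoint], bytes]] = {}

    def register_handler(self, tag, handler):
        # Stands in for a filter driver that recognizes one reparse point type.
        self.handlers[tag] = handler

    def read(self, path):
        entry = self.entries[path]
        if entry.reparse is None:
            # Ordinary file: the contents are stored with the entry itself.
            return entry.content
        handler = self.handlers.get(entry.reparse.tag)
        if handler is None:
            raise OSError(f"no handler for reparse tag {entry.reparse.tag!r}")
        # Redirect the access using the data stored in the reparse point.
        return handler(entry.reparse)


# Hypothetical archival storage holding the contents of a migrated file.
archive = {"tape-7/report.doc": b"archived report"}

fs = ToyFileSystem()
fs.entries["/home/notes.txt"] = Entry(content=b"plain notes")
fs.entries["/home/report.doc"] = Entry(reparse=ReparsePoint("HSM", "tape-7/report.doc"))
fs.register_handler("HSM", lambda rp: archive[rp.data])
```

In this model, a read of `/home/report.doc` is transparently satisfied from the archive, while a read of the same entry on a system with no `"HSM"` handler registered raises an error.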
Extensibility features similar to reparse points may be supported by a number of file systems and other storage management services used with a variety of operating systems.
Traditional backup techniques, such as making exact replicas, may not work well for storage objects that have such extensibility features enabled. For example, if an HSM system has placed the contents of a file in archival storage and associated a reparse point with the file name, and a conventional replication manager accesses the file for copying, an attempt to read the contents of the file from archival storage may result. Such a retrieval may significantly delay replication, especially for large files, and in some cases users may not even have intended to back up files that have already been archived. Furthermore, the secondary host or replication target may not be configured to support HSM. If the reparse point is recreated at the replica and an attempt to access the replica of the file is made, a lack of adequate HSM support may result in failures or in unpredictable behavior. Similar problems may arise in backing up storage objects with other kinds of extensibility features enabled. One response to these problems in some traditional backup systems has been to avoid backing up objects that have the extensibility features enabled. However, ignoring or avoiding backing up the objects may result in incomplete restoration capabilities: e.g., it may not be possible to fully restore a source set of objects (e.g., files and directories of a file system) to the state they were in prior to the backup if information on the extensibility features is not retained.
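The incomplete restoration that may result from skipping such objects can be illustrated with a minimal sketch. The dictionary layout, the tag strings, and the function names `backup_skipping_extended` and `restore` are hypothetical and chosen only for the example; they do not describe any particular backup system.

```python
def backup_skipping_extended(source):
    """Traditional approach: copy only objects with no extensibility record."""
    return {
        path: obj["content"]
        for path, obj in source.items()
        if obj.get("reparse") is None
    }


def restore(backup):
    """Rebuild a source set from the backup; skipped objects cannot be recreated."""
    return {
        path: {"content": content, "reparse": None}
        for path, content in backup.items()
    }


# Hypothetical source set: one ordinary file, one archived file, one mount point.
source = {
    "/data/a.txt": {"content": b"alpha", "reparse": None},
    "/data/old.log": {"content": None, "reparse": ("HSM", "archive/old.log")},
    "/data/vol": {"content": None, "reparse": ("MOUNT_POINT", "volume-2")},
}

backup = backup_skipping_extended(source)
restored = restore(backup)

# Objects that existed in the source but are absent after restoration.
missing = sorted(set(source) - set(restored))
```

Here `missing` lists the archived file and the mount point: because no information about their extensibility records was retained, the restored set cannot match the state of the source prior to the backup.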