In some file systems, such as the New Technology File System (NTFS) supported under Microsoft's Windows™ operating systems, a feature called “reparse points” is provided, which may permit infrequently used or accessed data of a file to be stored in long-term storage (such as tape or optical media) and the file to be replaced with another file that includes information relating to the location of the migrated data. A number of different types of reparse points may be supported natively by the file system, and it may also be possible for applications to generate new types of reparse points to support application-specific features. For example, in environments that employ a hierarchical storage management (HSM) system, files that have not been accessed for a long time may be moved to long-term storage, and a reparse point may be associated with the file name. If an access to the file is then attempted, the file system may examine the reparse point to look up the actual location of the file within the hierarchical file system, and retrieve the file contents from that location in the long-term storage.
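By way of illustration only, the migrate-then-recall behavior described above may be modeled with a minimal in-memory sketch. It is not the NTFS implementation or any actual HSM product; the names `ReparsePoint`, `FileEntry`, `ToyFileSystem`, the `"HSM"` tag, and the `tape:/` location scheme are all hypothetical and introduced solely for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ReparsePoint:
    tag: str       # hypothetical tag naming the owning application, e.g. "HSM"
    location: str  # hypothetical pointer to the data in long-term storage


@dataclass
class FileEntry:
    data: Optional[bytes] = None                   # None once the data has been migrated
    reparse_point: Optional[ReparsePoint] = None   # set only for migrated files


class ToyFileSystem:
    """Toy model: a primary store plus a long-term store keyed by location."""

    def __init__(self) -> None:
        self.files: Dict[str, FileEntry] = {}
        self.long_term: Dict[str, bytes] = {}
        self.recalls = 0  # number of reads that had to go to long-term storage

    def write(self, name: str, data: bytes) -> None:
        self.files[name] = FileEntry(data=data)

    def migrate(self, name: str) -> None:
        """Move the file's data to long-term storage and leave a reparse point."""
        entry = self.files[name]
        if entry.reparse_point is not None:
            return  # already migrated
        location = f"tape:/{name}"  # assumed naming scheme, for illustration
        self.long_term[location] = entry.data
        self.files[name] = FileEntry(
            data=None, reparse_point=ReparsePoint(tag="HSM", location=location)
        )

    def read(self, name: str) -> bytes:
        """Return the contents, consulting the reparse point if one is present."""
        entry = self.files[name]
        if entry.reparse_point is None:
            return entry.data
        self.recalls += 1
        return self.long_term[entry.reparse_point.location]
```

In this sketch, a caller of `read` receives the same bytes whether or not the file has been migrated; the only observable difference is that a migrated file forces a lookup in the long-term store, which the `recalls` counter records.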
Traditional data backup or replication techniques, such as making exact replicas of files and/or directories, may not work well for replicating files that include reparse points or files whose data are retrieved based on reparse points. For example, if an HSM system has placed the contents of a file in long-term storage and associated a reparse point with the file name, and a conventional replication application accesses the file for copying, the complete contents of the file may first be retrieved from the long-term storage, and the retrieved file may then be copied to a replica or backup server. Such a retrieval (before replication) may significantly delay replication, especially for large files, and in some cases users may not even have intended to back up files that have already been archived. Furthermore, the replica server may not be configured to support HSM. In other words, a replication process involving complete retrieval and copying of data related to reparse points may be very slow and data-traffic intensive, thereby negatively impacting the performance of the replication system.
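The cost of such a conventional approach may be illustrated with a short, self-contained sketch. It is a simplified model under stated assumptions, not any actual replication product: `SourceFile` and `naive_replicate` are hypothetical names, and a migrated file is represented simply as one whose resident data is absent while its archived data is held elsewhere.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class SourceFile:
    resident_data: Optional[bytes]         # None when the data has been migrated
    archived_data: Optional[bytes] = None  # stands in for the long-term copy


def naive_replicate(source: Dict[str, SourceFile]) -> Tuple[Dict[str, bytes], int]:
    """Copy every file in full, recalling migrated data before copying it.

    Returns the replica contents and the number of bytes that had to be
    recalled from long-term storage purely to satisfy the copy.
    """
    replica: Dict[str, bytes] = {}
    recalled_bytes = 0
    for name, f in source.items():
        if f.resident_data is not None:
            replica[name] = f.resident_data
        else:
            # A conventional copier reads through the reparse point, which
            # forces retrieval of the complete contents before the copy.
            data = f.archived_data
            recalled_bytes += len(data)
            replica[name] = data
    return replica, recalled_bytes
```

In this model, every migrated file contributes its full size to `recalled_bytes` and is stored in full on the replica, which corresponds to the delay and data traffic described above.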
Thus, to address the above-discussed problems, it is desirable to develop and provide an improved replication process for reparse points that may be implemented on a wide range of operating systems without introducing unnecessary delay in the replication process.