Data network technology permits multiple users to economically share access to files in a number of file servers. Problems arise, however, in the assignment of files to particular servers. For example, it may be desirable to move a file from one file server to another when a new server is added to the network. A “CDMS” brand of data migration service provided by EMC Corporation of Hopkinton, Mass., can be used to move files from one file server to another while permitting concurrent client access to the files.
In a typical use of the EMC Corporation “CDMS” brand of data migration service, respective IP addresses are assigned to file systems. In order to migrate a file system from a source file server to a target file server, the IP address of the file system is reassigned from the source file server to the target file server, the file system is then mounted as a read-only file system in the source file server, and then the migration service is invoked in the target file server. The migration service in the target file server then begins a background process of using a protocol such as NFS, CIFS or HTTP to copy the files and directories from the read-only file system in the source file server to a temporary migration inode data structure in the target file server. Concurrently, the migration service in the target file server responds on a priority basis to client requests for access to a directory or file in the file system. To do so, it checks the migration inode data structure to determine whether the directory, file, or portion of the file to be accessed has already been copied from the source file server to the target file server. If so, it accesses the directory or file in the target file server. If not, it fetches the directory or file from the source file server, stores it in the migration inode data structure in the target file server, and then accesses it in the target file server. Once the entire file system has been copied from the source file server to the target file server, the migration inode data structure is converted to a conventional inode data structure for the file system in the target file server, and then the read-only version of the file system is deleted from the source file server.
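The interplay between the background copy and on-demand fetching can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the class `MigratingFileSystem`, its methods, and the use of dictionaries in place of the source file system and the migration inode data structure are hypothetical and are not taken from the “CDMS” service itself. It also migrates whole files only, not portions of a file.

```python
class MigratingFileSystem:
    """Toy model of a target server that migrates files while serving clients.

    Hypothetical illustration; names and structure are assumptions.
    """

    def __init__(self, source):
        # `source` stands in for the read-only file system on the source server.
        self.source = dict(source)
        # `migrated` stands in for the temporary migration inode data structure.
        self.migrated = {}

    def _fetch(self, path):
        # Copy one file from the source to the target unless already copied.
        if path not in self.migrated:
            self.migrated[path] = self.source[path]

    def read(self, path):
        # Client access takes priority: fetch on demand, then serve locally.
        self._fetch(path)
        return self.migrated[path]

    def write(self, path, data):
        # Fetch first if the file exists on the source, then modify the
        # target copy only; the source remains read-only.
        if path in self.source:
            self._fetch(path)
        self.migrated[path] = data

    def background_step(self):
        # Copy one not-yet-migrated file; return False when none remain.
        for path in self.source:
            if path not in self.migrated:
                self._fetch(path)
                return True
        return False

    def is_complete(self):
        # True once every source file has a copy on the target.
        return all(path in self.migrated for path in self.source)
```

In this model a client read of a file that has not yet been copied triggers an immediate fetch, while repeated calls to `background_step()` eventually copy everything else, after which the target could discard the source.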
The EMC Corporation “CDMS” brand of data migration service can be used to migrate a file system within the same file server. In other words, a read-only file system can be copied within the same file server while permitting concurrent read-write access to the copy of the file system. In addition, the EMC Corporation “CDMS” brand of data migration service can migrate specified files of a file system, or a specified extent of data within a file.
Files are also often moved between file servers in order to relocate infrequently accessed files from expensive, high-speed disk storage to more economical but slower mass storage. When a client needs read-write access to a file in the mass storage, the file typically is moved back to the high-speed disk storage, and then accessed in the high-speed disk storage. This kind of migration of files between levels of storage, in response to client requests and based on file attributes such as the time of last file access and last file modification, is known generally as policy-based file migration or more specifically as hierarchical storage management (HSM). It is desired for such policy-based or hierarchical storage management to be transparent to the clients, yet in practice there is always a trade-off between cost of storage and delay in file access.
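A policy of this kind can be sketched as a simple selection over file attributes. The following is a minimal illustration only: the `FileInfo` record, the function `select_for_demotion`, and the single idle-time threshold are assumptions made for the example, not a description of any particular HSM product.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    path: str
    last_access: float    # seconds since the epoch
    last_modified: float  # seconds since the epoch


def select_for_demotion(files, now, max_idle_seconds):
    """Return paths of files idle longer than the threshold.

    A file counts as idle when both its last access and its last
    modification are older than `max_idle_seconds` relative to `now`.
    """
    return [
        f.path
        for f in files
        if now - max(f.last_access, f.last_modified) > max_idle_seconds
    ]
```

A lower threshold moves more files to the economical storage and so reduces cost, at the price of more frequent delays when a demoted file is accessed again.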
In a system employing hierarchical storage management, when a file or data blocks of a file are moved from a primary file server to a secondary file server, the file in the primary file server is typically replaced with a stub file that contains attributes of the file and a link to the new file location in the secondary file server. The stub file can be accessed to redirect an access request from a client to the new file location in the secondary server, or to migrate data from the present file location back to the primary file server. This stub file can be a symbolic link file in a UNIX-based file system, or a shortcut file in a Microsoft WINDOWS file system. In a computer using the Microsoft WINDOWS operating system, access to a stub file may automatically result in access to the new file location. For example, an attempt to execute or open a shortcut will cause the Microsoft WINDOWS operating system to execute or open the target of the shortcut.
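The stub mechanism can be sketched with ordinary files. In this hypothetical example the stub is a small JSON document holding a few attributes and the new location, rather than a symbolic link or shortcut; the functions `migrate_to_secondary` and `read_through_stub`, and the stub layout, are assumptions chosen for portability and are not drawn from any specific system.

```python
import json
import os
import shutil


def migrate_to_secondary(primary_path, secondary_dir):
    """Move a file to secondary storage and leave a stub in its place."""
    st = os.stat(primary_path)
    target = os.path.join(secondary_dir, os.path.basename(primary_path))
    shutil.move(primary_path, target)
    # The stub records file attributes and a link to the new location.
    stub = {
        "stub": True,
        "size": st.st_size,
        "mtime": st.st_mtime,
        "location": target,
    }
    with open(primary_path, "w") as f:
        json.dump(stub, f)


def read_through_stub(primary_path):
    """Read a file, redirecting to secondary storage if it is a stub."""
    with open(primary_path, "rb") as f:
        data = f.read()
    try:
        stub = json.loads(data)
    except ValueError:
        return data  # not JSON, so an ordinary file
    if isinstance(stub, dict) and stub.get("stub") is True:
        with open(stub["location"], "rb") as f:
            return f.read()
    return data
```

A real implementation would mark stubs in file system metadata rather than in file contents, since this sketch would misread an ordinary file that happens to contain a matching JSON object.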
A snapshot copy of a production file system contains the state of the production file system at a respective point in time when the snapshot copy is created. A snapshot copy facility can create a snapshot copy without any substantial disruption to concurrent read-write access to the production file system. Such a snapshot copy facility, for example, is described in Keedem U.S. Pat. No. 6,076,148 issued Jun. 13, 2000, incorporated herein by reference, and in Armangau et al., U.S. Pat. No. 6,792,518, incorporated herein by reference. Snapshot copies have been used for a variety of data processing and storage management functions such as storage backup, transaction processing, and software debugging.
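One common way to create a snapshot without copying the whole file system up front is copy-on-first-write, sketched below. This is an illustrative assumption about technique, not a summary of the cited patents; the class `ProductionVolume` and its block-list model are hypothetical.

```python
class ProductionVolume:
    """Toy block volume with copy-on-first-write snapshots (hypothetical)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def create_snapshot(self):
        # A new snapshot starts empty: nothing is copied at creation time,
        # so read-write access to the production data is not interrupted.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index, data):
        # Before overwriting a block, save its old contents into every
        # snapshot that has not yet saved that block.
        for saved in self.snapshots:
            if index not in saved:
                saved[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, snap_id, index):
        # Use the saved block if one exists; otherwise the block has not
        # changed since the snapshot and the production copy is valid.
        saved = self.snapshots[snap_id]
        if index in saved:
            return saved[index]
        return self.blocks[index]
```

Each snapshot therefore costs storage only for blocks that change after it is created, while still presenting the state of the volume at its creation time.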
Users are becoming less tolerant of delays in accessing their data, and even less tolerant of corruption of their data. Therefore, there has been a continuing interest in improving data availability and the effectiveness of recovery procedures. For example, after recovery, the integrity of the recovered file system is checked, and if a defect is found, an attempt is made to correct it. In addition, it is often possible to recover some of the data that was written to the production file system since the creation of the latest read-only version, for example, by replay of a log from an application program.