Network data storage is typically provided by an array of disk drives integrated with a large semiconductor cache memory. A file server is used to interface the cached disk array to the network. The file server maps network files to logical block addresses of storage in the cached disk array and moves data between network clients and the storage in the cached disk array. The file server may use a network block services protocol in a configuration process in order to export to the network client logical volumes of the network-attached storage, which become local pseudo-disk instances. See, for example, Jiang et al., Patent Application Publication US 2004/0059822 A1 published Mar. 25, 2004, entitled “Network Block Services for Client Access of Network-Attached Storage in an IP Network,” incorporated herein by reference. Network clients typically use a network file system access protocol to access one or more file systems maintained by the file server.
Data network technology permits multiple users to economically share access to files in a number of file servers. Files are also often moved between file servers in order to relocate infrequently accessed files from feature-rich, expensive, and highly-protected high-speed disk storage to more economical and possibly slower mass storage. In such a system, the high-speed disk storage is referred to as primary storage, and the mass storage is referred to as secondary storage. When a client needs read-write access to a file in the secondary storage, the file typically is moved back to the primary storage, and then accessed in the primary storage. This kind of migration of files between levels of storage, performed in response to client requests and based on file attributes such as the time of last file access and the time of last file modification, is known generally as policy-based file migration.
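By way of illustration only, the selection step of such a policy can be sketched as follows. The names `FileAttrs` and `select_for_migration` and the single idle-time threshold are assumptions made for this sketch and are not taken from any particular file server.

```python
from dataclasses import dataclass


@dataclass
class FileAttrs:
    """Hypothetical per-file attributes consulted by the migration policy."""
    path: str
    last_access: float   # time of last file access, seconds since epoch
    last_modify: float   # time of last file modification, seconds since epoch


def select_for_migration(files, now, max_idle_seconds):
    """Return paths of files neither accessed nor modified within the idle window.

    A file qualifies when the more recent of its access and modification
    times is older than max_idle_seconds relative to now.
    """
    return [
        f.path
        for f in files
        if now - max(f.last_access, f.last_modify) > max_idle_seconds
    ]
```

A real policy engine would typically combine several attributes and thresholds; a single idle window is used here only to keep the example short.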
In a data processing network employing policy-based file migration, a client typically accesses a primary file server containing the primary storage, and the secondary storage is often in another file server, referred to as a secondary file server. When a file is moved from a primary file server to a secondary file server, the file in the primary file server is typically replaced with a stub file that contains attributes of the file and a link to the new file location in the secondary file server. The stub file can then be accessed to locate and read the file data in the secondary storage in response to client read and write requests.
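The stub-file arrangement described above can be sketched, under simplifying assumptions, as follows. Here the primary and secondary storage are modeled as plain dictionaries, and `StubFile`, `migrate`, `read_file`, and the `secondary-fs` server name are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class StubFile:
    """Hypothetical stub left on primary storage after migration."""
    name: str
    size: int                 # attribute preserved from the migrated file
    mtime: float              # attribute preserved from the migrated file
    secondary_location: str   # link to the file in the secondary file server


def migrate(name, data, mtime, primary, secondary, server="secondary-fs"):
    """Move file data to secondary storage and leave a stub on primary."""
    location = f"{server}:/{name}"
    secondary[location] = data
    primary[name] = StubFile(name, len(data), mtime, location)
    return primary[name]


def read_file(name, primary, secondary):
    """Read a file, following the stub link when the data has been migrated."""
    entry = primary[name]
    if isinstance(entry, StubFile):
        return secondary[entry.secondary_location]
    return entry
```

In this sketch the client always addresses the primary storage by file name; whether the data is resident or reached through the stub link is hidden inside `read_file`.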
When a file on primary storage is replaced by a stub file, the data contained in the file can be archived to secondary storage in its original form, or it can be transformed into a different form that is not directly accessible to the clients of the file server. Compression and encryption are two examples of such data transformations.
In one example, File-level Redundant Data Elimination (F-RDE) permits a file server to increase file storage efficiency by eliminating redundant data from the files stored in the file system. It gives the file server the ability to compress files and to store only a single shared instance of the data for files that are identical, on a per-file-system basis. The process of eliminating duplicate copies of the same file data and compressing all data that is transferred to the F-RDE store is called space reduction, and files transferred to the F-RDE store in that form are called space-reduced files. Encryption is another way in which the data associated with a file can be transformed such that it is stored in a form that the client cannot use directly.
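The space reduction described above can be illustrated with a minimal sketch. It assumes that identical files are detected by a SHA-256 digest of the whole file content and that compression is done with zlib; neither choice is stated in the description above, and the class name `RdeStore` and its methods are hypothetical.

```python
import hashlib
import zlib


class RdeStore:
    """Hypothetical per-file-system store of space-reduced file data."""

    def __init__(self):
        self.blobs = {}   # content digest -> compressed file data
        self.files = {}   # file name -> content digest

    def space_reduce(self, name, data):
        """Compress the data and keep only one instance per identical content."""
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blobs:
            # First file with this content: store the compressed data once.
            self.blobs[digest] = zlib.compress(data)
        self.files[name] = digest
        return digest

    def read(self, name):
        """Reverse the transformation so the client sees the original bytes."""
        return zlib.decompress(self.blobs[self.files[name]])
```

Because the stored form is compressed and shared, a client cannot use it directly; every read has to pass through a step like `read` that restores the original data.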
When a client of a file server attempts to read an archived or transformed file, the file server recalls the requested file data from secondary storage according to one of various policies supported by the policy-based file migration. Under one such partial recall policy, when a read request from a client is for a block of data in the middle of the file, the file server recalls all of the data contained in the file from the beginning of the file to the offset in the file that is requested by the client. When a write request comes from the client for an archived file or a transformed file, the file server reads the entire file back to the primary file system before allowing the write request to complete.
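The amount of data recalled under these policies can be sketched as follows. The function names are hypothetical, and the sketch assumes that a read recall extends through the end of the requested block, which the description above leaves open.

```python
def recall_range_for_read(offset, length, file_size):
    """Byte range [start, end) recalled for a read under the partial recall policy.

    Data is recalled from the beginning of the file up to the end of the
    requested block (an assumption of this sketch), capped at the file size.
    """
    end = min(offset + length, file_size)
    return (0, end)


def recall_range_for_write(file_size):
    """Byte range [start, end) recalled for a write: the entire file."""
    return (0, file_size)
```

For instance, under these assumptions a 1,024-byte read at offset 4,096 of a 10,000-byte file recalls the first 5,120 bytes, and any write to that file recalls all 10,000 bytes regardless of how few bytes are being written.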
Access to archived files or transformed files in the manner described above is considerably slower than access to files in primary storage. As a result, reading data from secondary storage or from a transformed file suffers from high latency. Additionally, writing data to an archived file or a transformed file causes unnecessary space consumption because the entire file data is recalled to primary storage before the write operation can complete.
The storage technology described above, in combination with a continuing increase in disk drive storage density, file server processing power, and network bandwidth at decreasing cost, has provided network clients with more than an adequate supply of network storage capacity at affordable prices. Reducing the time it takes to read data from or write data to an archived or transformed file, and reducing the space required to write data to such a file, would be an advancement in the data storage computer-related arts. This is becoming increasingly important as the amount of information being handled and stored grows geometrically over short time periods and such environments add more file systems and data at a rapid pace.