Changing the representation of data in a file can offer several advantages. For example, compressing infrequently used files can save significant storage space, and encrypting an important file can protect secret information. However, the responsibility for remembering that some files require decompression, decryption, or some other kind of special handling before they can be used for normal reading and writing rests with the user. Failing to perform the special handling properly wastes time. For example, if a user launches a process that reads a file without first decompressing the file, the resulting error conditions must be cleared before processing can continue.
Another area involving files that require special handling is file migration. File migration addresses the problem of how best to use multiple data repositories, for example on networks of workstations. Users frequently run out of space on the data repositories connected to their computers and then must spend time deleting less frequently used files, or, worse, are unable to add significant amounts of important data to their systems because of insufficient space. In some situations, data will be lost because available storage space has dipped below a critical level at the same time a user is attempting to add data to the system, and the system has no means of recovery other than discarding buffers and killing processes.
One approach to solving this problem is to store frequently used data on high-speed magnetic disk devices and to store less frequently used data on slower but cheaper types of storage media or on removable storage media such as tapes or optical disks. This approach is only a partial solution to the storage problem and creates additional problems. For example, someone, either a user or a system administrator, is required to spend time moving the files, and each user has to keep track of which of his or her files have been moved and where they have been moved.
One proposed method of managing data repositories for networks of workstations and fileservers automatically moves files from high-speed storage to low-speed storage. This process is called "migration" or "staging," and the repositories that receive the migrated files are called "migration stores." Using migration requires some means of keeping track of which data files have been migrated to migration stores and where those files are located. Some existing systems store this information in files that act as catalogs of migrated files. This approach suffers from a number of problems. These systems have difficulty recovering from some types of common system or disk errors, and their implementation requires modifications to file system utilities.
One such system is described in Israel et al., "Evolutionary Path To Network Storage Management," USENIX, Winter '91, pp. 185-198. That article describes a staging method wherein each workstation file system has a file that is a catalog of the migrated files. The catalog lists all migrated files by their inode numbers. This approach has two drawbacks. First, if the catalog file is lost, all migrated data files are potentially inaccessible to the workstation. Second, this approach requires rewriting some file system utilities. For example, some file system backup and restore utilities read or write directly from or to the raw disk. When a backup tape that was made using such a backup utility is restored, the inode numbers of most of the files usually change, making the catalog file useless and the migrated files inaccessible. Consequently, to use this migration approach, the backup and restore utilities must be modified.
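By way of illustration only, the failure mode described above can be sketched in Python. The catalog format, the location string "store-a/0001", and the function names are hypothetical and are not taken from the cited article. The restore is simulated by recreating the file from a copy, which gives the same path a different inode number.

```python
import os
import shutil
import tempfile


def build_catalog(paths, locations):
    """Map inode number -> migration-store location (hypothetical format)."""
    return {os.stat(p).st_ino: loc for p, loc in zip(paths, locations)}


def lookup(catalog, path):
    """Return the recorded location for the file at path, or None."""
    return catalog.get(os.stat(path).st_ino)


def simulate_restore(path):
    """Recreate the file from a copy, roughly as a restore from backup would."""
    copy = path + ".restored"
    shutil.copy2(path, copy)  # new file, so a new inode while the original exists
    os.replace(copy, path)    # the path now refers to the new inode


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "report.dat")
        with open(path, "w") as f:
            f.write("placeholder for a migrated file")
        catalog = build_catalog([path], ["store-a/0001"])
        before = lookup(catalog, path)
        simulate_restore(path)
        after = lookup(catalog, path)
        return before, after
```

In this sketch the lookup succeeds before the simulated restore and fails afterwards, even though the path and contents of the file are unchanged, because the catalog is keyed on a file system detail that the restore does not preserve.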
In other systems, such as BUMP and Novell 4.0, the underlying file system itself is modified. For example, the inode data structure can be modified to include an extra bit that identifies an inode as corresponding to a migrated file. This approach is highly unportable and requires rewriting extensive portions of the file system and file system utilities.
It is an object of the invention to provide a file management system for effectively dealing with compressed files, encrypted files, migrated files, or any other data files that require special handling, in a way that is transparent to users, to application programs, and to system utilities, such as backup and restore packages.
It is an object of the invention to provide a file migration system that is free from single points of failure and that does not depend on file system details, such as inode numbers, for critical operation.
It is a further object of the invention to provide a file migration system that contains information indicating which files have been migrated and information identifying the location of the contents of each original data file.
It is another object of the present invention to provide a file migration system wherein the migration system operating software determines whether a file on a primary data repository represents a migrated file simply by examining its attributes.
It is a further object of the invention to provide a file migration system wherein the migration system operating software determines whether a file represents a migrated file by first examining at least one normal attribute of the file and then, if necessary, examining the contents of the file.
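By way of illustration only, a two-stage check of this kind might be sketched in Python as follows. The choice of file size as the normal attribute, the fixed stub size, the marker bytes, and the function names are all assumptions made for the example; the text above does not specify them.

```python
import os

STUB_SIZE = 64              # hypothetical fixed size of a stub file, in bytes
STUB_MAGIC = b"MIGRATED:"   # hypothetical marker at the start of a stub


def make_stub(location: str) -> bytes:
    """Build stub contents that record where the real data was migrated."""
    body = STUB_MAGIC + location.encode("utf-8")
    if len(body) > STUB_SIZE:
        raise ValueError("location too long for stub")
    return body.ljust(STUB_SIZE, b"\0")  # pad to the fixed stub size


def migrated_location(path):
    """Return the migration-store location, or None for an ordinary file."""
    # Stage 1: examine a normal attribute (here, the size) without reading data.
    if os.stat(path).st_size != STUB_SIZE:
        return None
    # Stage 2: only if the attribute matches, examine the contents.
    with open(path, "rb") as f:
        data = f.read(STUB_SIZE)
    if not data.startswith(STUB_MAGIC):
        return None
    return data[len(STUB_MAGIC):].rstrip(b"\0").decode("utf-8")
```

The attribute test rejects most ordinary files cheaply, so the contents are read only for the few files whose size happens to match that of a stub.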
It is yet a further object of the present invention to provide a file management system supporting the use of extended attributes.