The present invention generally relates to a hierarchical storage system and, more particularly, to a method of moving files from a primary storage to a secondary storage in a hierarchical storage system.
As a mechanism for accessing data on a tape drive as files in a file system, the LTFS (Linear Tape File System), for example, is in practical use. In the LTFS, a file system is created by maintaining meta-information, such as the position and size of each data area that constitutes a file on a tape, as an index. By utilizing the LTFS, a tape may be used as a destination for storing files in a similar way to a storage device such as an HDD or a USB memory.
The LTFS may be utilized without making any modification to an application that currently uses an HDD. If an application designed to use an HDD is operated with the LTFS as is, however, access to files may take longer than expected, and the application may abort the access due to a timeout. In order to avoid this, instead of directly using files on the LTFS, the LTFS is sometimes configured as a part of hierarchical storage (HSM: hierarchical storage management), in which a high-speed storage such as an HDD or an SSD is used as a primary storage and a sequential access device such as a tape drive operating on the LTFS is used as a secondary storage.
In current HSM systems, it is common to use an HDD as a primary storage; applications save files on the HDD, and the files are moved to a secondary storage on the LTFS at a particular timing. When a file is moved from the primary storage to the secondary storage, a file called a stub, which indicates the existence of the file moved to the secondary storage, is created on the primary storage. When the stub is accessed, the corresponding file on the secondary storage is read and moved to the primary storage in order to respond to the access.
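The stub mechanism described above may be illustrated by the following minimal sketch. It is an in-memory model only, not an implementation of any particular HSM product: the class `HierarchicalStore`, the marker `STUB`, and the method names are hypothetical, and keeping the copy on the secondary storage after a recall is an assumption made for simplicity.

```python
STUB = object()  # hypothetical marker standing in for a stub file


class HierarchicalStore:
    """Toy model of a primary storage backed by a secondary storage."""

    def __init__(self):
        self.primary = {}    # file name -> data, or STUB after migration
        self.secondary = {}  # file name -> data held on the secondary storage

    def save(self, name, data):
        # Applications always write to the primary storage.
        self.primary[name] = data

    def migrate(self, name):
        # Move the file to the secondary storage and leave a stub behind.
        data = self.primary[name]
        if data is STUB:
            return  # already migrated
        self.secondary[name] = data
        self.primary[name] = STUB

    def read(self, name):
        # Accessing a stub triggers a recall from the secondary storage.
        data = self.primary[name]
        if data is STUB:
            data = self.secondary[name]
            self.primary[name] = data  # assumption: secondary copy is kept
        return data
```

In this model, the application calls only `save` and `read`; the recall in `read` is where the longer access time of the secondary storage would be incurred.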
For example, a file may be moved from the primary storage to the secondary storage when the usage rate of the HDD exceeds a specific threshold, or at a time specified by a user. In a case where files are moved on the basis of a threshold, all the files for which no stub has been created on the primary storage could be moved. In order to reduce the response time for reading a file from an application, however, only the minimum number of files needed to reduce the usage rate of the HDD below the threshold may be moved, so that as many files as possible are left on the primary storage.
In that case, it is common to utilize the LRU (least recently used) algorithm for selecting files, so that the least recently accessed file is selected from among the files for which no stub has been created on the primary storage. This method of selecting files by utilizing the LRU algorithm works well when a CD-R, a DVD, or the like is used as the secondary storage. If a tape and a tape drive are used as the secondary storage, however, the read time of a file differs according to the position of the file on the tape. In other words, some files may be read from the tape in a relatively short time, while others take a relatively long time to read.
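The threshold-driven LRU selection described above may be sketched as follows. This is an illustrative sketch under stated assumptions, not a definitive implementation: `FileEntry`, `select_for_migration`, and their parameters are hypothetical names, a stub is assumed to occupy negligible space, and the usage rate is taken as the total size of non-stub files divided by the capacity.

```python
from dataclasses import dataclass


@dataclass
class FileEntry:
    name: str
    size: int             # bytes occupied on the primary storage
    last_access: float    # time of the most recent access
    is_stub: bool = False  # True if the file was already migrated


def select_for_migration(files, capacity, threshold):
    """Pick files in LRU order until usage no longer exceeds the threshold.

    threshold is a usage rate between 0 and 1 (hypothetical convention).
    """
    candidates = [f for f in files if not f.is_stub]
    used = sum(f.size for f in candidates)
    target = threshold * capacity
    selected = []
    # Oldest last_access first, i.e. least recently used first.
    for f in sorted(candidates, key=lambda f: f.last_access):
        if used <= target:
            break  # enough space has been freed; leave the rest on primary
        selected.append(f)
        used -= f.size
    return selected
```

Note that this selection considers only the access time of each file; it does not take into account how long each file would later take to read back from the secondary storage.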
When writing data on a tape, a tape drive that meets the LTO standard or the like writes data in the longitudinal direction of the tape while moving the tape back and forth many times. When the tape is mounted on the tape drive for reading data, it takes a shorter time to access the beginning of a file that is written from a position close to the leading end of the tape in the longitudinal direction. On the other hand, it takes a longer time to access the beginning of a file that is written from a position close to the trailing end of the tape in the longitudinal direction. As a result, some files may be read from the tape in a relatively short time, while others take a relatively long time to read, as described above.
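The dependence of access time on longitudinal position may be illustrated by the following simplified model. It is a sketch under assumptions that do not come from any drive specification: the tape is assumed to be written in equal-length passes that alternate in direction, the seek time is assumed to be proportional to the distance from the leading end, and the function names, units, and parameters are hypothetical. Time for changing passes, acceleration, and mounting is ignored.

```python
def longitudinal_position(offset, wrap_length):
    """Distance from the leading end of the tape for a given write offset.

    offset      -- distance written along the recording path so far
    wrap_length -- length of one pass along the tape (same unit as offset)
    """
    wrap, within = divmod(offset, wrap_length)
    # Even passes run away from the leading end, odd passes run back to it.
    return within if wrap % 2 == 0 else wrap_length - within


def estimated_seek_seconds(offset, wrap_length, speed):
    """Rough time to reach the start of a file from the leading end."""
    return longitudinal_position(offset, wrap_length) / speed
```

Under this model, a file written late in the recording path may still begin close to the leading end if it falls near the end of a returning pass, so the order in which files were written does not by itself determine how quickly each file can be reached.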