As is apparent from the description below, the present invention is advantageously applicable to the migration of data, managed by the HSM (Hierarchical Storage Management) function in an environment in which data is hierarchically laid out on multiple storage devices by the HSM function, to another HSM execution environment. The present invention is advantageously applicable also to a device or software for migrating data, managed by the HSM function, to another storage device not included as a component of the HSM function. The following describes the background of the present invention, beginning with the outline of HSM.
When data is stored in storage devices, the storage management method, called HSM (Hierarchical Storage Management), is used with consideration for the following characteristics of storage devices.
Performance
Price
Capacity, and
Data saving and additional functions
This method combines multiple different storage devices hierarchically, selects the storage device on which each piece of data is optimally stored according to how the user and the system use the data, and automatically relocates data among the storage devices.
HSM is used primarily in a storage system where a relatively large amount of data, shared among multiple servers or users, is stored.
When HSM is used and multiple storage devices, each having different characteristics, are thereby well combined according to their purposes, it is known that the storage device cost and the storage device operation cost can be significantly reduced as compared with the case where only one storage device is used.
An example of a typical HSM configuration that reduces the cost of storage devices is a combination in which low-cost tape storage, with low random access performance but suitable for archiving data, is used as the lower-level storage of high-cost disk storage with high random access performance.
In such a configuration, data has the following general characteristics.
“Not all data stored in storage is accessed evenly; data is accessed less frequently as time elapses after the data is created, and the data access frequency depends on how recently the data was created.”
Considering this fact, the data relocation scheme (mechanism) described below is incorporated into the storage operation system.
The scheme is described as follows.
All newly created data is first stored in disk storage.
Depending upon the data access frequency, data more likely to be accessed again is kept in disk storage, while data less likely to be accessed is automatically relocated to tape storage.
Incorporating such a scheme into a storage management system enables a large-capacity, low-cost storage system to be built using small-capacity disk storage and large-capacity tape storage while maintaining a service level comparable to that of a storage system composed only of disk storage.
From the characteristics of the device configuration described above, there are two types of storage devices, which are called as follows:
A storage device corresponding to, and working like, disk storage is called “primary storage (1st tier storage)”.
A storage device corresponding to, and working like, tape storage is called “secondary storage (2nd tier storage)”.
Storage devices are selected for the primary storage and the secondary storage according to the management purposes.
Typically, HSM has the following two functions:
the function to hide the relocation of data, from primary storage to secondary storage, from a machine that accesses the data and, after the data relocation, to still provide the data access environment equivalent to that before the relocation, and
the function to select data to be relocated and to automatically relocate the data according to a predetermined rule.
The function to provide the data access environment equivalent to that before the relocation, which is the first function, is implemented by replacing a file, stored in the primary storage, by a file called a “stub file” when data is relocated from the primary storage to the secondary storage.
A stub file is a several-KB file containing data indicating the address in the secondary storage where the data is stored.
For example, a stub file includes the following file attribute information:
File name
File size
Access control information
This file attribute information is inherited from a file to be relocated.
A stub file is created so that file operations other than reading and writing the data relocated to the secondary storage can be performed on the stub file in the primary storage alone.
When a read/write request for data relocated on the secondary storage is issued from a data access source, the primary storage reads data from the secondary storage on behalf of the data access source and, thereby, completely hides, from the data access source, the relocation of data to the secondary storage.
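The behavior described above can be illustrated by a minimal in-memory sketch. It is not the implementation of any particular file system. The names `RegularFile`, `StubFile`, `PrimaryStorage`, and the address string format are hypothetical, and secondary storage is modeled as a plain dictionary.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class RegularFile:
    name: str
    size: int
    acl: str
    data: bytes


@dataclass
class StubFile:
    # Attributes inherited from the relocated file.
    name: str
    size: int
    acl: str
    # Address of the relocated data in secondary storage (hypothetical format).
    secondary_address: str


class PrimaryStorage:
    def __init__(self, secondary: Dict[str, bytes]) -> None:
        self.files: Dict[str, Union[RegularFile, StubFile]] = {}
        self.secondary = secondary  # address -> data, stands in for tape storage

    def stat(self, name: str) -> dict:
        # Attribute queries are answered from primary storage alone,
        # whether the entry is a regular file or a stub file.
        f = self.files[name]
        return {"name": f.name, "size": f.size, "acl": f.acl}

    def read(self, name: str) -> bytes:
        f = self.files[name]
        if isinstance(f, StubFile):
            # Read from secondary storage on behalf of the data access source.
            return self.secondary[f.secondary_address]
        return f.data
```

In this sketch the caller of `read` and `stat` uses the same calls for both kinds of entry, so the relocation is not visible to the caller.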
These functions are provided as an extended function specific to the file system of the primary storage (that is, an extended function provided in the file system but not accessed by a standard file system call). As long as a stub file is processed by a standard file system call specified by a file access request, the data access source can process the stub file using the function described above as if the stub file were a file equivalent to the file before the data relocation.
The function to automatically relocate data, which is the second function, is provided in:
Primary storage or
Management software integrally managing primary storage or secondary storage
This relocation function relocates data by performing the following sequence of processing.
(A1) Acquire attribute information on files, stored in the primary storage, at a time scheduled in advance by the system manager.
(A2) Extract files that satisfy the relocation condition specified by the system manager in advance.
(A3) Relocate the data of the extracted files to the secondary storage.
(A4) Create, in the primary storage, stub files for the relocated data.
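Steps (A1) to (A4) can be sketched as a single relocation pass. This is an illustration under stated assumptions only: the `Entry` record, the `migrate` function, the `sec:` address scheme, and the use of a callable as the relocation condition are all hypothetical, and both storage tiers are modeled as dictionaries.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Entry:
    name: str
    size: int
    last_access: float                        # seconds since the epoch
    data: Optional[bytes] = None              # None once the entry is a stub
    secondary_address: Optional[str] = None   # set only for stubs


def migrate(primary: Dict[str, Entry],
            secondary: Dict[str, bytes],
            condition: Callable[[Entry], bool]) -> List[str]:
    """Run one relocation pass and return the names of the relocated files."""
    # (A1) Acquire attribute information on files stored in primary storage.
    candidates = [e for e in primary.values() if e.data is not None]
    # (A2) Extract files that satisfy the relocation condition.
    selected = [e for e in candidates if condition(e)]
    moved = []
    for e in selected:
        # (A3) Relocate the data to secondary storage.
        address = f"sec:{e.name}"             # hypothetical address scheme
        secondary[address] = e.data
        # (A4) Leave a stub in primary storage; name and size are kept.
        e.data = None
        e.secondary_address = address
        moved.append(e.name)
    return moved
```

A condition such as `lambda e: e.last_access < cutoff` would correspond to a rule that relocates files not accessed since a given time. The actual rule is whatever the system manager specifies in advance.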
This relocation function also performs the following operation for data already stored in the secondary storage.
(B1) Extract files satisfying the condition defined in advance by the system manager.
(B2) Relocate the data back to the primary storage.
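Steps (B1) and (B2) can be sketched in the same spirit. The `recall` function and the dictionary layout of each entry are hypothetical. The sketch also assumes that the copy in secondary storage is removed once the data is back in primary storage, which the description above does not specify.

```python
from typing import Callable, Dict, List


def recall(primary: Dict[str, dict],
           secondary: Dict[str, bytes],
           condition: Callable[[dict], bool]) -> List[str]:
    """Move data of selected stubs back to primary storage; return their names."""
    recalled = []
    for name, entry in primary.items():
        # Only stubs, whose data lives in secondary storage, are considered.
        if entry.get("secondary_address") is None:
            continue
        # (B1) Extract files satisfying the condition defined in advance.
        if not condition(entry):
            continue
        # (B2) Relocate the data back to primary storage.
        address = entry["secondary_address"]
        entry["data"] = secondary.pop(address)  # assumption: copy is removed
        entry["secondary_address"] = None
        recalled.append(name)
    return recalled
```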
Note that the relocation operation between the primary storage and the secondary storage must determine, during its execution, which of the following types the file to be relocated is:
a file whose data has been replaced by a stub file, or
an ordinary file.
However, an inquiry issued via a standard file system call cannot determine the type of a file.
Nor can a standard file system call read address information on data, stored in secondary storage, from a stub file.
To solve this problem, a special interface is provided as an extended function specific to the file system of primary storage to allow the user to:
Acquire information used to determine whether the file is a stub file and
Read address information from the stub file.
In addition, this interface (a special interface provided as an extended function specific to the file system of primary storage) allows the user to:
Read and update file data without updating the time information attribute that would be changed by a usual data access request that is issued to read or update file data and
Set attribute information that cannot be set by a standard system call.
As described above, the following two functions
Function to provide a data access environment equivalent to that before the data relocation and
Function to automatically relocate data
use the extended function specific to the file system of primary storage to perform the following HSM operations.
Hide the data relocation from a data access source and
Automatically relocate data based on the relocation rule predetermined by a system manager
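The contrast between a standard file system call and the special interface can be modeled as follows. This is a purely illustrative in-memory sketch: `Node`, `StandardView`, `ExtendedView`, and all of their methods are hypothetical names and do not correspond to the API of any real file system.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    name: str
    size: int
    atime: float
    data: Optional[bytes] = None
    secondary_address: Optional[str] = None   # set only for stub files
    attrs: Dict[str, str] = field(default_factory=dict)


class StandardView:
    """What a standard file system call exposes: no stub information."""

    def __init__(self, node: Node) -> None:
        self._node = node

    def stat(self) -> dict:
        return {"name": self._node.name, "size": self._node.size,
                "atime": self._node.atime}

    def read(self, now: float) -> Optional[bytes]:
        # An ordinary read updates the access time attribute.
        self._node.atime = now
        return self._node.data


class ExtendedView(StandardView):
    """Hypothetical extended interface specific to the primary file system."""

    def is_stub(self) -> bool:
        return self._node.secondary_address is not None

    def stub_address(self) -> str:
        address = self._node.secondary_address
        if address is None:
            raise ValueError(f"{self._node.name} is not a stub file")
        return address

    def read_keep_times(self) -> Optional[bytes]:
        # Reads the data without updating the access time attribute.
        return self._node.data

    def set_attribute(self, key: str, value: str) -> None:
        # Sets an attribute that the standard view offers no way to set.
        self._node.attrs[key] = value
```

A relocation function would use the extended view, whereas an ordinary data access source would see only what the standard view returns.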
For files such as stub files, Patent Document 1 given below discloses a method for using HSM to smoothly move the files from one file server to another and, after the files are completely moved and the source file server does not exist any more, to reduce the time required for a client to access the files. That is, Patent Document 1 discloses a method for transferring a set of files, comprising the steps of:
receiving, by a destination file server, metadata related to the set of files and a stub file,
updating location components in the destination file server to maintain a list of repository nodes related to each file of the set of files,
replacing each stub file by a full content of a corresponding file related to the stub file, and
when a client request for a specific file in the set of files is received while stub files are being replaced and a full content of the specified file is not yet transferred, replacing the stub file corresponding to the specified file by the full content of the specified file,
wherein a task of replacing the stub file for the specified file has priority higher than that of a task of replacing the stub file for a file not requested.
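The prioritization in the method summarized above can be sketched with a priority queue. This sketch is not taken from Patent Document 1. The class `StubReplacer`, the two priority levels, and the dictionary standing in for the source of full file contents are assumptions made for illustration.

```python
import heapq
from typing import Dict, List, Optional, Tuple

REQUESTED, BACKGROUND = 0, 1   # a lower value means a higher priority


class StubReplacer:
    def __init__(self, source: Dict[str, bytes], names: List[str]) -> None:
        self.source = source                  # full contents held at the source
        self.content: Dict[str, bytes] = {}   # files already fully transferred
        self._seq = 0                         # tie-breaker keeps FIFO order
        self._queue: List[Tuple[int, int, str]] = []
        for name in names:
            self._push(BACKGROUND, name)

    def _push(self, priority: int, name: str) -> None:
        heapq.heappush(self._queue, (priority, self._seq, name))
        self._seq += 1

    def request(self, name: str) -> None:
        # A client asked for a file whose full content is not yet transferred.
        if name not in self.content:
            self._push(REQUESTED, name)

    def step(self) -> Optional[str]:
        """Replace one stub by its full content; return its name, or None."""
        while self._queue:
            _, _, name = heapq.heappop(self._queue)
            if name in self.content:
                continue  # already replaced by a higher-priority task
            self.content[name] = self.source[name]
            return name
        return None
```

With files `a`, `b`, and `c` queued in that order, a call to `request("c")` after `a` has been replaced causes the next `step()` to replace `c` ahead of `b`.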
[Patent Document 1]
Japanese Patent Kohyo Publication No. JP-P2005-538469A