1. Field of the Invention
The present invention relates to a hierarchical data storage device in an information management system that uses a library device for storage media on which data is stored through sequential access (sequential access storage media), such as magnetic tapes, and particularly to a method of extracting only valid data from a sequential access storage medium in order to rearrange the extracted valid data on another sequential access storage medium.
2. Description of the Related Art
In conventional information management systems, hard disk devices are mainly used because they allow large amounts of data to be accessed at high speed. To guard against loss of the data stored in hard disk devices, the data is also stored on magnetic tapes or the like for backup purposes. Several standards for such magnetic tapes have been defined. For example, in the standard named LTO (Linear Tape-Open), the cartridge that houses the magnetic tape is designed to be smaller than the cartridges of other magnetic tapes, and eight heads are used for reading and writing data, thereby achieving high-speed data access.
As hierarchical storage systems based on information life cycle management began to be realized, devices that use virtual magnetic tape library devices as a part of their hard disk devices started to be developed. Accordingly, methods have been devised in which magnetic tape media, instead of being used for the conventional purpose of backup, are used as logical volumes in a unit including a plurality of tape media.
Magnetic tape is a storage medium in which data is stored through sequential access. Accordingly, when updated data is written, the data as it was before the update becomes invalid but remains on the tape, so that areas arise that are occupied unnecessarily. When the amount of invalid data increases in a set of magnetic tapes, the area available for newly storing data decreases, and thus a greater number of magnetic tapes is required, which is problematic in view of cost.
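The effect described above can be illustrated with a minimal sketch in Python. The `Tape` class, the `invalid_bytes` function, the record layout, and the sizes are all hypothetical and chosen only for illustration; they do not describe any particular tape format. The sketch assumes that an update is written by appending a new copy and that the earlier copy cannot be overwritten in place.

```python
from dataclasses import dataclass, field


@dataclass
class Tape:
    """Hypothetical model of one sequential access medium (append-only)."""
    capacity: int
    records: list = field(default_factory=list)  # (key, size) in write order

    def used(self):
        # Valid and invalid records both occupy space on the tape.
        return sum(size for _, size in self.records)

    def append(self, key, size):
        # Data can only be added at the end; nothing is rewritten in place.
        if self.used() + size > self.capacity:
            raise ValueError("tape full")
        self.records.append((key, size))
        return len(self.records) - 1  # position of the new record


def invalid_bytes(tape, latest):
    """Space held by superseded records; `latest` maps key -> newest position."""
    return sum(size
               for pos, (key, size) in enumerate(tape.records)
               if latest[key] != pos)


tape = Tape(capacity=100)
latest = {}
latest["a"] = tape.append("a", 30)
latest["b"] = tape.append("b", 30)
latest["a"] = tape.append("a", 30)  # update of "a"; the old copy stays on tape

wasted = invalid_bytes(tape, latest)  # space occupied by invalid data
free = tape.capacity - tape.used()    # space left for new data
```

In this sketch a single update leaves 30 units occupied by invalid data while only 10 units remain free, which is the situation that forces additional tapes to be added.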
In this document, large capacity storage media such as the magnetic tapes mentioned above or the like for data writing/reading through sequential access are referred to as “sequential storage media”. In contrast, storage media such as the above hard disk devices for data writing/reading through random access are referred to as “random storage media”.
To solve this problem, a method called “garbage collection” has been proposed in which invalid data is detected on the basis of history information about the data recorded on a magnetic tape, and the data that remains after the invalid data has been removed (the valid data) is recorded on a new magnetic tape (Patent Document 1).
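The general idea of such a garbage collection process can be sketched as follows. This is only an illustration under stated assumptions and is not the procedure of Patent Document 1: the function `collect_garbage`, the record layout, and the format of the history information (a list of write events, oldest first, each giving a key and a position on the tape) are all hypothetical.

```python
def collect_garbage(source, history):
    """Return the valid records of `source` in their original order.

    source  : list of (key, payload) records in write order on the source tape
    history : list of (key, position) write events, oldest first
              (hypothetical format for the history information)
    """
    newest = {}
    for key, position in history:
        newest[key] = position  # a later event supersedes an earlier one

    # A record is valid only if it is the newest copy written for its key.
    return [record
            for position, record in enumerate(source)
            if newest.get(record[0]) == position]


source = [("a", "v1"), ("b", "v1"), ("a", "v2"), ("c", "v1"), ("b", "v2")]
history = [(key, pos) for pos, (key, _) in enumerate(source)]

new_tape = collect_garbage(source, history)  # records to write to the new tape
```

Here the first copies of "a" and "b" are detected as invalid, and only the three remaining records are carried over to the new tape.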
Patent Document 1
Japanese Patent Application Publication No. 2006-31446
When the above method in Patent Document 1 is implemented, one more magnetic tape in use (referred to as a constituent tape) can be used as the newly prepared magnetic tape. However, the new magnetic tape cannot be handled as a blank storage tape unless all the pieces of valid data stored on the magnetic tapes that are process targets become invalid; in other words, the magnetic tape cannot be handled as a blank tape even if only one piece of data remains valid. This makes it difficult to reduce the number of tapes in use, which is problematic. In addition, as a result of the garbage collection process, pieces of valid data that were sequentially stored on one magnetic tape come to be stored discretely across a plurality of magnetic tapes. This discreteness sometimes causes performance deterioration when sequential reading is executed on logical volumes.
Accordingly, it is necessary to realize an optimization mechanism for rearrangement of data performed in the garbage collection process for storage media storing data through sequential access.