The present invention relates to tape-based data storage, and more particularly, to storing data and an index in different partitions on a magnetic recording tape.
Data storage drives, such as data tape drives, record information to and read information from media, such as the data tape of a tape cartridge. Data storage drives are often used in conjunction with, for example, a data storage and retrieval system. One example of such a system is an automated data storage library with robotic picking devices, wherein removable media cartridges are selectively transported between storage cells and data storage drives in an automated environment. Herein, automated data storage library, data storage library, tape library system, data storage and retrieval system, and library may all be used interchangeably.
A digital storage tape may contain multiple files. Files and data stored on tape are written to the tape sequentially, in a linear fashion. Unlike hard drives or solid state nonvolatile storage such as nonvolatile memory (NVM), tape does not allow direct-access writing of data. In general, tape data can only be written linearly, in append-only mode. For example, the Linear Tape-Open (LTO) standard uses shingling when writing tracks in order to increase track density. However, due to shingling, an in-place rewrite of a file or a data block stored in one track would destroy what has been written in the neighboring track.
File management of data on tapes has traditionally been different from that of direct-access storage media. In the latter, file system data structures are commonly used, keeping information such as a hierarchical directory structure, file names, file attributes (e.g., size, access information, access permissions), and a list of the physical storage blocks containing the file contents. However, since such file system structures must be updated whenever any changes are made to files stored on the media, they are not well-suited to tapes, which do not allow the file system information to be rewritten in place. While tape-based file system implementations do exist, reading the file system information requires positioning the tape to the end of the recorded data, and any update requires writing a new copy of the entire set of file system structures at the end of the tape data.
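The cost of keeping file system structures on an append-only medium can be illustrated with a minimal sketch. The class and method names below (AppendOnlyTape, write_file, read_index) are hypothetical and model the tape as a simple list of records; they are not taken from any tape file system implementation.

```python
import json


class AppendOnlyTape:
    """Toy model of a tape: records can only be appended, never rewritten."""

    def __init__(self):
        self.records = []  # each record is a (kind, payload_bytes) tuple
        self.index = {}    # file name -> record position of the file data

    def write_file(self, name, data):
        # File data is appended at the end of the recorded data.
        self.records.append(("data", data))
        self.index[name] = len(self.records) - 1
        # The index cannot be updated in place, so a complete new copy
        # is appended after every change.
        self.records.append(("index", json.dumps(self.index).encode()))

    def read_index(self):
        # The only valid index is the last record, so reading it means
        # positioning to the end of the recorded data.
        if not self.records:
            return {}
        kind, payload = self.records[-1]
        assert kind == "index"
        return json.loads(payload)


tape = AppendOnlyTape()
tape.write_file("a.txt", b"hello")
tape.write_file("b.txt", b"world")
index_copies = sum(1 for kind, _ in tape.records if kind == "index")
```

In this sketch, two file writes leave two full index copies on the tape, and only the last one is current; the earlier copy is dead space that cannot be reclaimed without rewriting the tape.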
One common approach to managing data on tape requires a storage system to manage the tape while storing a separate index of the tape content on an unrelated disk device or other remote direct-access storage media. With this approach, however, the tape is no longer self-describing. Once the tape is taken out of the scope of the storage system, data stored on the tape cannot be accessed because the tape file index is left in the storage system's database. The longevity of the data is therefore limited by the longevity of the storage system, including all of its software, its databases, and the hardware it is running on. Hence, while the tape media may preserve the bits intact for years, there is no guarantee that the files will survive as long, since the data on tape may no longer be interpretable and restorable as files.
Another approach to storing files on tapes is via utilities such as TAR (Tape ARchive). The TAR program combines a set of source files into a single data set which is written to tape. The TAR file may include a header, which describes the TAR file contents and retains file metadata, and a body, which may include the source files concatenated together. The TAR program makes tapes self-describing, which avoids the dependency on an external index. However, TAR files are not appendable once written. An appended tape, therefore, may include several TAR files, and indexing such a tape requires multiple seeks and reads. There is also a risk of data loss if a TAR file header is corrupted or its format becomes obsolete: since the source files are concatenated in the data area, the TAR file header is required to determine the source file boundaries.
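The self-describing property of a TAR data set can be sketched with Python's standard tarfile module. The helper names (build_archive, list_archive) and the file contents are illustrative assumptions, and the archive is built in memory rather than written to an actual tape device.

```python
import io
import tarfile


def build_archive(files):
    """Combine in-memory source files into a single TAR data set (bytes)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)  # header metadata for this file
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_archive(archive_bytes):
    """Recover file names and sizes using only the archive's own headers."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as tar:
        return [(member.name, member.size) for member in tar.getmembers()]


archive = build_archive({"a.txt": b"hello", "b.txt": b"tape data"})
listing = list_archive(archive)
```

No external index is consulted when listing the archive; the names and sizes come from header information stored alongside the data, which is also why damaged header information makes the file boundaries unrecoverable.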