The present invention relates to the efficient use of metadata accompanying file writing to media.
More specifically, the present invention relates to the efficient use of metadata accompanying the writing of files to media by a file system when the metadata is provided in the form of a list.
A file system known as a linear tape file system (LTFS) has recently been developed as a new way of using magnetic tape.
In LTFS, software (S/W) works with hardware (H/W) to enable tape to be accessed via a file system interface. LTFS has open specifications. The LTFS format calls for a tape cartridge to be divided into two partitions: an index partition (IP) and a data partition (DP). The index partition (IP) and the data partition (DP) are established in separate locations on the tape medium. Metadata such as file allocation information is recorded in the index partition (IP). The main body of data is recorded in the data partition (DP). The following is background on the establishment of these partitions.
File allocation information is frequently updated. Because tape is a typical example of a sequential access device, data is always appended, and when there is only a single partition, the allocation information is typically recorded at the end of the tape. Therefore, mounting a tape cartridge can take a long time because the information recorded at the end always has to be retrieved.
In LTFS, when a tape cartridge is unmounted, the metadata is overwritten at the beginning of the index partition so that, during mounting, the metadata can always be retrieved from the index partition.
In addition to being written in the index partition, the metadata is written in the data partition. When metadata has been written to the data partition but the updated metadata has not been overwritten in the index partition (e.g., because of a sudden power outage), the metadata recorded in the data partition can be used to mount the tape again, even though the process takes more time.
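The choice of metadata at mount time described above can be sketched as follows. This is a minimal illustration in Python, not the LTFS implementation: the `Index` class, the `choose_index_for_mount` function, and the representation of the partitions as in-memory values are assumptions made for the example. In particular, the sketch is handed the data-partition metadata as a ready-made list, whereas locating it on a real tape is the slow part.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Index:
    """Hypothetical stand-in for one piece of metadata on the tape."""
    generation: int


def choose_index_for_mount(ip_index: Optional[Index],
                           dp_indexes: List[Index]) -> Index:
    """Pick the metadata to mount with.

    ip_index   -- metadata found in the index partition, or None if missing
    dp_indexes -- metadata found in the data partition (assumed already located)
    """
    latest_dp = max(dp_indexes, key=lambda i: i.generation, default=None)
    # Fast path: the index partition holds metadata at least as new as
    # anything in the data partition, so it can be used directly.
    if ip_index is not None and (
            latest_dp is None or ip_index.generation >= latest_dp.generation):
        return ip_index
    if latest_dp is None:
        raise ValueError("no metadata found on the tape")
    # Slow path: the index partition was not updated (for example, power was
    # lost before unmounting), so fall back to the data partition.
    return latest_dp


# Clean unmount: both partitions hold generation 3.
clean = choose_index_for_mount(Index(3), [Index(1), Index(2), Index(3)])
# Power loss: the index partition still holds generation 2.
stale = choose_index_for_mount(Index(2), [Index(1), Index(2), Index(3)])
```

In both cases the selected generation is 3; only the second case takes the slower route through the data partition.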
The index partition is typically configured to be able to store a small amount of data, and data to be retrieved at the time of mounting is written not only in the data partition but also in the index partition. As a result, such data can be quickly retrieved from the index partition.
FIG. 1 is a diagram showing an example of information recorded in a typical tape cartridge. In this example, a specially designated file (File B) and metadata (Metadata 3) are recorded in the index partition, and metadata (Metadata 1, Metadata 2, and Metadata 3) is recorded in the data partition, in addition to the other data (File A, File B, File C, and File D).
Here, Metadata 1 and Metadata 2 are old metadata. Because information is basically appended on the tape, that information is stored without being overwritten.
The timing for writing metadata to the data partition can be specified in LTFS by settings, such as at the time a file is closed or after a predetermined amount of time has elapsed, in addition to when explicitly specified by an application (for example, when FlushFileBuffers( ), which is a known API, is called). One reason for this is that when a large amount of data is written to the data partition without metadata also being written to the data partition, and a sudden loss of power occurs, all of the data written after the last recorded metadata is lost.
FIG. 2 is a diagram showing an example in which only data has been written to the data partition. FIG. 2 shows the information recorded on the tape when Files A, B, C and D have been written to the data partition after a tape cartridge has been formatted and mounted. Because Metadata 1 is the metadata added immediately after formatting, it does not contain information on any of the subsequently written files. If power is lost at this time, before the tape is unmounted, the system cannot determine where the data in Files A, B, C and D is stored, and so all of the files are lost.
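The situation in FIG. 2 can be illustrated with a toy model. The sketch below is an assumption-laden simplification: the tape is a Python list of `(kind, payload)` records, and `recoverable_files` is a hypothetical helper that returns the files listed in the last metadata record. It shows only why files written after the last metadata cannot be located.

```python
from typing import List, Tuple, Union

# A record is either ("file", name) or ("index", [names known to that index]).
Record = Tuple[str, Union[str, List[str]]]


def recoverable_files(tape: List[Record]) -> List[str]:
    """Return the files listed in the last index (metadata) record on the tape."""
    for kind, payload in reversed(tape):
        if kind == "index":
            return list(payload)
    return []  # no metadata at all: nothing can be located


# State in FIG. 2: Metadata 1 was written right after formatting,
# then Files A to D were appended without any further metadata.
unsynced: List[Record] = [
    ("index", []),
    ("file", "A"), ("file", "B"), ("file", "C"), ("file", "D"),
]

# The same tape after metadata covering the four files has been appended.
synced: List[Record] = unsynced + [("index", ["A", "B", "C", "D"])]

lost_case = recoverable_files(unsynced)    # no files can be located
saved_case = recoverable_files(synced)     # all four files can be located
```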
Applications that assume the use of USB memory or an HDD are typically not designed to call an API, such as FlushFileBuffers( ), after the data has been written. Therefore, in LTFS, in order to minimize data loss during a sudden power outage, recording metadata to the data partition every five minutes is recommended, and this is also typically the default setting.
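The three triggers mentioned above (an explicit request from an application, the closing of a file, and the elapse of a set interval) can be sketched as a small decision function. The names `SyncPolicy`, `should_write_index`, and the event strings are hypothetical and do not correspond to actual LTFS settings; the 300-second default simply mirrors the five-minute interval mentioned in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SyncPolicy:
    """Hypothetical settings controlling when metadata goes to the data partition."""
    on_close: bool = False                      # write when a file is closed
    interval_seconds: Optional[float] = 300.0   # periodic write; None disables it


def should_write_index(policy: SyncPolicy, event: str,
                       seconds_since_last_write: float) -> bool:
    """Decide whether metadata should be written for the given event.

    event is one of "flush" (explicit request by an application),
    "close" (a file was closed), or "tick" (a periodic timer fired).
    """
    if event == "flush":
        return True  # an explicit request is always honored
    if event == "close":
        return policy.on_close
    if event == "tick":
        return (policy.interval_seconds is not None
                and seconds_since_last_write >= policy.interval_seconds)
    return False


default_policy = SyncPolicy()
write_after_six_minutes = should_write_index(default_policy, "tick", 360.0)
write_after_one_minute = should_write_index(default_policy, "tick", 60.0)
```

With the default policy, the periodic check triggers a write after six minutes but not after one minute.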
Since metadata written to the data partition is not overwritten, a tape cartridge can be mounted using old metadata in LTFS. This function is called rollback, and rollback can be used to return to a snapshot taken at a previous point in time.
In the LTFS format, the metadata contains a generation number. So that the location of earlier metadata on the tape can be determined when rollback is performed, each piece of metadata also records a pointer to the previous metadata (more specifically, to the block number on the tape to which the previous generation has been written). This pointer is referred to as a “back pointer”.
FIG. 3 is a diagram showing generation numbers of metadata and back pointers. In the example shown in FIG. 3, the metadata for Generation No. 3 has a back pointer to the metadata for Generation No. 2, and the metadata for Generation No. 2 has a back pointer to the metadata for Generation No. 1. There is no back pointer in the metadata for Generation No. 1 because it is the initial metadata.
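The chain in FIG. 3 can be modeled as records linked by block number. The following sketch assumes a dictionary mapping block numbers to records; the block numbers 5, 20, and 42 and the names `IndexRecord` and `find_generation` are invented for illustration. It locates the block holding a requested generation, as a rollback would need to do.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class IndexRecord:
    generation: int
    back_pointer: Optional[int]  # block number of the previous generation, None for the first


def find_generation(blocks: Dict[int, IndexRecord], last_block: int,
                    wanted: int) -> Optional[int]:
    """Follow back pointers from the newest metadata until `wanted` is found.

    Returns the block number holding that generation, or None if it is not
    on the chain.
    """
    block: Optional[int] = last_block
    while block is not None:
        record = blocks[block]
        if record.generation == wanted:
            return block
        if record.generation < wanted:
            return None  # generations only decrease along the chain
        block = record.back_pointer
    return None


# Chain as in FIG. 3: Generation 3 -> Generation 2 -> Generation 1.
blocks = {
    5: IndexRecord(generation=1, back_pointer=None),
    20: IndexRecord(generation=2, back_pointer=5),
    42: IndexRecord(generation=3, back_pointer=20),
}

rollback_block = find_generation(blocks, last_block=42, wanted=2)
```

Here `rollback_block` is 20, the block to which Generation No. 2 was written in this invented layout.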
FIG. 4 is a diagram showing file marks before and after metadata. In the LTFS format, as shown in FIG. 4, delimiters called file marks are recorded before and after metadata when the metadata is recorded on the tape.
All metadata can be accessed on the tape using the following methods in accordance with these provisions: 1) Back pointers are traced in sequential order from the final metadata recorded on the tape; and 2) File marks are repeatedly located from the beginning of the tape, and the metadata between file marks is retrieved.
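The two methods can be compared on a toy tape. In the sketch below, the tape is a Python list in which the list position stands in for the block number, the string `"FM"` stands in for a file mark, and plain strings stand in for file data; all of these, along with the function names, are assumptions for illustration. A real drive would seek to file marks rather than inspect every block.

```python
from dataclasses import dataclass
from typing import List, Optional, Union

FILE_MARK = "FM"  # stand-in for a file mark on the tape


@dataclass(frozen=True)
class Index:
    generation: int
    back_pointer: Optional[int]  # position of the previous metadata, or None


Block = Union[str, Index]


def list_by_back_pointers(tape: List[Block], last_index_pos: int) -> List[int]:
    """Method 1: trace back pointers from the final metadata (newest first)."""
    found: List[int] = []
    pos: Optional[int] = last_index_pos
    while pos is not None:
        index = tape[pos]
        assert isinstance(index, Index), "back pointer must land on metadata"
        found.append(index.generation)
        pos = index.back_pointer
    return found


def list_by_file_marks(tape: List[Block]) -> List[int]:
    """Method 2: scan from the beginning and collect metadata that sits
    between two file marks (oldest first)."""
    found: List[int] = []
    for pos in range(1, len(tape) - 1):
        block = tape[pos]
        if (isinstance(block, Index)
                and tape[pos - 1] == FILE_MARK
                and tape[pos + 1] == FILE_MARK):
            found.append(block.generation)
    return found


tape: List[Block] = [
    FILE_MARK, Index(1, None), FILE_MARK,   # positions 0-2
    "File A", "File B",                     # positions 3-4
    FILE_MARK, Index(2, 1), FILE_MARK,      # positions 5-7
    "File C",                               # position 8
    FILE_MARK, Index(3, 6), FILE_MARK,      # positions 9-11
]

newest_first = list_by_back_pointers(tape, last_index_pos=10)
oldest_first = list_by_file_marks(tape)
```

Both methods find the same three generations, in opposite orders. In either case the work grows with the amount of metadata on the tape, since each piece of metadata has to be visited.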
In LTFS, the first and second methods are typically used to list, for the user, all of the metadata written in the data partition, and an interface is provided that allows the user to select the generation to be used in a rollback process.
FIG. 5 shows an example of a rollback generation selection screen and interface (or metadata selection screen in some implementations of LTFS). When the amount of metadata recorded on the tape is large, the first and second methods can require a lot of time to display the list of metadata. When metadata is written to the data partition on a regular basis to prepare for a sudden power outage, there can be more metadata than the user expects. This can cause problems when rollback is performed: the process can take much longer than expected, and it can be difficult to determine which generation of metadata should be selected.