1. Field of the Invention
The present invention relates generally to computers, and more particularly to mechanisms for writing data to a tape medium in a computing environment.
2. Description of the Related Art
The present invention relates to an apparatus and method for writing data to a tape medium. Particularly, the present invention relates to an apparatus and method for writing reception data received from a higher-level apparatus to a tape medium in response to a synchronization instruction for writing the reception data to the tape medium.
A tape drive for writing data to a tape medium such as a magnetic tape is generally configured to temporarily accumulate the data in a buffer and to write the data to the tape medium from the buffer at a predetermined timing. Such writing from the buffer to the tape medium is called “synchronization of writing” or “buffer flush” (hereafter, simply referred to as “synchronization”).
When synchronizations are performed without stopping the tape medium, a large gap is formed on the tape medium between the data written in a preceding synchronization and the data written in a subsequent synchronization. As a result, the recording region of the tape medium is wasted. Accordingly, a backhitch must be performed so that the data in the subsequent synchronization is written without a large gap after the data written in the preceding synchronization. A backhitch is an operation of decelerating the tape medium to a temporary stop and rewinding it to the position where writing is to be performed. The synchronization therefore takes a long time because of this backhitch.
In this respect, Recursive Accumulating Backhitchless Flush (RABF) has been proposed as a technique for avoiding such situations, for example in U.S. Pat. No. 6,856,479 and U.S. Pat. No. 6,865,043. As described in these documents, RABF may be understood as follows. Upon receiving a synchronization request, a tape drive first writes the data that is accumulated in a buffer and not yet written to the tape medium to a temporary recording region (ABF wrap) reserved on the tape medium, while keeping the tape medium running. This writing is a buffer flush without backhitch (Backhitchless Flush), which is free from the problem of a gap between data written in a previous synchronization and data written in a subsequent synchronization. Meanwhile, the tape drive recursively performs the operations of accumulating data in the buffer and rewriting the data to a normal recording region (normal wrap) when no free space is left in the buffer or in the temporary recording region.
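The RABF behavior described above may be illustrated by the following minimal sketch. It is a simplified model and not the implementation of the cited patents. The class name RabfDriveModel, its methods, the counting of capacities in records rather than bytes, and the counting of one repositioning per rewrite to the normal wrap are all assumptions made only for illustration.

```python
class RabfDriveModel:
    """Toy model of RABF (hypothetical names; capacities counted in records)."""

    def __init__(self, buffer_capacity, abf_capacity):
        self.buffer_capacity = buffer_capacity
        self.abf_capacity = abf_capacity
        self.buffer = []       # records accumulated until rewritten to the normal wrap
        self.unsynced = 0      # index of the first buffered record not yet on the ABF wrap
        self.abf_wrap = []     # temporary recording region
        self.normal_wrap = []  # normal recording region
        self.repositions = 0   # assumption: one repositioning per rewrite

    def write(self, record):
        """Accept a record from the host; rewrite first if the buffer is full."""
        if len(self.buffer) >= self.buffer_capacity:
            self._rewrite_to_normal_wrap()
        self.buffer.append(record)

    def synchronize(self):
        """Flush not-yet-written records to the ABF wrap without a backhitch."""
        pending = self.buffer[self.unsynced:]
        if len(self.abf_wrap) + len(pending) > self.abf_capacity:
            # No free space in the temporary region: rewrite everything instead.
            self._rewrite_to_normal_wrap()
            return
        self.abf_wrap.extend(pending)
        self.unsynced = len(self.buffer)

    def _rewrite_to_normal_wrap(self):
        """Rewrite the accumulated records to the normal wrap and free the ABF wrap."""
        if not self.buffer:
            return
        self.repositions += 1
        self.normal_wrap.extend(self.buffer)
        self.buffer.clear()
        self.abf_wrap.clear()
        self.unsynced = 0


drive = RabfDriveModel(buffer_capacity=4, abf_capacity=3)
for record in ["a", "b", "c"]:
    drive.write(record)
    drive.synchronize()          # three backhitchless flushes to the ABF wrap
flushed_without_reposition = list(drive.abf_wrap)
drive.write("d")
drive.synchronize()              # ABF wrap is full, so the data is rewritten
```

In this model the first three synchronizations complete without any repositioning, and only the fourth, which finds the temporary region full, triggers a single rewrite of all accumulated records to the normal wrap.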
As described above, in RABF, no backhitch is required between one synchronization and the subsequent synchronization. Thus, the time required for the synchronization can be reduced. A drastic improvement in performance can be achieved particularly when synchronization requests are frequent relative to the amount of data.
Meanwhile, in an enterprise tape drive (such as IBM® model 3592) and in a tape drive in compliance with LTO standards, variable-length data sent from a host is reorganized so as to be written to a tape medium in units of datasets having a fixed length. Generally, in this case, the data is written simultaneously to multiple tracks (for example, 16 tracks). In other words, 1/16, for example, of the content of the dataset is written to each track. Furthermore, the dataset includes an error correction code and the like in addition to the data sent from the host. A product code of a C1 code and a C2 code is used as the error correction code. The C1 code is for correcting random errors in units of bytes in writing and reading, whereas the C2 code is for correcting burst errors caused by write elements of the tape drive or by a defect on the tape. Moreover, the data is interleaved in the dataset, that is, arranged in a non-contiguous way. Thus, even when approximately 20% of the data is lost in any area in the dataset, the entire data can be recovered from the remaining data.
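The distribution of one dataset over multiple tracks may be illustrated by the following minimal sketch. Round-robin striping of bytes is used here only as a simple stand-in for the actual on-tape layout, which is defined by the respective format and is not reproduced here; the function names stripe and unstripe, the 1024-byte example dataset, and the omission of the C1 and C2 codes are assumptions made only for illustration.

```python
def stripe(dataset: bytes, tracks: int = 16) -> list:
    """Distribute a fixed-length dataset round-robin over `tracks` tracks."""
    return [dataset[i::tracks] for i in range(tracks)]


def unstripe(parts: list) -> bytes:
    """Reassemble the dataset from its per-track parts."""
    tracks = len(parts)
    out = bytearray(sum(len(part) for part in parts))
    for i, part in enumerate(parts):
        out[i::tracks] = part  # put each track's bytes back at their original offsets
    return bytes(out)


dataset = bytes(range(256)) * 4          # hypothetical 1024-byte dataset
parts = stripe(dataset)                  # 16 parts, each 1/16 of the dataset
restored = unstripe(parts)
```

Because neighboring bytes of the dataset land on different tracks, a localized defect on one track affects bytes that are far apart in the dataset, which is the property that an interleaved layout combined with an error correction code relies on.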
However, recent improvements in recording density have reduced the recording length per unit of data, so that situations in which a part of the data of the dataset cannot be read due to a defect or dust on the tape medium occur more frequently. As a countermeasure against unreadable data that cannot be recovered even by use of the error correction code, the capacity of the dataset tends to increase from one generation to the next.
Meanwhile, there is a known technique for transferring only those sub-units whose data has been corrected for error when returning the corrected data from error correction means to an external buffer. However, as the capacity of the dataset becomes larger as described above, the time for writing the dataset becomes a bottleneck in RABF if the transaction size, that is, the amount of data written from the host in a period between two consecutive synchronizations, is smaller than the size of the dataset. Specifically, it is known that more than half of the time required to write the dataset in RABF is purely proportional to the number of bits included in the dataset. For example, if the transaction size is 4 KB, a whole dataset in which the rest of the bytes are padded is written to the tape medium. In other words, in a generation having a 403 KB dataset size, writing 403 KB of data is sufficient. However, in a generation having a 1.6 MB dataset size, 1.6 MB of data is to be written to the tape medium. Accordingly, there is a problem that the generation having the 1.6 MB dataset size cannot achieve performance equal to that of the generation having the 403 KB dataset size. Such a problem may occur not only in RABF but also in normal writing. There is currently no effective means to address this problem.
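The arithmetic of the example above may be worked through with the following minimal sketch. The function name bytes_written_per_sync and the use of decimal units (1 KB = 1,000 bytes, 1 MB = 1,000,000 bytes) are assumptions made only for illustration; the transaction size and dataset sizes are the ones stated above.

```python
import math

KB = 1_000        # assumption: decimal kilobyte
MB = 1_000_000    # assumption: decimal megabyte


def bytes_written_per_sync(transaction_bytes: int, dataset_bytes: int) -> int:
    """Bytes written when every synchronization writes whole, padded datasets."""
    datasets = max(1, math.ceil(transaction_bytes / dataset_bytes))
    return datasets * dataset_bytes


transaction = 4 * KB
small_generation = bytes_written_per_sync(transaction, 403 * KB)
large_generation = bytes_written_per_sync(transaction, 1_600 * KB)  # 1.6 MB

# Fraction of each written dataset that is actual host data.
small_payload_ratio = transaction / small_generation
large_payload_ratio = transaction / large_generation

# How many times more bytes the larger generation writes per synchronization.
growth = large_generation / small_generation
```

Under these assumptions, the same 4 KB transaction causes roughly four times as many bytes to be written in the 1.6 MB generation as in the 403 KB generation, while the share of host data in each written dataset falls from about 1% to 0.25%.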