It is known in the prior art to provide a data storage device externally to, or integrated within, a host computer device, for the purpose of backing up data and systems stored on the host computer. Typically, data is sent from a host computer device to a tape drive unit, and the data is stored to a tape data storage medium provided in a cartridge which is removable from the tape drive unit.
Referring to FIG. 1 herein, a known tape drive device 100 receives data from a host computer device 101, which may be networked to a plurality of other computers.
In general, data transferred from a host computer to a tape drive unit is ‘bursty’, that is, it is transmitted in chunks of data, followed by periods of no data. The data chunks are in general of variable and unpredictable length. Since the tape drive contains a tape transport mechanism which is electro-mechanical, and involves a tape data storage medium travelling past a read/write head, stopping and starting of the tape drive mechanism is best minimised or avoided for the following reasons.
Firstly, excessive stopping and starting of the tape drive mechanism reduces the reliability of the mechanism over time.
Secondly, stopping and starting of the tape drive mechanism requires re-positioning of the tape relative to the read/write head, which is time consuming, and therefore reduces the rate at which data can be written to the tape data storage medium, particularly for linear tape drives.
In order to achieve optimum data throughput from the tape drive mechanism, the tape must be kept ‘streaming’, that is kept moving past the read/write head. To keep the tape data storage medium streaming past the read/write head, bursty data received from the host computer device is read into a buffer, which temporarily stores the data, removing some of the burstiness of the data. Continuous data exits the buffer at a more constant data rate determined by the rate at which the data can be written to the tape. In prior art tape drive devices, the rate determining step in writing data to a data storage medium is the relatively low rate at which data can be written from a write head to the tape. Although there is a problem of keeping the tape streaming when there are long periods of no data arriving from the host, the existence of buffers helps to isolate the process of writing data to tape from the burstiness and drop outs in the incoming data stream from the host. However, if the average data rate received from the host computer drops below a rate at which the data continuously exits the buffer and is written to tape, then the buffer empties, and the tape must be stopped, repositioned to a position prior to a last data written, and then restarted once more data is available to fill the buffer of the tape drive unit.
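By way of illustration only, the smoothing effect of the buffer described above may be sketched as a simple discrete-time model. The function name simulate_buffer, the time-step granularity and the numeric values are assumptions made for this sketch and do not appear in any described device; overflow is handled by capping, whereas a real drive would hold off the host.

```python
def simulate_buffer(arrivals, drain_per_tick, capacity):
    """Model a write buffer fed by bursty host data.

    arrivals       -- data units received from the host in each time step
    drain_per_tick -- data units written to tape per time step while streaming
    capacity       -- buffer size in data units
    Returns (occupancy after each step, number of tape stoppages).
    """
    occupancy = 0
    streaming = False
    stoppages = 0
    trace = []
    for incoming in arrivals:
        # Simplification: arrivals beyond capacity are capped; a real drive
        # would instead hold off the host until space is available.
        occupancy = min(capacity, occupancy + incoming)
        if occupancy > 0:
            streaming = True
            occupancy -= min(occupancy, drain_per_tick)
        elif streaming:
            # The buffer ran dry while the tape was moving: count one stoppage.
            streaming = False
            stoppages += 1
        trace.append(occupancy)
    return trace, stoppages


trace, stoppages = simulate_buffer([4, 4, 0, 0, 0, 0, 4], drain_per_tick=2, capacity=8)
print(trace, stoppages)
```

In this hypothetical run the buffer keeps the tape streaming through two time steps in which no data arrives, but the third empty step finds the buffer dry and one stoppage is counted.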
To address the problem of tape stoppage and repositioning, there have been prior art systems developed which vary the speed of a tape past a write head, and thereby allow the tape to maintain streaming for a longer time period, without stoppage.
Commonly assigned U.S. Pat. No. 6,122,124, incorporated by reference herein, discloses an adaptive tape speed method, in which the problem of tape stoppages is alleviated by keeping the tape moving past a write head at a reduced tape speed to match the incoming data rate, thus giving a slower data rate but with the advantage of maintaining streaming of the tape device.
However, in this prior art adaptive tape speed method, drive electronics limitations dictate a limited range of operation for the speed of the tape, which may not be sufficient to accommodate the full range in variations of data arriving from the host. Data arriving from the host may have variations in data rate which exceed the range of write data rates which correspond to the speeds available, and at which data can be written to tape.
Referring to FIG. 2 herein, there is illustrated schematically a prior art method of controlling tape speed by measuring buffer occupancy. A buffer device of a tape drive unit receives input data from a host computer, and produces an exit data stream which is output to a tape write head mechanism. The buffer has an occupancy level 201 of data stored in the buffer of between 0% and 100% of the full data storage capacity of the buffer. Depending upon the data rate of bursty data received from the host device in relation to the rate at which data exits the buffer, the occupancy level of the buffer can vary between 0% and 100%. Data arrives from the host in bursts, fills up the buffer, and is output to the tape at a nominally constant data rate, which is interrupted when the buffer becomes empty. Interruptions of the exit stream of data from the buffer cause tape stoppage and re-positioning.
In the prior art adaptive tape speed method, the occupancy level 201 of the buffer is electronically monitored, and used as a control signal for determining tape speed past a write head.
FIG. 3 is an exemplary plot of tape speed past a write head, against time under various burst conditions of data received from a host computer device by a tape drive unit operating according to the known adaptive tape speed method.
Under normal operation, where data is being input from the host and filling up the buffer, and an instantaneous occupancy level of the buffer is above a first pre-determined limit, then the tape speed is controlled to be at its maximum value 300. However, if the data stream received from the host has a drop out of data, then the buffer continues to empty of data, but no further data is received by the buffer and the instantaneous occupancy level falls. When the occupancy level falls to a second pre-determined limit, this triggers a reduction in tape speed to a second level 301. Since the tape drive has write electronics which matches the data rate of data exiting the buffer to the tape speed, there is a corresponding reduction in output data rate from the buffer. This keeps the tape streaming at a lower write data rate, until the buffer fills up again. If the buffer empties even further, then further pre-determined levels trigger further reductions in tape speed 302. If the buffer becomes empty, then the tape must be stopped as indicated by level 303, which incurs the penalty of a time delay in repositioning the tape relative to the write head. Operation of the tape can resume at various tape speed levels, depending upon the amount of data received from the host and the occupancy level of the buffer.
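The relationship between buffer occupancy and tape speed described with reference to FIG. 3 may be sketched, purely as an illustration, as a threshold lookup. The threshold values, the relative speeds and the names SPEED_STEPS and select_tape_speed are hypothetical; the sketch is stateless and ignores any hysteresis between the limits used when the buffer is filling and when it is emptying.

```python
# Hypothetical (occupancy threshold, relative tape speed) pairs, highest first.
# Occupancy is a fraction of buffer capacity; speed is a fraction of maximum.
SPEED_STEPS = [
    (0.50, 1.00),  # corresponds to maximum speed, level 300
    (0.25, 0.75),  # first reduced speed, level 301
    (0.00, 0.50),  # further reduced speed, level 302
]


def select_tape_speed(occupancy):
    """Return a relative tape speed for a buffer occupancy in [0.0, 1.0]."""
    if not 0.0 <= occupancy <= 1.0:
        raise ValueError("occupancy must be between 0.0 and 1.0")
    if occupancy == 0.0:
        # Buffer empty: the tape must stop (level 303) and be repositioned.
        return 0.0
    for threshold, speed in SPEED_STEPS:
        if occupancy >= threshold:
            return speed
    return 0.0  # not reached for valid input; kept for completeness


for level in (0.8, 0.3, 0.1, 0.0):
    print(level, select_tape_speed(level))
```

Because the write electronics match the exit data rate to the tape speed, each reduction in speed in this sketch would correspond to a proportionally lower rate of data leaving the buffer.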
Conventionally, during a data dump, or a data backup operation, host computers have generally been able to provide data to a tape drive unit at a higher average data rate than the rate at which the data can be directly written to a tape data storage medium, even though there may be instances of tape stoppage where bursty data is received from the host.
It is known to compress incoming data received from a host computer device prior to writing the data to a tape data storage medium. Data compression provides two main advantages as follows:
Firstly, compression allows a greater amount of information to be stored on a tape data storage medium than would be possible if the data were stored directly in an uncompressed state.
Secondly, since the rate of data arriving from the host computer is generally higher than the rate of data which can be written to tape, compression allows a reduction in data rate written to the tape data storage medium, compared to the rate of data arriving from the host computer. The difference in data rate depends on the amount of compression which can be applied to the data. This assists the tape drive in keeping up with writing the data to tape as the data arrives from the host computer.
FIG. 4 is a block diagram of another prior art host computer and tape drive unit, wherein there is a difference in data rate between data transferred from the host computer to the tape drive unit, and data written to tape. Data transferred from an internal data storage device 400 of a host computer 401 is transferred across a connection 402, in this example at a rate of 60 Mbytes/s. The data arrives at the tape drive unit 403, and is received by a data compression engine 404 which compresses the data. Varying compression ratios are achieved depending upon the inherent compressibility of the incoming data. In the example shown, an average compression ratio of 2:1 is achieved, and data is written to the tape data storage medium at a data rate of 30 Mbytes/s.
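The arithmetic of the FIG. 4 example can be checked with a short sketch. The function name tape_write_rate is an assumption for illustration, and the calculation treats the compression ratio as a constant average, whereas the ratio actually achieved varies with the compressibility of the incoming data.

```python
def tape_write_rate(host_rate_mbytes_s, compression_ratio):
    """Rate at which compressed data must be written to tape, in Mbytes/s.

    compression_ratio is expressed as a single number, so 2:1 is passed as 2.
    """
    if compression_ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return host_rate_mbytes_s / compression_ratio


# Values from the FIG. 4 example: 60 Mbytes/s from the host at 2:1 compression.
print(tape_write_rate(60, 2))  # 30.0
```

With compression disabled (a ratio of 1), the same function returns the host rate unchanged, which shows why less compressible data leaves the tape drive with a higher rate to sustain.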
In prior art host computer and tape drive units, data compression has a beneficial effect in at least partially isolating the data rate of data written to tape, from the bursty data, at a higher data rate, arriving from the host computer.
Some host computer operating systems control the data compression ratio in the back up tape drive device via different device files and can disable compression if required. However, this is only done on a once and for all basis at the start of a data storage session. The compression ratio is not changed during the entire back up, regardless of ongoing performance of a data storage back up operation. The prior art host computer entity which can control compression has no visibility of the suitability of the data rate arriving at the tape drive entity which performs the compression, and so cannot set up the data compression ratio in the most effective way.
There is a general trend to increase the rate at which data is written from a write head to a tape in a tape drive unit. As the write data rate to tape increases towards the data transfer rate from the host to the tape drive unit, the buffer system becomes less effective at isolating the write operation from drop outs in data arriving from the host, and the problem of stoppages becomes more acute, with a higher incidence of tape stoppages occurring. This increase in stoppages occurs even where prior art methods, such as the adaptive tape speed method, are used.
As tape drives get faster and are capable of writing data to tape at a higher data rate, they do not necessarily represent the rate determining stage in a data storage system when performing data storage operations, for example data back ups. System performance is frequently limited by the ability of a host computer to supply data fast enough to keep the tape drive streaming. If the incoming data rate from a host computer drops below a minimum acceptable data rate, then the tape must be stopped, repositioned prior to the last data written, and then restarted once sufficient data is available from the host computer.
Once the host stops supplying data for an extended period and streaming of the tape stops, a delay of several seconds is incurred whilst the tape repositions itself, which is a far higher delay than a latent delay caused by the host in recommencing supply of bursts of data. Therefore, stoppages in streaming are to be avoided wherever possible, since the stoppages, when they occur, become the rate determining step in transfer of data from the host to the tape data storage medium.
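The effect of repositioning delays on overall throughput may be estimated with a simple sketch. The function name effective_rate and all numeric values, including the five second repositioning delay, are hypothetical figures chosen only to illustrate the reasoning above.

```python
def effective_rate(total_mbytes, write_rate, stoppages, reposition_seconds):
    """Average transfer rate in Mbytes/s, including repositioning delays.

    total_mbytes       -- amount of data written to tape
    write_rate         -- write rate while streaming, in Mbytes/s
    stoppages          -- number of tape stoppages during the operation
    reposition_seconds -- delay incurred by each stoppage
    """
    streaming_time = total_mbytes / write_rate
    stopped_time = stoppages * reposition_seconds
    return total_mbytes / (streaming_time + stopped_time)


# Hypothetical backup: 3000 Mbytes at 30 Mbytes/s with ten 5 second stoppages.
print(effective_rate(3000, 30, 10, 5))  # 20.0
print(effective_rate(3000, 30, 0, 5))   # 30.0
```

Under these assumed figures, ten stoppages add 50 seconds to a 100 second write and reduce the average rate by one third, which is the sense in which stoppages become the rate determining step.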