1. Field of the Invention.
This invention relates in general to data storage and processing, and more particularly to distributed storage servers.
2. Description of Related Art.
The implementation of new technology in magnetic tape products has meant that the density of data written to tape has increased by orders of magnitude in the last ten or fifteen years. The ability to record high density tapes, e.g., ten gigabytes or more on one physical volume, has led to reducing costs in physical tape hardware as well as in handling and management resources.
However, over the past five years, tape data set stacking products, i.e., software solutions to increase tape utilization, have evolved in response to the customer requirement for more efficient ways to manage the information stored on tape. To make fuller use of this increased capacity, a virtual tape server (VTS) has been proposed. In a VTS, the hardware is transparent to the host and the user. The VTS requires little external management except through the library management element of the tape library into which the VTS is integrated.
In a hierarchical storage system, such as a VTS, intensively used and frequently accessed data is stored in fast but expensive memory. One example of a fast memory is a direct access storage device (DASD). In contrast, less frequently accessed data is stored in less expensive but slower memory. Examples of slower memory are tape drives and disk drive arrays. The goal of the hierarchy is to obtain moderately priced, high-capacity storage while maintaining high-speed access to the stored information.
In the VTS system, a host data interface, a DASD file buffer, and a number of tape devices are provided. When the host writes a logical volume, or a file, to the VTS, the data is stored as a resident file on the DASD. Although the DASD provides quick access to this data, it will eventually reach full capacity and a backup or secondary storage system will be needed. An IBM 3590 tape cartridge is one example of a tape device that could be used as a backup or secondary storage system.
When the DASD fills to a predetermined threshold, the logical volume data for a selected logical volume, typically the oldest, is removed from the DASD to free space for more logical volumes. If the selected logical volume has not already been appended to a tape cartridge or a physical volume, it is appended to a tape cartridge prior to being removed from the DASD. A file that has been appended to a tape and removed from the DASD is said to be "migrated." Optionally, any time prior to being removed from the DASD, a DASD file can be appended onto a tape cartridge with the original left on the DASD for possible cache hits. A file that has been appended to a tape cartridge while its original is left on the DASD is said to be "premigrated."
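The threshold behavior described above can be illustrated with a minimal sketch. It is not the VTS implementation; the class `DasdBuffer`, its methods, and the size units are hypothetical names chosen for illustration, and the sketch assumes that "oldest" means the volume written longest ago.

```python
from collections import OrderedDict


class DasdBuffer:
    """Toy model of a DASD file buffer (hypothetical, for illustration only)."""

    def __init__(self, threshold):
        self.threshold = threshold     # fill level that triggers removal
        self.resident = OrderedDict()  # volume name -> size, oldest first
        self.on_tape = set()           # volumes already appended to tape

    def used(self):
        return sum(self.resident.values())

    def premigrate(self, name):
        # Append to tape but leave the original on the DASD.
        if name in self.resident:
            self.on_tape.add(name)

    def write(self, name, size):
        """Store a volume, then remove the oldest volumes while at threshold."""
        self.resident.pop(name, None)  # a rewrite makes the volume newest
        self.resident[name] = size
        self.on_tape.discard(name)     # any earlier tape copy is now stale
        migrated = []
        while self.resident and self.used() >= self.threshold:
            oldest, _ = self.resident.popitem(last=False)
            self.on_tape.add(oldest)   # appended to tape if not already there
            migrated.append(oldest)
        return migrated

    def state(self, name):
        if name in self.resident:
            return "premigrated" if name in self.on_tape else "resident"
        return "migrated" if name in self.on_tape else "unknown"
```

In this sketch, writing volumes A, B, and C of 40 units each against a threshold of 100 leaves A "migrated" and B and C "resident"; calling `premigrate("A")` before the third write would first move A to the "premigrated" state.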
When the host reads a logical volume from the VTS, a cache hit occurs if the logical volume currently resides on the DASD. If the logical volume is not on the DASD, the storage manager determines which of the physical tape volumes contains the logical volume. The corresponding physical volume is then mounted on one of the tape devices, and the data for the logical volume is transferred back to the DASD from the tape (recall).
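The read path just described can be sketched as follows. This is an illustrative model only: the function `read_volume` and the dictionaries `dasd`, `tape_catalog`, and `tapes` are hypothetical stand-ins for the DASD contents, the storage manager's volume lookup, and the mounted cartridges.

```python
def read_volume(name, dasd, tape_catalog, tapes):
    """Return (data, outcome), where outcome is 'hit' or 'recall'.

    dasd:         volume name -> data currently resident on the DASD
    tape_catalog: volume name -> physical volume (cartridge) label
    tapes:        cartridge label -> {volume name -> data}
    All three structures are assumptions made for this sketch.
    """
    if name in dasd:
        return dasd[name], "hit"       # cache hit: served from the DASD
    cartridge = tape_catalog[name]     # find the physical volume
    data = tapes[cartridge][name]      # stands in for mount and tape read
    dasd[name] = data                  # recall: copy back to the DASD
    return data, "recall"
```

A second read of the same volume is then a cache hit, because the recall leaves a copy on the DASD.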
Tape servers may use an engine to move data between the DASD and tape drives in a virtual tape server (VTS) environment. For example, the IBM Virtual Tape Server (VTS) uses the IBM Adstar Distributed Storage Manager (ADSM) as its engine to move data between the DASD and IBM 3590 tape drives on the VTS. In such a system, the VTS uses the storage manager client on the DASD, e.g., the ADSM Hierarchical Storage Manager (HSM) client, and a distributed storage manager server attached to the tape drives to provide this function.
Since recalls take a long time relative to "cache hits," it would be preferable to have as many logical volume accesses as possible be cache hits. To accomplish this, a logical volume caching method is used.
Typically, the logical volumes in the cache are managed using a FIFO (first in, first out) or LRU (least recently used) algorithm. However, each of these methods exhibits one or more disadvantages: the methods do not discern patterns, the methods are not adaptive, or the methods do not improve upon the cache hit rate.
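To make the two conventional policies concrete, the following sketch counts cache hits for a sequence of logical volume accesses under each policy. The function `count_hits`, the fixed number of cache slots, and the access sequences are assumptions made for illustration; a real cache is sized in bytes rather than slots.

```python
from collections import OrderedDict


def count_hits(accesses, slots, policy):
    """Count cache hits for a list of volume names under 'FIFO' or 'LRU'."""
    cache = OrderedDict()  # front of the dict is the next eviction victim
    hits = 0
    for vol in accesses:
        if vol in cache:
            hits += 1
            if policy == "LRU":
                cache.move_to_end(vol)     # a hit refreshes recency under LRU
        else:
            if len(cache) >= slots:
                cache.popitem(last=False)  # evict the front entry
            cache[vol] = True
    return hits


reuse = ["A", "B", "A", "C", "A", "D", "A"]
cyclic = ["A", "B", "C", "A", "B", "C"]
print(count_hits(reuse, 2, "FIFO"), count_hits(reuse, 2, "LRU"))    # 2 3
print(count_hits(cyclic, 2, "FIFO"), count_hits(cyclic, 2, "LRU"))  # 0 0
```

With two slots, LRU gains one hit over FIFO on the sequence that keeps returning to volume A, but both policies score zero hits on the simple cyclic sequence, which illustrates the point that neither policy discerns or adapts to an access pattern.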
It can be seen that there is a need for a method and apparatus for improving caching for a virtual tape server.
It can also be seen that there is a need for a method and apparatus for improving caching for a virtual tape server which makes assumptions to increase cache hits, but which does not underperform an LRU algorithm when these assumptions prove to be incorrect.