1. Field of the Invention
This invention relates in general to data storage and processing, and more particularly to a method for recovering data from a damaged tape drive.
2. Description of Related Art
In hierarchical storage systems, intensively used and frequently accessed data is stored in fast but expensive memory. One example of a fast memory is a direct access storage device (DASD). In contrast, less frequently accessed data is stored in less expensive but slower memory. Examples of slower memory are tape drives and disk drive arrays. The goal of the hierarchy is to obtain moderately priced, high-capacity storage while maintaining high-speed access to the stored information.
One such hierarchical storage system is a virtual tape storage system (VTS), including a host data interface, a DASD, and a number of tape devices. When the host writes a logical volume, or a file, to the VTS, the data is stored as a file on the DASD. Although the DASD provides quick access to this data, it will eventually reach full capacity and a backup or secondary storage system will be needed. An IBM 3590 tape cartridge is one example of a tape device that could be used as a backup or secondary storage system.
When the DASD fills to a predetermined threshold, the data for a selected logical volume, typically the oldest, is removed from the DASD to free space for more logical volumes. The selected DASD file is then appended onto a tape cartridge, or physical volume, with the original left on the DASD for possible cache hits. When a DASD file has been appended to a tape cartridge and the original remains on the DASD, the file is said to be "premigrated".
When the host reads a logical volume from the VTS, a cache hit occurs if the logical volume currently resides on the DASD. If the logical volume is not on the DASD, the storage manager determines which of the physical tape volumes contains the logical volume. The corresponding physical volume is then mounted on one of the tape devices, and the data for the logical volume is transferred back to the DASD from the tape.
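The write and read paths described above can be illustrated with a minimal sketch. It is not the VTS implementation: the class ToyVTS, its method names, the tape identifier "TAPE01", and the rule that only premigrated volumes may be dropped from the DASD are all assumptions made for illustration.

```python
class ToyVTS:
    """Toy model of a DASD cache in front of tape cartridges (hypothetical names)."""

    def __init__(self, threshold):
        self.threshold = threshold   # resident volume count that triggers premigration
        self.dasd = {}               # logical volume -> data; insertion order tracks age
        self.tapes = {}              # tape id -> {logical volume: data}
        self.location = {}           # logical volume -> tape id holding the current copy
        self.premigrated = set()     # volumes present on both DASD and tape

    def write(self, name, data, tape_id="TAPE01"):
        self.dasd.pop(name, None)          # a rewritten volume becomes the newest
        self.dasd[name] = data
        self.premigrated.discard(name)     # any earlier tape copy is now stale
        if len(self.dasd) >= self.threshold:
            self._premigrate_oldest(tape_id)

    def _premigrate_oldest(self, tape_id):
        # Append the oldest volume not yet on tape; the DASD original is kept.
        for name, data in self.dasd.items():
            if name not in self.premigrated:
                self.tapes.setdefault(tape_id, {})[name] = data
                self.location[name] = tape_id
                self.premigrated.add(name)
                return

    def evict(self, name):
        # Assumption: a volume may leave the DASD only once it is safely on tape.
        if name not in self.premigrated:
            raise ValueError(f"{name} has no current tape copy")
        del self.dasd[name]
        self.premigrated.discard(name)

    def read(self, name):
        if name in self.dasd:
            return self.dasd[name], "cache hit"
        tape_id = self.location[name]      # which physical volume holds the data
        data = self.tapes[tape_id][name]   # stands in for mounting and reading the tape
        self.dasd[name] = data             # transfer the data back to the DASD
        self.premigrated.add(name)
        return data, f"recalled from {tape_id}"
```

In this sketch, a read of a volume that has been evicted from the DASD is served from tape, and a second read of the same volume is a cache hit because the recall restored it to the DASD.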
The VTS uses a storage manager to transfer data between the DASD and storage tapes. The storage manager is controlled by an automated administrator program. In order for the VTS to appear as a black box to the customer, it may run unattended for months. One of the most important requirements of a VTS is to ensure integrity and safety of data. When a physical tape volume exhibits a tape error or a permanent write error, error recovery is first attempted. If error recovery fails, this is a good indication that something is wrong with either the tape drive or the tape. In this situation, the suspect physical tape volume is deemed "read-only" by the storage manager. The tape volume is assumed to be potentially degraded, or at least not to have the same long-term reliability as a normal tape. Once a physical tape volume is considered suspect, the only way to ensure long-term reliability of the data already written to the tape is to read it from the tape and store it on another tape.
In this situation, the automated administrator may instruct the storage manager to copy all volumes on the suspect tape to another tape. However, some of the volumes on the suspect tape may no longer be the current version of the logical volume. If all volumes on the suspect tape were copied, substantial duplication of storage would result. Newer or identical versions of those files may exist in other locations, or may still be resident on the DASD.
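The duplication concern can be made concrete with a small sketch that sorts the volumes on a suspect tape into those that are stale, those whose current version is still resident on the DASD, and those for which the suspect tape holds the only current copy. The function name, the integer version numbers, and the three category labels are hypothetical and are not taken from any particular storage manager.

```python
def classify_suspect_tape(tape_contents, current_version, dasd_resident):
    """Classify volumes on a suspect tape (illustrative sketch only).

    tape_contents   -- {volume name: version stored on the suspect tape}
    current_version -- {volume name: latest version known to the system}
    dasd_resident   -- set of volume names whose current version is on the DASD
    """
    plan = {"stale": [], "on_dasd": [], "tape_only": []}
    for name, version in sorted(tape_contents.items()):
        if version < current_version.get(name, version):
            plan["stale"].append(name)      # a newer version exists elsewhere
        elif name in dasd_resident:
            plan["on_dasd"].append(name)    # current, but also held on the DASD
        else:
            plan["tape_only"].append(name)  # must be read back from the suspect tape
    return plan
```

Under these assumptions, a blanket copy of the tape would move every volume, whereas only the "tape_only" group strictly depends on reading the suspect tape.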
Another approach to preventing the loss of data from an unreliable tape is to use a backup system. One known computer backup system periodically copies volumes from the tape device to a backup tape device or other storage device. In a full backup, all files of the disk are copied to tape. This approach often makes the tape device that is being backed up inaccessible until the process is complete. In an incremental backup, only tape volumes that have changed since the previous backup are copied.
Then, if a tape becomes unreliable, the last version of each volume that was backed up to another storage device can be restored by mounting the storage device and copying the backup device's content to a new tape. However, one problem with periodic backups is that an error may occur after a new version has been written to the unreliable tape, but before a backup process has taken place.
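The difference between full and incremental backups, and the exposure window between backups, can be sketched as follows. The function names and the use of plain integers as modification times are assumptions chosen to keep the example short.

```python
def full_backup(volumes):
    """Select every volume, regardless of when it last changed."""
    return sorted(volumes)


def incremental_backup(volumes, last_backup):
    """Select only volumes modified after the previous backup."""
    return sorted(name for name, modified in volumes.items() if modified > last_backup)


def unprotected_at_failure(volumes, last_backup, failure_time):
    """Volumes written after the last backup but before the tape failed.

    These are the versions that a periodic backup cannot restore.
    """
    return sorted(
        name for name, modified in volumes.items()
        if last_backup < modified <= failure_time
    )


# Hypothetical catalog: volume name -> time of last modification.
volumes = {"V1": 10, "V2": 25, "V3": 40}
```

With a backup taken at time 20 and a failure at time 30, the incremental selection at time 20 would not yet include the version of V2 written at time 25, so that version is the one at risk.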
Another data protection scheme maintains a redundant set of the logical volumes that are stored on each tape. Each time a volume is written from the DASD to a tape device, a second copy of the volume could be mirrored on another tape. Then, if one tape device becomes unreliable, the storage manager may determine a list of volumes that were on the unreliable tape device, and then determine the location of the mirror copy corresponding to each of the volumes on the unreliable tape.
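The mirror lookup described above amounts to a simple catalog query. The sketch below assumes a hypothetical catalog that maps each logical volume to the pair of tapes holding its two copies; the function name and tape identifiers are illustrative only.

```python
def mirror_locations(catalog, bad_tape):
    """Find the surviving copy of every volume stored on an unreliable tape.

    catalog  -- {volume name: (tape id, tape id)} for the two mirrored copies
    bad_tape -- id of the tape that has become unreliable
    Returns {volume name: [ids of tapes that still hold a copy]}.
    """
    survivors = {}
    for volume, copies in catalog.items():
        if bad_tape in copies:
            survivors[volume] = [tape for tape in copies if tape != bad_tape]
    return survivors


# Hypothetical catalog with three mirrored volumes.
catalog = {"V1": ("T1", "T7"), "V2": ("T2", "T8"), "V3": ("T9", "T1")}
```

Every volume in this catalog occupies space on two tapes, which illustrates the storage cost of the scheme.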
Full copying of an unreliable tape, backup methods, and redundant storage methods all consume considerable space on tape devices. Methods for protecting volumes on tape devices are needed that do not use excessive amounts of storage space or require additional tape devices on the VTS.
It can be seen that there is a need for automated read-only volume recovery that takes into account the fact that files may be stored in more than one location within the virtual tape system so that files are not copied unnecessarily.
It can also be seen that there is a need for a system that determines the state of the file being recovered and how the recovery is performed, in addition to removing all data from the suspect tape and then ensuring that it is ejected from the VTS, so that data recovery is accomplished with minimal human intervention.