The present invention is directed to a disk-based backup storage system that can be seamlessly integrated with a tape backup system or the like and, more specifically, to a method of importing data into a virtual tape library.
Backing up computer data, restoring computer data, securing computer data and managing computer data storage (collectively referred to as data protection) requires complex and disparate technical and operational solutions. Data protection is the single most expensive storage administrative task.
One data protection strategy is to use a redundant array of independent disks (RAID) and disk mirroring technology to protect data. Unfortunately, disk mirroring only prevents data loss in the event of a hardware or power failure. Mirroring does not protect data from human error, such as the accidental deletion of portions of a document. On a disk mirrored system, once data has been deleted from the primary disk, the data is automatically deleted from the mirrored disk and is not retrievable.
To address the problem of human error and computer viral damage, backup systems have been designed that are file-based and track files for many generations. One typical form of data protection backup uses physical tapes to store data in tape libraries. Physical tape backup libraries provide the ability to restore current and historical data and to recover from a variety of forms of data loss.
Referring to FIG. 3, a typical physical tape library 12 is shown. Tape cartridge slots 14 provide storage slots for physical tapes 13. This physical tape library has 40 slots 14, some of which are shown containing physical tapes 13. Four tape drives 15, shown along the bottom of the physical tape library 12, can be used for reading from, and writing to, the physical tapes 13. Barcode labels 17 are typically used with physical tapes 13 to facilitate automated tape handling and tracking by the data protection application. The physical tapes 13 typically also have a human-readable version of the information coded in the barcode to allow manual selection and identification of the physical tapes 13.
A typical physical tape library 12 includes a built-in barcode reader which is used to read the barcode labels 17 on the physical tapes 13. Typical data protection applications keep track of data that is backed up on tape 13 by associating the data with a tape 13 having a particular barcode. By including a barcode reader, the physical tape library can identify a particular physical tape 13.
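The barcode-based tracking described above can be illustrated with a minimal sketch. It assumes a simple in-memory catalog; the class name `TapeCatalog`, its methods, and the barcode and backup identifiers are hypothetical and are not taken from any particular data protection application.

```python
from collections import defaultdict


class TapeCatalog:
    """Hypothetical catalog associating backed-up data with tape barcodes."""

    def __init__(self):
        # Maps a barcode label to the backup identifiers written to that tape.
        self._by_barcode = defaultdict(list)

    def record_backup(self, barcode, backup_id):
        """Note that the given backup was written to the tape with this barcode."""
        self._by_barcode[barcode].append(backup_id)

    def tapes_for(self, backup_id):
        """Return the barcodes of every tape holding the given backup, sorted."""
        return sorted(
            barcode
            for barcode, backup_ids in self._by_barcode.items()
            if backup_id in backup_ids
        )


catalog = TapeCatalog()
catalog.record_backup("A00012", "payroll-weekly")
catalog.record_backup("A00013", "payroll-weekly")
catalog.record_backup("A00013", "mail-daily")
print(catalog.tapes_for("payroll-weekly"))
```

In such a scheme, a restore request reduces to a lookup of the barcodes to be loaded, which the barcode reader of the library can then match against the tapes in its slots.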
Physical tape libraries 12 preferably include an entry/exit port 19. The entry/exit port 19 (shown in the upper left hand side of FIG. 3) provides a pathway for tapes 13 to be automatically moved into and out of the physical tape library 12. A tape 13 in the entry/exit port can be accessed by a human operator while the rest of the tapes 13 are secured within the physical library housing. Robotic mechanisms are used to move a tape 13 between the slot 14 and the entry/exit port 19.
To automate the mounting and unmounting of tapes into tape backup drives, many organizations use a robotically-controlled tape library. Actual usage of individual tape media is generally very infrequent. Backup jobs typically run at night during a period called the “backup window”. Typically, organizations use tape rotation schemes whereby the organization writes to daily tapes, weekly tapes and monthly tapes. Many of the tapes are sent off-site after being written to, and are not accessed again until either computer data must be restored or the computer data on the backup tape has expired (usually after some number of weeks, months or even years). Additionally, adding to the size of a tape library can be a complicated matter requiring the integration of additional tape libraries into the data protection application.
An essential component of a virtual tape library is its ability to work with removable data storage media. Removable data storage media is essential for offsite archival of data, for freeing up space in the virtual tape library and for interchanging data.
The importation of data into a virtual tape library from physical data storage devices is necessary for a virtual tape library to operate efficiently. Data frequently needs to be imported from removable data storage devices during the following operations: overwriting of the storage media (usually because the storage time has expired and the storage media is scheduled to be overwritten with new backup data); appending data to preexisting storage media; reading data from the physical storage media; making the contents of the data storage media more reliably available to the data protection application (e.g., while a physical tape is not fault tolerant, a copy of a physical tape maintained in a virtual tape library can be fault tolerant); conversion from legacy physical tapes to virtual tapes in preparation for generating new physical tapes using the same (refresh) or different (conversion) tape media type; optionally marking the data storage media as “obsolete” because a new virtual tape of the same name will be created; and marking the physical tape or a virtual tape of the same name as bad or obsolete.
Several problems exist when traditional virtual tape libraries import data from a storage media. The trigger signals and workflows for various physical tapes are different. Without a virtual tape library, the normal procedure for triggering the importing of data from physical tapes is for the user to query the data protection application to find out the labels of the tape(s) to be imported. With a traditional virtual tape library, the tapes the data protection application writes to are virtual and have different labels from the physical tapes that are tracked by preexisting data protection applications. The typical import procedure must therefore be carried out via the virtual tape library and not the data protection application. Additionally, a physical tape may contain some virtual tapes that a user does not want to import.
A second problem with traditional virtual tape libraries is that they only allow the importation of proprietary tapes. Traditional virtual tape library systems generate physical tapes that are written in the virtual tape library's proprietary format, which can only be read and interpreted by that particular virtual tape library system. The proprietary format adds a step to the data importing process: the data from the tape must first be loaded into the virtual tape library system before being passed to the data protection application. More importantly, the proprietary format cannot be read by the data protection application unless the virtual tape library system that originally created the tape is still being used at the time the tape is needed. This can create a problem when a tape is required for the restoration of data many years after the tape was created. Thus, virtual tape libraries that use a proprietary format for creating physical tapes can only import tapes that the virtual tape library has written.
Another problem with traditional virtual tape libraries is that the import of data into a virtual tape library can be slow when the import involves the copying of all the tape's data. This process can also be wasteful if all the data is not actually required by a user (for example when the tape is imported to be overwritten).
Clearly, what is needed is a method of importing data from a physical data storage device into a virtual tape library system that preferably does not require that the physical data storage device use proprietary coding and formatting. It would also be preferable that the method incorporate multiple copy modes such as completely copying the data on the physical device, copying a portion of the data on the physical device prior to overwriting the physical device, and operating in a pass-through mode to allow a data protection application to directly read from and/or write to the physical device. Finally, it would be preferable that the method be able to import data into a virtual tape library which can replace a physical tape library or act as a cache for a physical tape library for a preexisting data protection application.
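The three copy modes named above can be illustrated with a minimal sketch. It is an illustration under stated assumptions and not the claimed method: the names `ImportMode` and `blocks_to_copy`, and the idea that a partial copy covers a fixed number of leading blocks (`header_blocks`), are hypothetical.

```python
from enum import Enum, auto


class ImportMode(Enum):
    """Hypothetical labels for the three copy modes described in the text."""
    FULL_COPY = auto()      # copy all data on the physical device
    PARTIAL_COPY = auto()   # copy only a portion, e.g. before an overwrite
    PASS_THROUGH = auto()   # copy nothing; the application accesses the device directly


def blocks_to_copy(mode, total_blocks, header_blocks=1):
    """Return how many blocks an import would copy into the virtual tape library.

    Assumption: a partial copy takes only the first `header_blocks` blocks.
    """
    if mode is ImportMode.FULL_COPY:
        return total_blocks
    if mode is ImportMode.PARTIAL_COPY:
        return min(header_blocks, total_blocks)
    return 0  # PASS_THROUGH copies no data


for mode in ImportMode:
    print(mode.name, blocks_to_copy(mode, total_blocks=1000, header_blocks=4))
```

The sketch makes the cost difference concrete: a tape imported only to be overwritten need not pay for a full copy, which is the waste noted earlier for imports that copy all of a tape's data.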