This invention relates to storage of data on rewritable data storage media which is accessible in data storage libraries, and, more particularly, to providing access by at least one host to multiple copies of data volumes stored in a plurality of data storage libraries.
Data processing systems comprising at least one host typically require a large amount of data storage. If the data, typically stored as a data volume, is not immediately required by the hosts, for example if the data volume is infrequently accessed, the storage of the data volume may be on removable rewritable data storage media, such as magnetic tape or optical disk. Data storage libraries typically provide efficient access to large quantities of data volumes stored in removable data storage media, the media stored in storage shelves which are accessed by robots under the control of robot controllers. Due to the large amount of stored data, typically, a plurality of hosts make use of the same data storage library, and a plurality of data storage drives are included in the library to allow access by the hosts. A library manager, which may comprise the same processor as the robot controller, typically tracks each data volume and the data storage media on which it is stored, and tracks the storage shelf location of each data storage media.
Herein, a library manager, either with or without the robot controller, is defined as a "library controller".
Because access to the data volumes would be prohibited if the robot were to fail, many data storage libraries have dual robots. Also, such libraries often are equipped with dual power supplies to provide a level of redundancy in case of failure of one of the power supplies. Further, dual library controllers may be used, each operating one of the robots. Coassigned U.S. patent application Ser. No. 08/961,135, now U.S. Pat. No. 5,914,919, issued to Fosler et al., provides dual library controllers and dual robots and, upon the failure of one robot, quickly and automatically switches the active one of the library controllers to operate the second robot.
The dual robots must each use a common track or rail to access the storage shelves of the data storage library. If a failure causes the common track or rail to become unusable, for example, if a robot became stuck, the library would be unusable. A communication link between the host and library may fail, losing access to the data volumes. Similarly, if the entire library were to fail, for example, by a failure of the power connection to the library, the access to the data volumes would be prohibited until repairs were completed.
Individual data storage drives not in a library, but with human operators, would be able to have the operator hand carry a removable data storage media from a failing drive to another drive which is coupled to the same host. However, if the only library failed, no alternative drive would be available for mounting the removable data storage media, and physical access to the media may be difficult. Further, if the library is a "Virtual" library, temporarily storing data in memory or non-volatile cache before storing it in the removable data storage media, the temporarily stored data cannot be transferred from a failed library.
Duplicate libraries may be envisioned, but the hosts would have to separately provide the data volumes to each of the libraries and provide a tracking database, dramatically reducing efficiency. Perhaps only the more important data volumes would be duplicated, but each host would have to track the individual location of each data volume that was not duplicated, and track the data volumes which were duplicated.
It is an object of the present invention to provide dual data storage libraries and storage and tracking of data stored in the dual data storage libraries which is transparent to the hosts.
Disclosed are a data storage library system and a method for redundantly storing and accessing identifiable data volumes. A plurality of data storage libraries are provided, each having a library controller, a storage interface, rewritable data storage media, and at least one data storage drive for reading and/or writing on the data storage media. The data volumes are transferred, under the control of the library controller, between the storage interface and the data storage drive. The library controller provides a synchronization token directly associated with each data volume, the synchronization token comprising an updatable token.
A plurality of directors are provided, each separate from and coupled to the hosts and each separate from and coupled to each data storage library. A director is a data processor with interfaces, such as ESCON or SCSI, appropriate to the connections to the hosts and to coupled data storage libraries, but without a display, and comprises, for example, an IBM RS-6000 processor. Each director receives commands relating to identifiable data volumes, and each director responds to separate, partitioned access addresses addressed by the hosts. The responding director additionally responds to any accompanying data volume supplied by the addressing host, in turn supplying the command and accompanying data volume to all of the plurality of data storage libraries, and the responding director updates each synchronization token directly associated with the supplied data volume.
The synchronization tokens may comprise incrementable integers, which are updated by the responding director by incrementing each synchronization token directly associated with the supplied data volume. The responding director may increment each synchronization token directly associated with the same supplied data volume to the same integer value. The director may determine the integer value by comparing the previous integer value of each synchronization token directly associated with the supplied data volume, and setting the synchronization tokens to a value incremented beyond the most current integer value indicated by the comparison.
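The integer token update described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each library is modeled as a Python dictionary mapping a volume identifier to a (token, data) pair and that a missing copy has a token of 0; the names `next_token` and `write_volume` are hypothetical and do not appear in the disclosure.

```python
def next_token(previous_tokens):
    """Return a value incremented beyond the most current previous token."""
    return max(previous_tokens) + 1


def write_volume(libraries, volume_id, data):
    """Store the volume in every library and set all tokens to the same value.

    Each library is modeled (as an assumption) as a dict that maps
    volume_id -> (token, data). A library with no copy counts as token 0.
    """
    # Compare the previous token values held by each library.
    previous = [lib.get(volume_id, (0, None))[0] for lib in libraries]
    token = next_token(previous)
    # Supply the data volume to all libraries with the identical new token.
    for lib in libraries:
        lib[volume_id] = (token, data)
    return token


if __name__ == "__main__":
    lib_a = {"VOL001": (3, b"older copy")}
    lib_b = {"VOL001": (5, b"newer copy")}
    new_token = write_volume([lib_a, lib_b], "VOL001", b"updated data")
    print(new_token, lib_a["VOL001"][0], lib_b["VOL001"][0])
```

In this sketch, taking the maximum of the previous values before incrementing is what keeps the copies aligned even if they had drifted apart beforehand.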
Thus, in accordance with the present invention, the directors appear to the host as though there is a single library, and the directors have the capability to store duplicate copies of the data volume in the data storage libraries without involvement by the host. The currency of each data volume is tracked by means of the synchronization token, and the synchronization token is directly associated with the data volume, and is not tracked by the host and does not require a central tracking database.
Further, should one library become unavailable, the responding director may access the data volume at another of the libraries without involvement by the host. The director may update the data volume and the synchronization token at the other library, and, when the failed library becomes available and the data volume again is accessed, the responding director will determine that the synchronization tokens do not match, will provide the most current copy to the host, and will update the data volume that was not current, again without involvement by the host.
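The recovery behavior described above might be sketched as follows. This is an illustrative model only: libraries are again assumed to be dictionaries mapping a volume identifier to a (token, data) pair, an unavailable library is represented by `None`, and the function name `read_volume` is hypothetical.

```python
def read_volume(libraries, volume_id):
    """Return the most current copy and refresh any stale available copies.

    libraries: list whose entries are either a dict mapping
    volume_id -> (token, data), or None for an unavailable library
    (both representations are assumptions made for this sketch).
    """
    holders = [lib for lib in libraries
               if lib is not None and volume_id in lib]
    if not holders:
        raise LookupError(f"no available copy of {volume_id}")
    # The copy with the highest token is treated as the most current.
    token, data = max((lib[volume_id] for lib in holders),
                      key=lambda copy: copy[0])
    # Tokens that do not match mark stale copies, which are updated
    # without any involvement by the host.
    for lib in holders:
        if lib[volume_id][0] != token:
            lib[volume_id] = (token, data)
    return data


if __name__ == "__main__":
    recovered = {"VOL001": (4, b"stale")}   # library that had failed
    survivor = {"VOL001": (5, b"current")}  # library updated meanwhile
    print(read_volume([recovered, survivor], "VOL001"))
    print(recovered["VOL001"])
```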
The library controller may store the synchronization tokens with the rewritable data storage media storing the data volumes directly associated therewith, or, alternatively, may maintain a table of the synchronization tokens, the table directly associating the synchronization tokens with the data volumes.
The concepts of "MASTER/SLAVE" or "PRIMARY/SECONDARY" may be employed in another aspect of the present invention. One of the plurality of data storage libraries is designated as a "MASTER" library and all the other data storage libraries are each designated as a "SLAVE" library, and the responding director, when addressed by the host access address, supplies a host supplied data volume first to the "MASTER" library and second to the "SLAVE" libraries. The director may copy the data volume from the "MASTER" library to the "SLAVE" libraries, and not require involvement by the host in making the duplicate copies.
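As a rough illustration of the ordering just described, the sketch below writes to the "MASTER" library first and then copies from the "MASTER" to each "SLAVE". The dictionary layout (a `name` and a `volumes` mapping per library) and the function name `store_master_then_slaves` are assumptions introduced only for this example.

```python
def store_master_then_slaves(master, slaves, volume_id, data, token):
    """Store a host-supplied volume on the MASTER, then copy it to the SLAVEs.

    Each library is modeled (as an assumption) as
    {"name": str, "volumes": {volume_id: (token, data)}}.
    Returns the library names in the order they were written.
    """
    write_order = []
    # First, the MASTER library receives the host-supplied data volume.
    master["volumes"][volume_id] = (token, data)
    write_order.append(master["name"])
    # Second, the director copies from the MASTER to each SLAVE,
    # so the host is not involved in making the duplicates.
    for slave in slaves:
        slave["volumes"][volume_id] = master["volumes"][volume_id]
        write_order.append(slave["name"])
    return write_order


if __name__ == "__main__":
    master = {"name": "MASTER", "volumes": {}}
    slaves = [{"name": "SLAVE-1", "volumes": {}},
              {"name": "SLAVE-2", "volumes": {}}]
    print(store_master_then_slaves(master, slaves, "VOL001", b"data", 1))
```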
The present invention effectively distributes the tracking database to the media or to the libraries actually storing the copies, and does so transparently to the hosts. Thus, there is no requirement for the hosts to provide a single central database at one of the hosts, or at high availability hardware at one of the hosts, nor to provide separate distributed databases at each of the hosts.
The present invention is especially advantageous for tape libraries. Data volumes are provided to the library and the host waits until the tape drive writes the data volumes to the removable tape media, or until a "virtual" library writes the data volumes to non-volatile cache, before providing a "return" signal to the host. With the present invention, the director provides the "return" signal to the host without waiting for all the libraries to respond, in effect, providing buffering and a synchronous overlap, while not requiring a special non-volatile cache.
For a fuller understanding of the present invention, reference should be made to the following detailed description taken in conjunction with the accompanying drawings.