Automated data storage libraries are known for providing cost-effective storage and retrieval of large quantities of data. The data in automated data storage libraries is stored on data storage media that are, in turn, stored on storage shelves or the like inside the library in a fashion that renders the media, and its resident data, accessible for physical retrieval. Such media is commonly termed “removable media.” Data storage media may comprise any type of media on which data may be stored and which may serve as removable media, including but not limited to magnetic media (such as magnetic tape or disks), optical media (such as optical tape or disks), electronic media (such as PROM, EEPROM, flash PROM, CompactFlash™, SmartMedia™, Memory Stick™, etc.), or other suitable media. Typically, the data stored in automated data storage libraries is resident on data storage media that is contained within a cartridge and referred to as a data storage media cartridge. An example of a data storage media cartridge that is widely employed in automated data storage libraries for mass data storage is a magnetic tape cartridge.
In addition to data storage media, an automated data storage library also typically contains data storage drives that store data to, and/or retrieve data from, the data storage media. The transport of data storage media between data storage shelves and data storage drives is typically accomplished by one or more robot accessors (hereinafter termed “accessors”). Such accessors have grippers for physically retrieving the selected data storage media from the storage shelves within the automated data storage library, and they transport such media to the data storage drives by moving in the X and Y directions.
It is often desirable to allow for expansion of an automated data storage library by including additional storage shelves and data storage drives and additional accessors. As an example, an IBM® 3494 Tape Library Dataserver¹ automated data storage library is scalable from a single base frame to a base frame with fifteen extension frames. The use of multiple accessors in an expanded automated data storage library can measurably improve the performance of the library, and as such, the IBM 3494 library may employ two accessors. Typically, both accessors travel on similar paths alongside the storage shelves and the data storage drives, with the paths being either common paths or independent paths, or a combination thereof. The coordination of multiple accessors in a single library to improve library availability can be accomplished in various ways. One such means of avoiding library outages by coordinating the activity of multiple accessors is termed “hot-standby” mode and involves the designation of one accessor as active and the other as inactive, and the operation of only the active accessor. Thus, the inactive accessor serves as a backup in case the active accessor fails or is taken out of service. An alternative means of providing additional availability between accessors is termed “dual-active” mode and involves the division of the physical library into zones of storage shelves and data storage drives and the separate operation of the accessors to access data storage media in the respective zones. Dual-active mode offers the advantage of potentially improving the overall performance of the library, since the work is shared between two or more accessors.
¹ IBM is a registered trademark of International Business Machines Corporation in the United States, other countries, or both.
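The difference between the hot-standby and dual-active modes described above can be illustrated with a minimal sketch. All names here (`Accessor`, `pick_hot_standby`, `pick_dual_active`) and the idea of expressing a zone as a range of frame numbers are assumptions made for illustration only; they do not describe the control logic of any particular library.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Accessor:
    name: str
    healthy: bool = True
    # Hypothetical zone: the frame numbers this accessor serves in dual-active mode.
    zone: range = range(0)


def pick_hot_standby(accessors: List[Accessor]) -> Optional[Accessor]:
    """Hot-standby: the first healthy accessor is active; the others are backups."""
    for acc in accessors:
        if acc.healthy:
            return acc
    return None


def pick_dual_active(accessors: List[Accessor], frame: int) -> Optional[Accessor]:
    """Dual-active: prefer the healthy accessor whose zone contains the frame,
    and fall back to any healthy accessor if the zone owner is out of service."""
    for acc in accessors:
        if acc.healthy and frame in acc.zone:
            return acc
    return pick_hot_standby(accessors)
```

In this sketch, dual-active mode degrades to hot-standby behavior when the accessor that owns a zone is unavailable, which reflects the back-up role each accessor plays for the other.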
Typically, a host system, such as a host server, communicates with the library directly or through one or more data storage drives, providing commands to the library to access particular data storage media and to move the media between the storage shelves and the data storage drives. The term “work requests” is used herein to refer to commands provided by the host system to the library to so access and move media. The work requests may be logical commands identifying the media and/or logical or physical locations for accessing the media.
A library typically employs one or more controllers, each with one or more processors for receiving the commands (i.e., work requests) and establishing a work queue for the library. The work queue holds one or more commands currently being executed, and may hold additional commands that are waiting to be executed. As the work queue is processed, the controller converts the commands to physical movements of the accessor, and transmits signals for operating servo motors, thereby directing the operation of the accessor(s). Accordingly, this controller is referred to herein as an “accessor controller”. An accessor controller may be dedicated to a particular accessor or it may direct or control more than one accessor.
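As a rough illustration of the work queue just described, the following sketch shows a controller that receives work requests, queues them, and converts the oldest one into a sequence of accessor movements. The class and field names (`WorkRequest`, `AccessorController`, the `(x, y)` coordinates) and the textual motion steps are hypothetical; a real controller would emit servo motor signals rather than strings.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum
from typing import Deque, List, Tuple


class Status(Enum):
    RECEIVED = "received"
    STARTED = "started"
    COMPLETED = "completed"


@dataclass
class WorkRequest:
    cartridge: str            # logical identifier of the media (assumed format)
    source: Tuple[int, int]   # hypothetical (x, y) position of the storage shelf
    target: Tuple[int, int]   # hypothetical (x, y) position of the drive
    status: Status = Status.RECEIVED


class AccessorController:
    def __init__(self) -> None:
        # Commands waiting to be executed, oldest first.
        self.queue: Deque[WorkRequest] = deque()

    def receive(self, request: WorkRequest) -> None:
        self.queue.append(request)

    def process_next(self) -> List[str]:
        """Convert the oldest waiting request into a list of motion steps."""
        if not self.queue:
            return []
        req = self.queue.popleft()
        req.status = Status.STARTED
        steps = [
            f"move to {req.source}",
            f"grip {req.cartridge}",
            f"move to {req.target}",
            f"release {req.cartridge}",
        ]
        req.status = Status.COMPLETED
        return steps
```

Tracking a per-request status of received, started, or completed is what later allows another controller to decide which requests remain outstanding.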
If multiple accessor controllers are employed, and in order for an accessor to serve as a back-up to another accessor in either hot-standby or dual-active mode, each such accessor controller must possess information regarding the work queue of the other accessor controller(s) in order to assume the outstanding work requests in the event one accessor controller fails, whether the controller itself fails or its accessor fails. One means for providing each accessor controller a copy of the work queue of the other accessor controller, including status information regarding which commands have been received, started and completed by the other accessor controller, is described in commonly-assigned U.S. Pat. No. 6,356,801 “High Availability Work Queuing in an Automated Data Storage Library,” which is incorporated herein for its showing of controlling two accessors by synchronizing their work queues. By synchronizing work queues between accessor controllers, each accessor controller is kept apprised of outstanding work requests for the library and can serve as a back-up in the event of the failure of an accessor controller, thereby preventing the loss of work requests.
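The general idea of work queue synchronization can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of U.S. Pat. No. 6,356,801: the `SyncedController` class, its `mirror` dictionary, and the string status values are all hypothetical.

```python
from typing import Dict, List, Optional


class SyncedController:
    def __init__(self, name: str) -> None:
        self.name = name
        self.peer: Optional["SyncedController"] = None
        # Own queue and a mirrored copy of the peer's queue: request id -> status.
        self.own: Dict[str, str] = {}
        self.mirror: Dict[str, str] = {}

    def update(self, request_id: str, status: str) -> None:
        """Record a status change locally and replicate it to the peer."""
        self.own[request_id] = status
        if self.peer is not None:
            self.peer.mirror[request_id] = status

    def take_over(self) -> List[str]:
        """Adopt every request the failed peer had not completed."""
        adopted = [rid for rid, st in self.mirror.items() if st != "completed"]
        for rid in adopted:
            self.own[rid] = "received"  # restart the request from the beginning
        self.mirror.clear()
        return adopted
```

Note that every status change costs one replication step to the peer, which hints at why maintaining the synchronized state can be costly in time.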
Regardless of whether a library contains a single accessor controller or multiple accessor controllers, each such controller and its accessor and any associated control lines comprise single points of failure if the work queues are not synchronized. Any related failure would leave a work request for data unfilled or could unacceptably delay access to requested data in the library. More particularly, if a library employs only a single accessor controller, whether to control a single accessor or multiple accessors, the failure of that single accessor controller may result in the loss of the work requests in its queue. Similarly, if a library employs multiple accessor controllers that have independent, non-synchronized work queues, the failure of such controllers also may result in the loss of queued work requests. While synchronizing work queues across multiple accessor controllers offers a solution to loss of work requests from a single point of failure, the effectuation of such synchronization can be complex and can be costly in terms of time required to maintain the synchronized state.
Typically, data stored on data storage media of an automated data storage library, once requested, is needed quickly. Thus, it is desirable that an automated data storage library be maintained in an operational condition as much as possible, such as the well-known “24×7×365” availability. In order to achieve and maintain this high availability of data from a library, a need remains to eliminate or reduce the single point of failure that is presently resident at the accessor controller level, as well as to improve the efficiency by which such availability is maintained.