Commonly assigned U.S. patent application Ser. No. 09/283,223, K. F. Day III et al., now U.S. Pat. No. 6,336,173, is incorporated for its showing of a data storage library system having directors for storing and tracking multiple copies of data in system data storage libraries.
Commonly assigned U.S. patent application Ser. No. 09/322,010, K. G. Dahman et al., is incorporated for its showing of an indirect means of communicating between directors of the Day III et al. data storage library system.
This invention relates to storage of redundant data in a plurality of data storage libraries, the data storage libraries having both cache storage and backing storage, and, more particularly, to the migration of the data from cache storage to backing storage.
Data processing systems comprising at least one host typically require a large amount of data storage. If the data, typically stored as a data volume, is not immediately required by the hosts, for example, if the data volume is infrequently accessed, the storage of the data volume may be on removable rewritable data storage media, such as magnetic tape or optical disk, and the data volumes may be written and/or read by means of a data storage drive.
The data storage drive is typically coupled to the host, or processing unit, by means of a peripheral interface in which commands are directed only from the processing unit to the data storage drive, and the data storage drive responds to those commands, performing the commanded functions. No commands can be sent by the data storage drive to the coupled processing unit. Typically, the commands are performed by a device controller.
If a large amount of data is to be stored and accessed on occasion, data storage libraries are employed. Such data storage libraries typically provide efficient access to large quantities of data volumes stored in a backing storage of removable data storage media, the media stored in storage shelves which are accessed by robots under the control of robot controllers. Due to the large amount of stored data, typically, a plurality of hosts make use of the same data storage library, and a plurality of data storage drives are included in the library to allow access by the hosts. A library manager, which may comprise the same processor as the robot controller, typically tracks each data volume and the removable data storage media on which it is stored, and tracks the storage shelf location of each data storage media. Herein, a library manager, either with or without the robot controller, is defined as a "controller" for the data storage library, as is the "controller" for a data storage device as discussed above.
If the data storage media, when accessed, may be reaccessed, it is advantageous to employ data storage libraries having both cache storage and backing storage. The data storage library will access the data volume of the removable media from the backing storage and will temporarily store the data volume in the cache storage so that it can be immediately reaccessed. The removable media may then be returned to a storage shelf, and the data volume updated while it is in cache storage without the need to reaccess the removable media. The cache storage is typically limited in capacity, requiring that the data volumes be migrated to backing storage so as to free space in the cache storage. Typically, a least recently used (LRU) algorithm is employed to migrate data volumes out of cache storage to backing storage.
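By way of illustration only, the least recently used migration described above might be sketched as follows. The class name `LruCache`, its methods, and the use of a dictionary to stand in for the removable media are hypothetical and are not taken from any library implementation; the capacity is assumed to be at least one data volume.

```python
from collections import OrderedDict


class LruCache:
    """Toy cache storage: volumes migrate to backing storage in LRU order."""

    def __init__(self, capacity):
        self.capacity = capacity      # maximum number of volumes held in cache
        self.volumes = OrderedDict()  # least recently used first, newest last
        self.backing = {}             # stands in for the removable media

    def access(self, volume_id, data=None):
        """Read or update a volume, staging it from backing storage if needed."""
        if volume_id not in self.volumes:
            # Stage the volume from backing storage unless the host supplies data.
            self.volumes[volume_id] = self.backing[volume_id] if data is None else data
        elif data is not None:
            # Update the copy in cache without touching the removable media.
            self.volumes[volume_id] = data
        self.volumes.move_to_end(volume_id)  # mark as most recently used
        result = self.volumes[volume_id]
        while len(self.volumes) > self.capacity:
            self.migrate_one()
        return result

    def migrate_one(self):
        """Migrate the least recently used volume to backing storage."""
        volume_id, data = self.volumes.popitem(last=False)
        self.backing[volume_id] = data
        return volume_id
```

In this sketch a volume that is reaccessed moves to the most recently used end of the ordering, so the volume migrated to free cache space is always the one that has gone longest without access.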
It is also desirable to provide a level of redundancy of the data to provide constant access to data volumes, in the event a data storage library or a communication path to a data storage library becomes unavailable.
An example of a data storage library system for redundantly storing and accessing data volumes stored on removable data storage media in a plurality of data storage libraries is described in the incorporated coassigned K. F. Day III et al. application. The library controller of each library provides an updatable synchronization token directly associated with each data volume. A plurality of directors are provided, each separate from and coupled to the hosts and each separate from and coupled to each data storage library. Each director responds to separate, partitioned data storage drive addresses addressed by the hosts. The responding director supplies each data volume supplied from a host to all of the data storage libraries, and updates each synchronization token directly associated with the supplied data volume. Thus, the directors store duplicate copies of the data volume in the data storage libraries without involvement by the host. In most data processing applications, it is critical to access the most current data. Hence, the currency of each data volume is tracked by means of the directly associated synchronization token, and the synchronization token is tracked by the directors.
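As an illustrative sketch only, the tracking of currency by synchronization token might look like the following. The representation of a token as an integer that increases with each update, the function names `write_volume` and `most_current`, and the modeling of each library as a dictionary are all assumptions made for the example; the incorporated application is not quoted here and may define the token differently.

```python
def write_volume(libraries, volume_id, data, available=None):
    """Store a copy in each available library under one new token value."""
    if available is None:
        available = range(len(libraries))
    # Highest token already recorded for this volume (0 if the volume is new).
    current = max(
        (lib[volume_id]["token"] for lib in libraries if volume_id in lib),
        default=0,
    )
    for index in available:
        libraries[index][volume_id] = {"token": current + 1, "data": data}


def most_current(libraries, volume_id):
    """Return the indexes of the libraries holding the most current copy."""
    tokens = {
        index: lib[volume_id]["token"]
        for index, lib in enumerate(libraries)
        if volume_id in lib
    }
    if not tokens:
        return []
    newest = max(tokens.values())
    return [index for index, token in tokens.items() if token == newest]
```

Under these assumptions, libraries whose copies carry identical tokens hold equally current data, while a library that missed an update is recognizable by its lower token.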
The redundant copies of the data volumes are handled identically by the data storage libraries, and the data volumes not recently accessed, if in cache storage, are likely to be migrated to backing storage in similar fashion. Thus, the content of the cache storage of each data storage library will be similar, with some difference due to different rates of migration.
It is an object of the present invention to increase the availability of cache storage in ones of a plurality of data storage libraries which store redundant copies of data volumes.
Disclosed are a data storage library, and a method which may be implemented in a programmable computer processor by a computer program product, for increasing the availability of cache storage for storing redundant copies of identifiable data volumes in ones of a plurality of data storage libraries. The data storage libraries have both cache storage and backing storage. Each of the identifiable data volumes is directly associated with an updatable synchronization token, the synchronization token indicating the relative update levels of the directly associated redundant copies. The data storage libraries are coupled to a plurality of directors. A data storage library migrates data volumes from the cache storage to the backing storage in a predetermined sequence. The data storage library maintains the synchronization token directly associated with the data volume. Cache storage is made available by migrating all but one of the redundant copies of the data volume to backing storage on a high priority basis.
Specifically, in response to a selection of the data storage library as a primary data storage library for redundant data volumes having identically updated synchronization tokens, the data storage library places the data volume in the cache storage at a low priority of the predetermined sequence, so that the data volume is maintained in cache storage and is migrated only on a low priority basis.
In response to a selection of the data storage library as a secondary data storage library for redundant data volumes having identically updated synchronization tokens, the data storage library places the data volume in the cache storage at a high priority of the predetermined sequence for migrating the data volume to the backing store on the high priority basis of the predetermined sequence, the migration freeing and making available a portion of the cache storage.
Thus, the cache storage of a secondary library is likely to be freed and made available for additional data volumes, while only one of the libraries maintains the data volume in its cache storage on the low priority migration basis.
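The effect of the primary and secondary selections may be sketched, purely for illustration, with a migration queue in which the left end migrates first. The class `Library`, its `store` and `migrate_next` methods, and the use of a double-ended queue are hypothetical choices for the example and are not the only way the predetermined sequence could be kept.

```python
from collections import deque


class Library:
    """Toy data storage library with a cache migration queue."""

    def __init__(self, name):
        self.name = name
        self.migration_queue = deque()  # left end is migrated first
        self.backing = set()            # volumes already on backing storage

    def store(self, volume_id, role):
        """Place a volume in cache according to the library's selected role."""
        if role == "secondary":
            # High priority: next in line for migration to backing storage.
            self.migration_queue.appendleft(volume_id)
        else:
            # Low priority: retained in cache behind everything already queued.
            self.migration_queue.append(volume_id)

    def migrate_next(self):
        """Migrate the highest priority volume, freeing its cache space."""
        volume_id = self.migration_queue.popleft()
        self.backing.add(volume_id)
        return volume_id


primary, secondary = Library("A"), Library("B")
primary.store("OLD1", "primary")
secondary.store("OLD2", "primary")
primary.store("V1", "primary")
secondary.store("V1", "secondary")
first_from_secondary = secondary.migrate_next()  # the redundant copy of V1
first_from_primary = primary.migrate_next()      # an older volume, not V1
```

In this sketch the secondary library releases its copy of the volume first, while the primary library continues to hold its copy in cache behind older volumes.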
The primary library and the secondary library or libraries may be selected by either the directors or the libraries, in a rotating round robin sequence, by selecting the library which was the source of the data volume as the primary, or by selecting the library having the least load as the primary.
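The selection alternatives may be sketched as follows, for illustration only. The function names, the representation of load as a number per library, and the role labels are assumptions of the example; how load is measured is not specified here.

```python
from itertools import cycle


def round_robin_selector(library_names):
    """Return a function that names the next primary in rotating sequence."""
    rotation = cycle(library_names)
    return lambda: next(rotation)


def select_source(source_library):
    """The library that was the source of the data volume becomes primary."""
    return source_library


def select_least_loaded(loads):
    """The library reporting the smallest load becomes primary."""
    return min(loads, key=loads.get)


def assign_roles(library_names, primary):
    """Every library other than the selected primary is a secondary."""
    return {
        name: "primary" if name == primary else "secondary"
        for name in library_names
    }
```

For example, `assign_roles(["A", "B"], select_least_loaded({"A": 5, "B": 2}))` marks library B as the primary and library A as the secondary.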
If the predetermined sequence for migration is a sequence of data volume storage time stamps, the high priority basis may comprise an artificial early time stamp. If the predetermined sequence is a least recently used (LRU) migration queue, the high priority basis may comprise reordering the queue to place the data volume identifier at the LRU extreme of the queue. Alternatively, a separate priority flag may identify the data volume for high priority migration.
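Two of these mechanisms, the artificial early time stamp and the separate priority flag, may be sketched as follows for illustration. The constant `ARTIFICIAL_EARLY_STAMP`, the function names, and the use of numeric time stamps are hypothetical; the sketch assumes only that volumes with earlier stamps migrate first.

```python
ARTIFICIAL_EARLY_STAMP = 0.0  # assumed to precede every real storage time


def time_stamp_for(role, now):
    """A secondary copy receives the artificial early stamp; a primary the real one."""
    return ARTIFICIAL_EARLY_STAMP if role == "secondary" else now


def order_by_time_stamp(stamps):
    """Migration order when the earliest time stamp migrates first."""
    return sorted(stamps, key=lambda volume_id: stamps[volume_id])


def order_by_flag(stamps, flagged):
    """Migration order when flagged volumes precede all others, then by stamp."""
    return sorted(
        stamps,
        key=lambda volume_id: (volume_id not in flagged, stamps[volume_id]),
    )


# Early-stamp variant: V1 is held as a secondary copy, V2 as a primary copy.
stamped = {
    "OLD": 100.0,
    "V1": time_stamp_for("secondary", 200.0),
    "V2": time_stamp_for("primary", 200.0),
}
early_stamp_order = order_by_time_stamp(stamped)

# Flag variant: real time stamps are kept and V1 is flagged instead.
real = {"OLD": 100.0, "V1": 200.0, "V2": 200.0}
flag_order = order_by_flag(real, {"V1"})
```

Both variants put the secondary copy at the head of the migration order in this sketch; the flag variant preserves the true storage time of the volume, which the early stamp overwrites.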
For a fuller understanding of the present invention, reference should be made to the following detailed description taken in conjunction with the accompanying drawings.