1. Field of the Description
The present description relates to magnetic tape data storage and, in particular, to methods and systems for monitoring operation of a tape-based data storage system, including gathering data from tape libraries and determining and predicting the health of media, tape drives, and other storage system components.
2. Relevant Background
For decades, magnetic tape data storage has offered cost and storage density advantages over many other data storage technologies including disk storage. A typical medium to large-sized data center will deploy both tape and disk storage to complement each other, with the tape storage often used for backup and archival data storage. Due to the increased need for securely storing data for long periods of time and due to the low cost of tape, it is likely that tape-based data storage will continue to be utilized and that its use will expand for the foreseeable future.
Briefly, magnetic tape data storage uses digital recording onto magnetic tape to store digital information, and the tape is packaged in cartridges and cassettes (i.e., the storage media or simply “media”). The device that performs writing and reading of data is a tape drive, and tape drives are often installed within robotic tape libraries, which may be quite large and hold thousands of cartridges to provide a tremendous amount of data storage (e.g., each tape may hold several terabytes of uncompressed data).
An ongoing challenge, though, for the data storage industry is how to manage and monitor data centers and, particularly, how to better monitor tape storage media and devices. For example, customers demand that data be kept safely and with lower tape administration costs. In this regard, customers desire solutions that efficiently and proactively manage data center tape operations, including solutions that provide failure analysis for problematic or suspect media and drives. Further, customers demand that data collection regarding operations be non-invasive, and the management solution should provide recommended corrective actions. Data storage customers also want their investment in tape technologies to be preserved and data integrity maintained. This may involve monitoring tape capacities in volumes and/or libraries, flagging media to be migrated, and advising on resource rebalancing. Customers also desire a management solution that provides an effective and useful user interface to the collected tape operations data and reporting of detected problems or issues.
Unfortunately, existing tape data storage management solutions and systems have not met all of these needs or even fully addressed customer dissatisfiers. For example, existing management tools typically only collect and report historical data, and it can be very difficult after the fact (i.e., after a problem with tape operations occurs) to determine whether a particular drive or piece of media was the cause of a failure. This can lead to cartridges or other media being needlessly replaced or to a tape drive being removed for repair or even replaced without verification of a fault. Some systems manage media lifecycles, but this typically only involves tracking the age or overall use of media to provide warnings when a tape or other media is potentially nearing the end of its useful life to allow a customer to remove the media. Existing systems also often only provide alerts after a failure or problem has occurred, e.g., an alert is issued when the data center is already in a crisis mode of operation. Further, reporting is limited to predefined reports that make assumptions regarding what information will likely be important to a customer and that provide the customer with little or no ability to design a report or select the data provided to them by the tape operations management system.
The data storage industry's current tape monitoring approaches may be categorized as falling within one of three categories, each having issues or problems that limit its widespread use or adoption. First, tape monitoring may involve a datapath breach approach. Such an approach only works in a storage area network (SAN) environment, and it also introduces drive availability risk and exposes data to vendors. Second, tape monitoring may involve a media vendor lock-in approach, but this results in reporting only being available if the media in a data center or tape library was sourced from a particular vendor. Third, tape monitoring may be limited to a single library within a data center, and this may be undesirable as each library has to launch its own monitoring application and the data is not aggregated for analysis or for reporting to the customer or operator of the data center.
Hence, there remains a need for improved systems and methods (e.g., software products) for providing customers with timely information to efficiently manage data center tape operations. Preferably, the information would include tape analytics that would allow proactive management of the tape operations rather than merely reactive management based on vendor-selected sets of historic data.