1. Field of the Invention
The present invention relates to a system that processes and responds to messages generated by a digital data processing machine. More particularly, the invention concerns an apparatus, article of manufacture, and method for receiving both immediate-response and delayed-response messages from a data storage subsystem, selectively routing the messages to specialized expert local facilities (ELFs), performing designated functions at the ELFs, generating an appropriate output message, and transmitting the output message to the data storage subsystem. According to the invention, these messages concern the management of data storage devices such as tape drives in the data storage subsystem.
2. Description of the Related Art
Many different machines require operator supervision. Although automation is increasing, some type of management by an operator is still required for many simple machines such as drill presses, facsimile machines, and sewing machines. With more complicated machines, an operator is even more important, performing critical and often complicated duties.
For example, most mass data storage systems require some type of operator support. Such data storage systems often store customer data on magnetic tapes, magnetic disk drives, or a combination of the two. These systems need an operator to perform error/exception handling, to back up data, to configure hardware devices, and to perform other functions. Furthermore, in storage libraries employing portable data storage units, such as tapes, operators must manage the media pool. This involves, for instance, supplying blank media, labelling tapes, advising the system when new tapes are introduced, and the like.
Thus, the operator provides the mass storage system with a number of benefits. On the other hand, use of an operator also comes with a number of drawbacks. One drawback, for example, is the cost of paying a highly trained person to monitor the data storage system. It may even be necessary to have an operator at hand twenty-four hours a day in systems that store particularly important data, such as automated teller machines, telephone directory information, internationally accessible data, and the like. In these systems, the cost of the operator can be substantial.
Another drawback of human operators is the potential for human error. Human operators also bring the possibility of work scheduling problems. The operator's absence from the data storage system at a critical time may have serious consequences. For example, recovery from certain types of system errors may be impossible without operator intervention, thus rendering the entire storage system inoperative.
As a further example, one particularly important yet difficult operator task involves reconciling tape drive availability in a mass storage system. Many mass storage systems include tape libraries supervised by a library manager that oversees a plurality of tape media along with multiple tape drives. Frequently, the tape library is managed by a storage controller. The human operator must ensure that the storage controller and tape library both accurately monitor tape drive availability in the tape library. This may be achieved, for example, by sending update messages to the storage controller and tape library, as needed. If the storage controller and the tape library disagree in their accounting of the drives' availability, there may be an improper attempt to allocate an already-busy tape drive.
Without careful tracking, the accounting of which drives are busy can differ between the storage controller and the library manager. This can easily occur, since no central authority renders a tape drive "available" or "unavailable" in some systems. A drive can become occupied, or free up, due to actions of the library manager, the storage controller, or even a service technician.
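The kind of disagreement described above can be illustrated with a minimal sketch, offered only as an example and not as the method of the invention. It assumes each component keeps its own table of drive states. The names DriveState, find_disagreements, controller_view, and library_view are hypothetical and do not come from any particular product.

```python
from enum import Enum


class DriveState(Enum):
    # Hypothetical two-state model of a tape drive, as seen by one component.
    AVAILABLE = "available"
    UNAVAILABLE = "unavailable"


def find_disagreements(controller_view, library_view):
    """Return the drives whose state differs between the two views.

    Each view is a dict mapping a drive ID to a DriveState. A drive that
    appears in only one view is reported with None for the other side.
    """
    disagreements = {}
    for drive_id in sorted(set(controller_view) | set(library_view)):
        controller_state = controller_view.get(drive_id)
        library_state = library_view.get(drive_id)
        if controller_state != library_state:
            disagreements[drive_id] = (controller_state, library_state)
    return disagreements


# Illustrative data: the controller believes drive0 is free, the library does not.
controller_view = {
    "drive0": DriveState.AVAILABLE,
    "drive1": DriveState.UNAVAILABLE,
}
library_view = {
    "drive0": DriveState.UNAVAILABLE,
    "drive1": DriveState.UNAVAILABLE,
}

mismatches = find_disagreements(controller_view, library_view)
```

In this illustration, allocating drive0 on the strength of the storage controller's table alone would be an attempt to use a drive that the library manager considers busy.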
Another important reason to accurately track the number of available drives involves cache management. In some systems, data read from tape at tape drives is read into a cache to expedite access to the data. If all tape drives are reading simultaneously, with none writing, the cache may quickly fill, ultimately overflowing and causing an error. Therefore, it may be desirable to reserve one or more tape drives for writing data out from the cache. The benefits of such tape drive management underscore the need to accurately track tape drive availability, both in the storage controller and also in the library manager. However, as discussed previously, the use of a human operator has a number of limitations.
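The reservation of drives for writing can likewise be sketched as a simple admission check. This is a hypothetical illustration under stated assumptions, not a description of any actual system: the function name may_start_read, its parameters, and the default reserve of one drive are all assumed for the example. The check is only as reliable as the drive counts supplied to it, which is why accurate availability tracking matters.

```python
def may_start_read(total_drives, busy_reading, busy_writing, reserved_for_write=1):
    """Decide whether one more drive may be assigned to reading.

    A read is allowed only if, after it starts, enough drives remain
    (either idle or already writing) to satisfy the write reserve.
    All counts are assumed to be accurate and non-negative.
    """
    free = total_drives - busy_reading - busy_writing
    if free <= 0:
        return False  # no idle drive at all
    # Drives already writing count toward the reserve; the rest of the
    # reserve must be covered by idle drives that stay idle.
    shortfall = max(0, reserved_for_write - busy_writing)
    return free > shortfall


# Four drives, three reading, none writing: the last idle drive is held back.
last_drive_allowed = may_start_read(total_drives=4, busy_reading=3, busy_writing=0)

# Four drives, two reading, none writing: one more read still leaves a reserve.
third_read_allowed = may_start_read(total_drives=4, busy_reading=2, busy_writing=0)
```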