This invention relates to data storage systems and, in particular, to a plurality of tape devices and automated tape libraries where the tape devices are connected to a plurality of data processors via a high speed network and which, along with a Network Storage Manager system controller, collectively implement a virtual, distributed data storage subsystem. This readily scalable data storage subsystem creates multiple virtual storage devices on demand, which are available to any and all of the system's data processors.
Problem
It is a problem in the field of data storage subsystems to provide their attached data processors with economical data storage services at adequate performance levels. Although storage device technology can support the development and implementation of specialized devices with uniquely high performance levels, such devices are generally not economically feasible. Commercial, off-the-shelf storage devices are typically the most economical building blocks for data storage subsystems but these devices can impose performance limitations; applications requiring performance levels beyond the storage device's native characteristics require the use of architectural approaches.
One other aspect of this data storage problem is that the allocation of customer data to a single type of data storage device represents a limitation when faced with widely varying data storage needs. This limitation can be partly obviated by sharing input/output (I/O) activity and data bandwidth across a plurality of data storage devices of a data storage subsystem. However, a fixed configuration of data storage devices also limits the scalability in performance and provides no facility for applications to request changes in data storage performance. An architecture where the data storage devices are located behind a server(s) further limits the delivered performance since the data storage bandwidth is limited by the server itself. Therefore, architecting a data storage subsystem that can efficiently serve the needs of the applications extant on the data processors is a daunting problem. There are numerous factors that affect performance; this problem is particularly pertinent to tape devices, since the tape devices must serve a wide range of performance requirements and it would be desirable for a single tape device type to serve this wide range of needs.
In FIG. 1, a traditional "direct attach" tape configuration is shown. Here tape devices TD1, TD2, TD3 are each directly connected to single data processors DP1, DP2, DP3 on a control network CN in a dedicated tape device configuration. The data processor has exclusive use of the tape device and typically communicates with the tape device via a SCSI or Fibre Channel interface. However, the use of dedicated tape devices is an expensive proposition where there are a plurality of data processors to be served, especially if the data access loads generated by the plurality of data processors are erratic. In this data storage subsystem architecture, the utilization of the tape devices and the efficiency of the data storage function are less than optimal, since each data processor is limited to its dedicated tape device with its physical constraints.
FIG. 1 also includes an automated tape cartridge library system (ACS) and an automated cartridge controller (ACSC) which provides enhanced response time for the tape devices by mounting/dismounting their tape cartridges. However, the tape devices may have a SCSI interface in the data path that introduces a number of physical limitations to the operation of the system. The first limitation is that only a small number of tape devices can be attached to a SCSI bus compared to other bus architectures. The second limitation is the limited bandwidth of the SCSI bus that is shared by these tape devices. The length of the SCSI cable also represents a limitation, since it is typically restricted to 25 meters.
An alternative data storage subsystem architecture is shown in FIG. 2. Here, a traditional "client/server" network attached tape configuration includes a plurality of tape devices TD1, TD2, TD3 each attached to the data communication network N via dedicated servers TS1, TS2, TS3; data processors DP1, DP2, DP3 connect to the network directly. In this architecture, the data processors all have access to all of the tape devices via their servers. The data processors run tape server software to manage the access protocol for the plurality of tape servers. Though this configuration allows tape devices to be allocated dynamically, it does not scale very well with the demand on the data transfer rate because the bandwidth is limited by the server.
A variation of this storage architecture is the use of a plurality of tape devices configured into an array. As shown in FIG. 3, the tape devices are configured in a Redundant Array of Independent Tapes (RAIT) in a manner analogous to the Redundant Array of Independent Disks (RAID), which is a well known architecture in disk device technology. Using a RAIT controller (RC), the tape array is typically located behind a server(s) RS1, RS2, which is directly connected to the network N. The bandwidth for data transfers between the data processors and the tape array is not scalable and is also limited by the characteristics of the server(s). The tape array itself is also not scalable or easily reconfigured due to the limitations of the server(s).
These various tape data storage subsystem architectures are all limited in their data exchange I/O activity and bandwidth to that of a single tape device or a fixed RAIT device. The alternative is to integrate a plurality of data storage media types (e.g. solid state memory, disk, tape, etc.) and data storage subsystem architectures into a single data storage subsystem. However, this results in increased storage system complexity and costs.
Solution
The above-described problems are solved and a technical advance achieved in the field by the network attached virtual tape data storage subsystem of the present invention. This invention relates to tape data storage systems and, in particular, to a plurality of tape devices and automated tape libraries where the tape devices are connected to a plurality of data processors via a high bandwidth network and which, along with a Network Storage Manager (NSM) controller, collectively implement a virtual, distributed tape data storage subsystem. The virtual, distributed tape data storage system creates multiple virtual tape devices on demand by allocating, configuring and managing the system's individual physical tape devices. The I/O rate and bandwidth of the individual virtual tape devices are changeable on demand; the aggregate performance of the virtual, distributed tape data storage subsystem is readily scalable by changing the number of physical tape devices and library devices. By pooling the tape devices together, interconnecting them with the data processors via a network, incorporating the Network Storage Manager controller and creating virtual tape devices, the problems of prior art tape device data storage subsystem architectures are overcome. This new architecture can create simultaneous, multiple virtual devices, which are available to any and all of the data processors, subject only to the limitations of the aggregate capacities and bandwidth of the tape devices, library devices and the network. This architecture enables the storage requirements of the data processors to be met by either individual tape devices or multiple devices in any tape array configuration, including, but not limited to, RAIT 0, 1, 3, 4 and 5. The tape array and data transmission bandwidth can be dynamically reconfigured on demand since the network directly interconnects the tape devices to the data processors.
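By way of illustration only, the array configurations mentioned above can be pictured with a minimal Python sketch of block striping with a single XOR parity block per stripe, in the general manner of a RAIT 3 style layout. The function names, the block size and the zero-byte padding are assumptions made for this example and are not taken from the specification; an actual virtual tape device could lay out data quite differently.

```python
from functools import reduce
from typing import List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_with_parity(data: bytes, data_devices: int, block_size: int) -> List[List[bytes]]:
    """Split data into stripes of `data_devices` blocks plus one parity block.

    Hypothetical layout: block i of each stripe would go to physical tape
    device i, and the final block of each stripe to a parity device.
    """
    stripe_bytes = data_devices * block_size
    # Pad with zero bytes so the data fills a whole number of stripes.
    padded = data + b"\x00" * (-len(data) % stripe_bytes)
    stripes = []
    for start in range(0, len(padded), stripe_bytes):
        blocks = [
            padded[start + i * block_size: start + (i + 1) * block_size]
            for i in range(data_devices)
        ]
        stripes.append(blocks + [xor_blocks(blocks)])
    return stripes


def rebuild_block(stripe: List[Optional[bytes]]) -> List[bytes]:
    """Recover the single missing block (marked None) from the surviving blocks."""
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) != 1:
        raise ValueError("exactly one block must be missing")
    survivors = [block for block in stripe if block is not None]
    repaired = list(stripe)
    # The XOR of all surviving blocks (data and parity) equals the lost block.
    repaired[missing[0]] = xor_blocks(survivors)
    return repaired
```

In this sketch, adding data devices widens each stripe, which is one way to picture how the bandwidth of a virtual device could grow with the number of physical devices assigned to it.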
The virtual tape storage subsystem is controlled by a Network Storage Manager (NSM) which includes a plurality of software elements including: Resource Allocation (RA) software, Resource Configuration (RC) software, Resource Management (RM) software, and Security Management (SM) software. The NSM communicates with the Automatic Cartridge System Library Software (ACSLS) via the RA software. The ACSLS communicates with the ACS via network CN. The RA software has the responsibility to keep track of the resource usage such as which data processor presently owns which tape device(s), which tape devices are free to be allocated to the requesting data processors and the like. The RC software configures the data storage resources for the network attached data processors in response to manual operator commands, pre-programmed algorithms, rules, application program initiated requests and the like. The RC software assigns the maximum number of tape devices that a data processor can use and designates the configuration of these tape devices. Furthermore, it automatically configures the tape devices allocated to a data processor as well as the connection between the data processor and the tape device(s). The RM software queues the request for the resource allocation and notifies the data processor when the requested resource is ready, or it can schedule the availability of the resources. The ACSLS has the responsibility to keep track of the tape cartridges that are stored in the ACS. It also keeps track of the status of the tape devices, and provides the application programming interface (API) to other application software. The API will allow the application software to load tape or to mount a volume to a tape drive, to unload tape or to unload a volume from a tape drive, to read the status of tape devices, etc.
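The division of labor among the RA, RC and RM software described above can be pictured with the following minimal Python sketch. It is an illustrative assumption, not the NSM implementation: the class name `TapePool`, its methods, the first-in, first-out queueing policy and the single per-processor limit are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Set, Tuple


@dataclass
class TapePool:
    """Hypothetical bookkeeping for a pool of network attached tape devices."""
    free: Set[str]                    # RA role: devices not owned by anyone
    max_per_processor: int            # RC role: limit per data processor
    owned: Dict[str, Set[str]] = field(default_factory=dict)          # RA role
    pending: Deque[Tuple[str, int]] = field(default_factory=deque)    # RM role

    def request(self, processor: str, count: int) -> List[str]:
        """Grant `count` devices now, or queue the request and return []."""
        held = len(self.owned.get(processor, set()))
        if held + count > self.max_per_processor:
            raise ValueError("request exceeds the configured device limit")
        # Queue behind earlier waiters so requests are served in arrival order.
        if self.pending or count > len(self.free):
            self.pending.append((processor, count))
            return []
        return self._grant(processor, count)

    def _grant(self, processor: str, count: int) -> List[str]:
        granted = sorted(self.free)[:count]
        self.free.difference_update(granted)
        self.owned.setdefault(processor, set()).update(granted)
        return granted

    def release(self, processor: str, devices: List[str]) -> List[Tuple[str, List[str]]]:
        """Return devices to the pool and report which queued requests were served."""
        held = self.owned.get(processor, set())
        returned = held & set(devices)
        held -= returned
        self.free |= returned
        notifications = []
        # RM role: notify waiting data processors once enough devices are free.
        while self.pending and self.pending[0][1] <= len(self.free):
            waiter, count = self.pending.popleft()
            notifications.append((waiter, self._grant(waiter, count)))
        return notifications
```

A request that cannot be satisfied immediately is queued, and the list returned by `release` stands in for the notification sent to a data processor when its requested resource becomes ready. Scheduling of future availability and cartridge mounting through the ACSLS are left out of this sketch.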
The use of a networked storage manager enables the dual function of permitting tape devices to be managed as a pool and yet be attached directly to the network as individual resources. In addition, the networked storage manager provides the mechanism for the central control and management of the resources attached to the network such as to control tape device allocation and configuration, as well as other functions, such as tape cartridge movement and data migration. The rules, which could be implemented, address response time constraints, data file transfer size, data file transfer rates, data file size bounds and the like.
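As one hedged example of such a rule, the Python sketch below turns a requested data file transfer rate into a number of tape devices to operate in parallel. The function name, its parameters and the assumption that aggregate rate grows linearly with the number of devices are hypothetical simplifications introduced for illustration; network and library limits are not modeled.

```python
import math


def devices_for_rate(required_mb_s: float, per_device_mb_s: float, max_devices: int) -> int:
    """Return how many devices are needed to reach the requested transfer rate.

    Assumes, for illustration only, that the aggregate rate is the sum of
    the per-device rates.
    """
    if required_mb_s <= 0 or per_device_mb_s <= 0:
        raise ValueError("rates must be positive")
    needed = math.ceil(required_mb_s / per_device_mb_s)
    if needed > max_devices:
        raise ValueError("requested rate exceeds what the allowed devices can deliver")
    return needed
```

For instance, with a hypothetical per-device rate of 10 MB/s, a request for 25 MB/s would call for three devices under this rule.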