1. Field of the Invention
The present invention is directed to a secure virtual tape management system with balanced storage and multi-mirror options. In particular, the present invention utilizes multiple storage options, balancing storage among them based on a number of factors, and provides multi-mirror options.
2. Prior Art
It is necessary to store and backup data for many mainframe computer installations primarily for the purpose of safekeeping critical information in the event of an unexpected loss of the primary copy. The backups are often remotely stored offsite of the mainframe installation.
The invention involves a distributed storage system including storage servers (hereafter server) able to service a client across a network by exposing typical disk, directory and file input/output operations in addition to control operations such as ascertaining the current workload of the server. The client is generally distributed in a modular form making its capabilities accessible to any general program by way of its inclusion.
Typical disk operations include the ability to enumerate the various storage locations, which might be referred to as disks or mount points, and ascertain their specific criteria such as their available free space. Typical directory operations include the ability to enumerate the files present on a particular mount point and ascertain their size and the time of their last modification, as well as being able to rename or delete them. Typical file operations include the ability to create, open, read, write, seek, truncate and close a file. Commonly expected file operation options are provided, such as the ability to open a file exclusively and/or open a file in a read-only manner. Further commonly available functions include being able to check for the existence of a file, determine the size of an opened file, determine the remaining space available to an opened file and determine the current file position of an opened file.
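By way of a non-limiting illustration, the disk, directory and file operations described above might be sketched in Python as follows. All class and method names are assumptions introduced purely for illustration and do not appear in the disclosure; an in-memory store stands in for actual server-side storage.

```python
import io
import time

class MountPoint:
    """One storage location exposed by a server (illustrative only)."""
    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.files = {}  # filename -> (bytes, last-modified time)

    def free_space(self):
        used = sum(len(data) for data, _ in self.files.values())
        return self.capacity - used

    def list_files(self):
        # Directory operation: enumerate files with size and mtime
        return {n: (len(d), t) for n, (d, t) in self.files.items()}

    def rename(self, old, new):
        self.files[new] = self.files.pop(old)

    def delete(self, name):
        del self.files[name]

class OpenFile:
    """Typical file operations: read, write, seek, size, close."""
    def __init__(self, mount, name, read_only=False):
        self.mount, self.name, self.read_only = mount, name, read_only
        data, _ = mount.files.setdefault(name, (b"", time.time()))
        self.buf = io.BytesIO(data)

    def write(self, data):
        assert not self.read_only, "file opened read-only"
        self.buf.write(data)

    def read(self, n=-1):
        return self.buf.read(n)

    def seek(self, pos):
        self.buf.seek(pos)

    def tell(self):
        return self.buf.tell()

    def size(self):
        return len(self.buf.getvalue())

    def close(self):
        # Persist contents and update the last-modified time
        self.mount.files[self.name] = (self.buf.getvalue(), time.time())
```

Such a sketch captures only the shape of the interface; the actual system would carry these operations across a network connection rather than operating on local memory.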
The control operations exposed by the server include the ability to collect the current use of the various mount points, the current use of the various network paths and the current list of clients and details regarding their activity. Mount point information includes total capacity along with past and current read and write rates along with the current number of open files being accessed. Network path information includes current send and receive rates. Client information includes identification information such as host name and user name, the name of the file currently opened as well as whether it is opened exclusively and/or in a read-only manner, the client's network address and corresponding server address to which it is connected, the total read from and written to the file, the time of the client's last file access and the rates at which the client is writing to the file, reading from the file, sending to the network or receiving from the network. This information affords a means not only to monitor use and detect problems but also to determine when additional mount points or servers should be introduced.
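The statistics enumerated above might be represented by structures such as the following non-limiting sketch; every field name is an assumption chosen for illustration, and the selection rule shown is merely one crude load signal, not the disclosed method.

```python
from dataclasses import dataclass

@dataclass
class MountPointStats:
    """Per-mount-point information exposed by a control operation."""
    name: str
    total_capacity: int
    read_rate: float       # current bytes/sec read
    write_rate: float      # current bytes/sec written
    open_files: int        # number of files currently open

@dataclass
class ClientStats:
    """Per-client information exposed by a control operation."""
    host_name: str
    user_name: str
    open_file: str
    exclusive: bool
    read_only: bool
    client_address: str
    server_address: str
    bytes_read: int
    bytes_written: int
    last_access: float

def busiest_mount(mounts):
    """Crude load signal: the mount point with the most open files,
    the kind of figure an administrator might watch when deciding
    whether additional mount points or servers should be introduced."""
    return max(mounts, key=lambda m: m.open_files)
```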
Administrative control (hereafter administrator) over the server includes being able to define the mount points it should use and specify whether any particular mount point should be an eligible target for new file creations. Marking a mount point as ineligible is particularly useful when it is to be removed from service, since doing so causes the system to effectively relocate data to other locations over time through attrition. An administrator may also perform mount point tests that perform a variety of input/output operations to verify their health and performance.
An administrator configures one or more managerial systems (hereafter manager) with the makeup of the servers, such as their network locations, and how they should be used as a whole, such as how they are expected to operate together. An administrator also configures the manager with the network locations of the clients, although it would be conceptually identical for this to be done in the reverse fashion where the clients are configured with the network locations of the managers. In either case, whichever has the network location of the other is responsible for establishing a connection in order to convey to the client the manager's configuration information. Multiple managers are designed to work together to exchange their information so that a failure of one does not stop the system as a whole from continuing normal function. The configuration includes one of two basic modes in which the servers are to be utilized by a client: standalone or mirrored. In either case a large, virtually singular pool of storage is created through the use of multiple servers.
Standalone mode means every server is treated as an independent member of the pool whereas mirrored mode means every two servers are paired and those mirrored pairs comprise the pool's membership. Mirrored mode logically pairs two similarly configured servers together as though they were a single entity, meaning a write to this pair involves an identical copy of that data being received and stored on both servers. Mirroring involves designating that a pair of servers work independently but that clients cooperate towards producing equivalent data on both. The obvious cost of mirrored mode is that double the storage is required to store the same quantity of data that could be kept in standalone mode. The expectation for mirrored mode is that each server forming a pair is housed in a location apart from its counterpart for disaster recovery purposes. For example, one server might be located in close proximity to the client while the other might be placed across town. At the very least they should be supplied with independent sources of power.
Client connection to any server may occur across multiple network paths. For example, there might be more than a single network interface on both the client and the server capable of reaching one another. It would be expected that each such network interface connects through independent means, such as individual routers, so that the loss of any single network component would not result in a complete loss of connectivity between the client and server. Connection is performed such that the path having the least activity is selected, an example of such measurement being available by way of the results of the common ping command. In general, however, this can quite simply be effected by simultaneously making identically timed connection attempts across each of the available network interfaces and selecting whichever connection completes first. Regardless, the desired result is for the utilization of multiple network paths to be as balanced as possible, albeit influenced by potentially varying network path speed maximums. Whenever a new connection is established, the version levels of the server and client are exchanged, allowing each to limit their conversation to those capabilities known to be available to the version of the partner to which they have connected. The server and client are coded to automatically suppress the use of features not recognized by their partner. An established connection is monitored by both the client and server, and small ‘heartbeat’ packets are issued after any idle period of one second. If either side detects an idle inbound period exceeding a configurable time period, such as 30 seconds, then the connection is terminated. In such a case the client will reattempt connection across all appropriate network interfaces. This provides the ability for a session to become switched from one network path to another in the event of any form of network path failure.
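The "identically timed connection attempts, first to complete wins" approach can be sketched as follows. This non-limiting example simulates connection attempts with timed sleeps; real code would race actual socket connects across the available interfaces and close the losers.

```python
import concurrent.futures
import time

def attempt_connect(path_name, latency):
    """Stand-in for a socket connect; the sleep models the current
    activity (and thus responsiveness) of a given network path."""
    time.sleep(latency)
    return path_name

def connect_fastest(paths):
    """Start identically timed connection attempts on every path and
    keep whichever completes first. In real code the remaining
    connections would be closed once a winner is selected."""
    with concurrent.futures.ThreadPoolExecutor(len(paths)) as pool:
        futures = [pool.submit(attempt_connect, name, lat)
                   for name, lat in paths.items()]
        done, _ = concurrent.futures.wait(
            futures, return_when=concurrent.futures.FIRST_COMPLETED)
        return next(iter(done)).result()
```

Because the busiest path answers last, repeated connections naturally spread sessions across the available paths without any explicit measurement step.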
The client maintains knowledge relating to the session state as it processes operation requests to the server and upon reconnection will restore that session state by reopening any necessary file, repositioning it appropriately and retrying whatever operation might have been in progress.
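A minimal sketch of that session-state restoration follows; the `Server` class here is a stand-in invented for illustration, and a real client would additionally retry whatever read or write was in progress when the connection dropped.

```python
class Server:
    """Illustrative stand-in for a storage server's open-file state."""
    def __init__(self):
        self.positions = {}  # open filename -> current file position

    def open(self, name):
        self.positions[name] = 0

    def seek(self, name, pos):
        self.positions[name] = pos

class ResilientClient:
    """Tracks session state so it can be replayed after reconnection."""
    def __init__(self, server):
        self.server = server
        self.open_file = None
        self.position = 0

    def open(self, name):
        self.server.open(name)
        self.open_file, self.position = name, 0

    def seek(self, pos):
        self.server.seek(self.open_file, pos)
        self.position = pos

    def reconnect(self, new_server):
        """On a fresh connection, reopen the file and reposition it,
        restoring the session exactly as the client last knew it."""
        self.server = new_server
        if self.open_file is not None:
            new_server.open(self.open_file)
            new_server.seek(self.open_file, self.position)
```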
Just as balancing is performed in the use of network paths, it is also performed by a server when choosing which of its mount points to use to store a new file, spreading the load across them as equally as possible, influenced by the number of users accessing each specific mount point and its individual available space.
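One non-limiting way to express such a selection rule is sketched below. The disclosure names the factors (active users and available space) but not a formula, so the weighting here is purely an assumption for illustration.

```python
def choose_mount_point(mounts):
    """Pick a mount point for a new file.

    mounts: list of (name, active_users, free_bytes) tuples for the
    mount points eligible as new-file targets.

    Assumed heuristic: favor free space, penalize crowded mounts by
    dividing by (users + 1); the real weighting may differ.
    """
    def score(m):
        _name, users, free = m
        return free / (users + 1)
    return max(mounts, key=score)[0]
```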
Notwithstanding the ability to define and use only a single server, a client employs multiple servers when available, and each is inspected and considered for use whenever a need exists to open or create a file. Whether or not mirroring is used, many servers may exist and balanced use is desirable based on factors such as number of active users, free disk space, network utilization and whether an instance of a specific file already exists. It is desired to balance, or spread, the load of multiple client systems across multiple servers. A file might, for example, be stored on a single server or it might be stored on multiple servers.
When a new file is to be created then all existing instances on all servers are first removed, then the use and availability of all servers are considered to determine the most appropriate server(s) to use to store the file. Similarly, the first write to the start of an existing file that causes truncation results in that file being deleted from its current server(s) and recreated prior to completing the write so as to allow for a reconsideration of the best server(s) for the file. Any time a newly created or modified file is closed a new unique identifier is produced which is stamped onto the file and stored in the manager, ensuring against the use of antiquated instances of a file which could arise through a server's outage.
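The create-and-stamp sequence just described can be sketched as follows. In this non-limiting illustration each server is reduced to a dictionary of files and the manager to a dictionary of identifiers; all structure is assumed for the sake of the example.

```python
import uuid

def create_file(name, servers, chosen):
    """Remove every existing instance across all servers, then create
    the file on the chosen server(s)."""
    for s in servers:
        s.pop(name, None)          # delete any stale instance
    for s in chosen:
        s[name] = {"data": b"", "uid": None}

def close_file(name, chosen, manager):
    """At close time, stamp a new unique identifier onto every instance
    and record it in the manager; an instance whose identifier does not
    match the manager's copy can later be recognized as antiquated."""
    uid = uuid.uuid4().hex
    for s in chosen:
        s[name]["uid"] = uid
    manager[name] = uid
    return uid
```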
When an existing file is to be opened then the entirety of servers are scanned to determine where the file is located. In the case where it is located on multiple servers, and when those servers are not bonded together to function as a mirrored server pair, pruning occurs to exclude secondary copies which are not as desirable based on such factors as the last write timestamp on the file, whether the file was properly closed and the file's unique identifier value, the latter of which is available from the manager.
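A non-limiting sketch of that pruning follows. The disclosure names the factors considered (proper closure, unique identifier, last write timestamp); the precedence among them shown here is an assumption.

```python
def prune_instances(instances, manager_uid):
    """Select the most desirable instance of a file found on multiple
    unmirrored servers; the remainder are excluded from use.

    instances: list of dicts with 'uid' (identifier stamped at close),
    'closed' (whether the file was properly closed) and 'mtime'
    (last write timestamp).
    """
    candidates = [i for i in instances if i["closed"]]
    matching = [i for i in candidates if i["uid"] == manager_uid]
    # Prefer a properly closed instance matching the manager's
    # identifier; fall back gracefully, newest write winning ties.
    pool = matching or candidates or instances
    return max(pool, key=lambda i: i["mtime"])
```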
When a mirrored server pair is selected for use, then the same file will be opened on each with instructions to write files being sent to each in order to result in the same file becoming produced on both. In contrast, however, reads need only be serviced by one of the two servers while the file pointer is simply kept in sync on the other. The particular server of the pair selected to service the reads, and thus incur a greater work load than the other, is selected by considering the usage of the two servers in order to balance the work loads. At the time a file is opened, the least busy of the two will be preferred to service read requests. In the event that a mirrored server fails, then all input/output will resort to the remaining server without interruption of the client.
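The mirrored-pair behavior can be illustrated by the following non-limiting sketch, in which writes are duplicated to both servers, the less busy server at open time services reads, and input/output falls back to the survivor if one server fails. Class and field names are assumptions.

```python
class MirroredPair:
    """Two servers operated as a single logical entity (illustrative)."""
    def __init__(self, a, b):
        self.servers = [a, b]
        # At open time, prefer the less busy server for read requests
        self.reader = min(self.servers, key=lambda s: s["load"])

    def write(self, name, data):
        # Writes go to every available server so both hold identical data
        for s in self.servers:
            if s["up"]:
                files = s.setdefault("files", {})
                files.setdefault(name, bytearray()).extend(data)

    def read(self, name):
        # Only one server services reads; if it has failed, resort to
        # the remaining server without interrupting the client
        r = self.reader if self.reader["up"] else next(
            s for s in self.servers if s["up"])
        return bytes(r["files"][name])
```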
For administrative purposes, it is possible for a client to specifically target a particular server and a mount point within it, overriding the balancing features of the system.
Due to the possibility of a server's outage, such as might happen if a network goes down or a server suffers a hardware failure, a checker component is periodically employed to scan all of the servers to ensure their health and the consistency of their data. For example, if an antiquated instance of a file is located it will be removed after the presence and integrity of a current instance of that same file is verified. As another example, if a mirrored pair does not contain identical data then the checker will endeavor to reconcile any differences between them. The system is therefore self-healing; when such a copy is required, another component, the mover, is tasked with performing that operation.
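One non-limiting way to sketch the checker's reconciliation pass over a mirrored pair is shown below; the manager's identifier for each file is treated as authoritative, and divergent files are queued as copy tasks for the mover. The structures and task format are assumptions.

```python
def reconcile(side_a, side_b, manager):
    """Compare a mirrored pair against the manager's identifiers.

    side_a, side_b: mapping of filename -> unique identifier as found
    on each server of the pair.
    manager: mapping of filename -> authoritative unique identifier.

    Returns copy tasks for the mover, each naming the current side
    as the source.
    """
    tasks = []
    for name, current_uid in manager.items():
        a_uid, b_uid = side_a.get(name), side_b.get(name)
        if a_uid == current_uid and b_uid != current_uid:
            tasks.append(("copy", name, "a_to_b"))
        elif b_uid == current_uid and a_uid != current_uid:
            tasks.append(("copy", name, "b_to_a"))
    return tasks
```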
Multiple of these otherwise independent distributed storage systems may be loosely coupled through the use of the mover component when simple network connectivity to both systems exists allowing it to perform copies between them. In this case the file criteria such as the last written date and unique identifier would be carried across so that it would remain identical. Such a copy could conceptually be automatically triggered upon detection by the manager of the completion of a file's creation or modification. The mover component would generally be employed for geographically distant installations and the network information relating to the address of the remote manager would be stored in the local manager for retrieval by the mover.
Also, the server strives to buffer information commonly requested by clients such as free storage space levels to reduce system load levels.
The present invention is supported via an encrypted communications protocol interfacing with, and relying upon, the teachings, practices and claims disclosed in U.S. Pat. No. 6,499,108 (hereinafter synonymously referred to as “Secure Agent®” or “SA”), which is incorporated herein by reference.
Secure Agent® Overview
The following overview is provided to facilitate a comprehensive understanding of the teachings of the instant invention. Secure Agent® utilizes a secure login sequence wherein a client connects to a Secure Agent® server using a key known to both systems and presents the server with user identification (as used herein the term “client” refers synonymously to a remote user or component establishing, and communicating with, the instant invention through Secure Agent® allocation and encryption processes as taught in the above noted applications). If recognized, the Secure Agent® server initiates a protocol whereby the client's identification is verified and subsequent communication is conducted within a secured (encrypted) construct. For purposes of this overview, the term “server” should be considered a hardware configuration represented as a central processing unit wherein Secure Agent®, a Host DLL and driver reside, and are executed. The term “DLL” as used herein refers to a Secure Agent host dynamically linked library (a.k.a. Host DLL). The term “DLL” or “dynamically linked library” is used in a manner consistent with that known to those skilled in the art. Specifically, the term “DLL” refers to a library of executable functions or data that can be used by a Windows™ or LINUX application. As such, the instant invention provides for one or more particular functions and program access to such functions by creating a static or dynamic link to the DLL of reference, with “static links” remaining constant during program execution and “dynamic links” created by the program as needed.
The Secure Agent® server presents a variable unit of data, such as the time of day, to the client as a challenge. The client must then encrypt that data and supply it back to the server. If the server is able to decrypt the data using the client's stored key so that the result matches the original unencrypted challenge data, the user is considered authenticated and the connection continues. The key is never passed between the two systems and is therefore never at risk of exposure.
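The challenge-response exchange can be sketched as follows. This is a non-limiting illustration only: a simple XOR transform stands in for Secure Agent®'s actual encryption, and all function names are assumptions. The essential property shown is that only the transformed challenge crosses the wire, never the key itself.

```python
import time

def xor_crypt(data, key):
    """Toy symmetric transform standing in for real encryption."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def server_challenge():
    """Variable unit of data, such as the time of day."""
    return str(time.time()).encode()

def client_respond(challenge, client_key):
    """The client encrypts the challenge under its copy of the key."""
    return xor_crypt(challenge, client_key)

def server_verify(challenge, response, stored_key):
    """The server decrypts with its stored copy of the client's key;
    a match against the original challenge authenticates the client.
    The key itself is never transmitted."""
    return xor_crypt(response, stored_key) == challenge
```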
The initial variable unit of data seeds the transmission of subsequent data so that the traffic for each client-server session is unique. Further, each byte of data transmitted is influenced by the values of previously sent data. Therefore, the connection is secure across any communication passageway including public networks such as, but not limited to, the Internet. The distance between the client and the server is of no consequence; the connection is typically remote. For accountability purposes, the actions of a client may be recorded (logged) to non-volatile storage at almost any detail level desired.
The access rights of each client (what the client is able to accomplish during a session) are governed by data stored on the Secure Agent® server to which the client is associated. As an example, such rights might encompass the ability to administer and utilize the services of the server system, which would, in turn, include capabilities such as adding new clients or components, changing a user's rights, transferring new code to the server, using a feature (or service) of the server and more.
Consequently, Secure Agent® allows for the transmission of new code to the server and for that code to be implemented upon demand by a client. Such dynamic, real-time implementation in turn, allows for the behavior of the server to be modified. It is to this behavior modification the instant invention addresses its teachings, and thereby advances the contemporary art.
As will be readily appreciated by those skilled in the art, though the instant invention utilizes encryption/decryption and code recognition technology associated with Secure Agent®, alternative technologies may be employed in support of the instant invention without departing from the disclosure, teachings and claims presented herein.
In one non-limiting embodiment, the invention's host information component provides tape catalog and tape mount information from the host processor by way of an emulator component device. The specific device may be any device type best suited for the facilities available to the host information component. Non-limiting examples include 3480, through special commands or sequences; 3286 printer emulation; or 3270 display emulation. Based on a unique communication sequence initiated by the host information component, this particular emulated device is able to recognize that it services the ‘control path’ and reacts accordingly.
The ‘control path’ between the host information component and the remainder of the invention is used to supply all information required from the host such as tapes to be scratched, tapes to be transmitted to vault, tape mount requests and tape retrieval (or recall) requests. The information relating to tape scratches, tape vaulting and tape retrieval is collected periodically by the host information component from the host processor's tape catalog. The information relating to tape mount requests is collected as they occur, either by intercepting an operator message or by otherwise hooking into a host processor's tape mount user exit, a method by which a utility may gain useful information. For a tape to be scratched, vaulted or recalled, the device correspondingly updates the virtual tape catalog. For a tape to be mounted, the device relays the mount request to the emulated tape drive indicated in the request, parsing the request as necessary per the host processor's tape mount request message format. If, for whatever reason, the tape mount cannot be satisfied, a message is sent up through the control path to the host information component in order that an operator message may be issued indicating the reason for being unable to service the request.
Additionally, status information maintained on behalf of the emulated tape device is updated to reflect the current status so that an administrator might be able to review it.