The present invention relates generally to data communications systems and more particularly to mechanisms for improving the efficiency, stability, etc. of systems involved in the execution of concurrent extended copy tasks.
The NCITS T10 SPC-2 (SCSI Primary Commands-2) Extended Copy command provides a method for computer backup applications to delegate actual data movement to third party devices known as copy managers. These copy managers typically reside in microprocessor-based, mass-storage related devices attached to a storage network. For example, a copy manager may reside in a router in a storage area network (SAN).
Copy managers move data from source devices to destination devices as designated by the backup application. The copy manager accepts extended copy commands, interprets them and generates the read and write commands necessary to carry out the extended copy command. Thus, for example, a copy manager in a router may read data from a disk drive into buffers in the router, and then write the data from the buffers to a tape drive. The tape drive is often on a SCSI bus, but it may also be resident on a Fibre Channel, iSCSI, Infiniband or alternative protocol network. The source disk may likewise be resident on a Fibre Channel, SCSI, iSCSI, Infiniband or alternative protocol network. While typical implementations are backup systems which may backup data from disk to tape, or which may restore from tape to disk, the invention is independent of the types of devices between which the data is moved, or the protocols according to which these devices operate.
The standard for the extended copy commands allows copy managers to handle multiple, concurrent extended copy sessions. In other words, a copy manager can orchestrate the movement of data from multiple source devices to multiple destination devices concurrently.
The maximum number of concurrent extended copy sessions that can be supported is constrained by the capabilities of the hardware platform on which the copy manager runs. These capabilities relate to, e.g., microprocessor type and speed, internal data bus speed, and the amount of memory available for buffering the data in transit. Thus, even when the same software application is used to manage the extended copy sessions, the number of concurrent extended copy commands which are supported must be tailored to the capabilities of the underlying hardware. Additionally, the number of concurrent extended copy sessions may be affected by the level of non-extended-copy activity on the platform.
In order to make use of the copy manager, a host device must be made aware of the availability of the copy manager. The copy manager is therefore configured to provide an indication of its availability to the host devices that may access it. This information generally includes an indication of the number of concurrent extended copy tasks that may be executed. The indicated number is based upon the resources of the copy manager's platform, and is normally static. The host devices can then send up to the allowed number of concurrent extended copy commands to the copy manager for execution.
One of the problems with this arrangement is that, while the indication of the number of concurrent extended copy commands that can be handled by the copy manager is static, the actual availability of resources is not. In other words, conventional systems have static allocations of system resources designated for use by tasks that may become active within the system. The extended copy commands may require more or less than the allocated amount of resources. Extended copy commands may be sent to the copy manager by several host devices (which are not aware that they have to share the copy manager with other host devices), and the amount of resources used by non-extended copy commands may vary.
The fact that the copy manager may not have the expected availability of resources may cause problems because the copy manager typically has no means to cope with the situation. The copy manager is typically configured to execute the extended copy commands as they are received and may consequently become overloaded. This may cause instability or failure of the system.
One or more of the problems outlined above may be solved by the various embodiments of the invention. Broadly speaking, the invention comprises systems and methods for improving the performance and reliability of a copy manager by dynamically controlling concurrent extended copy tasks. In one embodiment, the number of concurrent extended copy commands which are allowed to be active within a copy manager is dynamically adjusted based upon the total number of data buffers available to store in-transit data. The total number of data buffers available, together with the number of buffers required to execute a single extended copy command, may be used to constrain the number of concurrent extended copy commands which are allowed to be active in the copy manager. Extended copy commands which are not active are put in a queue for later execution.
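The buffer-based constraint described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that every extended copy command needs the same fixed number of buffers; the function name, parameter names, and figures are hypothetical and do not appear in the disclosure.

```python
def max_active_commands(buffers_available: int, buffers_per_command: int) -> int:
    """Return how many extended copy commands may be active at once.

    Assumption (not from the disclosure): each command needs the same
    number of buffers, so the limit is a simple integer division.
    """
    if buffers_per_command <= 0:
        raise ValueError("buffers_per_command must be positive")
    # A negative availability is treated as zero free buffers.
    return max(buffers_available, 0) // buffers_per_command


# Hypothetical figures: the limit shrinks as other activity consumes buffers.
limit_when_idle = max_active_commands(buffers_available=128, buffers_per_command=16)
limit_when_busy = max_active_commands(buffers_available=40, buffers_per_command=16)
```

Because the limit is recomputed from the current buffer count, it may rise or fall as non-extended-copy activity on the platform changes.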
One embodiment of the invention comprises an improved method for handling extended copy commands. This embodiment is implemented in a device such as a router. The router is initially in an idle state. When an extended copy command is received, the router determines the level of resources that are currently available. If there are sufficient resources to service the received extended copy command, the extended copy command is made active, and the corresponding extended copy tasks are carried out by the router. If the available resources are not sufficient to service the received extended copy command, the extended copy command is placed in a hold queue, where it remains until there are sufficient resources. When an active extended copy command is completed by the router, the resources that were being used to service the completed extended copy command are freed for use by other tasks. The router therefore checks the hold queue to determine whether there are any extended copy commands waiting to become active. If there are more extended copy commands to be executed, the router again checks to see if enough resources are available and, if there are enough resources, the next extended copy command is made active. The number of extended copy commands which are active is constrained by the availability of resources.
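The method just described might be sketched as follows. This is an illustrative model only, not the disclosed implementation: the class and attribute names are hypothetical, resources are reduced to a single buffer count, and the hold queue is assumed to be first-in, first-out.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class ExtendedCopyCommand:
    name: str
    buffers_needed: int  # hypothetical measure of the resources one command uses


class CopyManagerSketch:
    """Toy model of a copy manager that admits commands as resources allow."""

    def __init__(self, total_buffers: int) -> None:
        self.free_buffers = total_buffers
        self.active: dict[str, ExtendedCopyCommand] = {}
        self.hold_queue: deque[ExtendedCopyCommand] = deque()

    def receive(self, cmd: ExtendedCopyCommand) -> bool:
        """Activate the command if resources suffice; otherwise hold it."""
        if cmd.buffers_needed <= self.free_buffers:
            self._activate(cmd)
            return True
        self.hold_queue.append(cmd)
        return False

    def complete(self, name: str) -> None:
        """Free the resources of a finished command, then service the queue."""
        cmd = self.active.pop(name)
        self.free_buffers += cmd.buffers_needed
        # Assumption: held commands are activated in arrival order, stopping
        # at the first one that does not fit in the freed resources.
        while self.hold_queue and self.hold_queue[0].buffers_needed <= self.free_buffers:
            self._activate(self.hold_queue.popleft())

    def _activate(self, cmd: ExtendedCopyCommand) -> None:
        self.free_buffers -= cmd.buffers_needed
        self.active[cmd.name] = cmd


# Hypothetical usage: 10 buffers in total, commands of differing sizes.
manager = CopyManagerSketch(total_buffers=10)
first_active = manager.receive(ExtendedCopyCommand("a", 6))   # fits, becomes active
second_active = manager.receive(ExtendedCopyCommand("b", 6))  # does not fit, is held
manager.complete("a")                                         # frees 6 buffers, "b" activates
```

In this sketch a command that needs more buffers than the platform has in total would remain at the head of the queue indefinitely; a practical implementation would presumably reject such a command on receipt.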
Another embodiment of the invention may comprise a router or other device configured to operate in accordance with the foregoing method. An exemplary prior art system has 64 entries in a queue. Extended copy command entries are activated eight at a time, without regard to the amount of resources used by any one of the commands. As a result, the system may underutilize or overutilize its resources, depending upon the amount of resources needed by the active extended copy commands. An exemplary embodiment of the present system also has 64 entries, but it does not necessarily activate eight entries at a time. It may have more or fewer than eight active extended copy commands, depending upon the amount of resources used by the commands. For example, if there are only two active extended copy commands, but each of these commands uses an extraordinarily large amount of resources, this may be the greatest possible number of active extended copy commands. If, on the other hand, there are eight extended copy commands which are active and each of which uses an unusually small amount of resources, it may be possible to make a number of additional extended copy commands active. This would allow the system to make use of the resources that would otherwise be unused by the first eight active extended copy commands.
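The contrast between fixed-size activation and resource-based activation can be made concrete with a small sketch. The buffer counts below are invented for illustration, and the two functions are hypothetical simplifications rather than the behavior of any actual prior art or embodiment.

```python
def fixed_batch_active(demands: list[int], batch_size: int = 8) -> int:
    """Fixed-batch approach: activate up to batch_size commands, ignoring resources."""
    return min(batch_size, len(demands))


def resource_based_active(demands: list[int], total_buffers: int) -> int:
    """Resource-based approach: activate queued commands in order while they fit.

    demands lists the buffers each queued command needs (hypothetical unit).
    """
    free = total_buffers
    count = 0
    for needed in demands:
        if needed > free:
            break  # the next command must wait for resources to be freed
        free -= needed
        count += 1
    return count


# Hypothetical platform with 64 buffers.
large_commands = [32] * 10   # each command uses an unusually large share
small_commands = [4] * 20    # each command uses an unusually small share

large_fixed = fixed_batch_active(large_commands)            # 8 commands, 256 buffers demanded
large_dynamic = resource_based_active(large_commands, 64)   # only 2 fit
small_fixed = fixed_batch_active(small_commands)            # 8 commands, 32 buffers used
small_dynamic = resource_based_active(small_commands, 64)   # 16 fit
```

With the large commands, the fixed batch of eight would demand four times the available buffers, while the resource-based count stops at two. With the small commands, the fixed batch leaves half the buffers idle, while the resource-based count admits sixteen.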
Another embodiment of the invention comprises a software application. The software application is embodied in a computer-readable medium such as a floppy disk, CD-ROM, DVD-ROM, RAM, ROM, database schemas and the like. The computer readable medium contains instructions which are configured to cause a computer to execute a method which is generally as described above. It should be noted that the computer readable medium may comprise a RAM or other memory which forms part of a computer system. The computer system would thereby be enabled to perform a method in accordance with the present disclosure and is believed to be within the scope of the appended claims.
The present systems and methods may provide a number of advantages over prior art solutions. For example, the dynamic (rather than static) control of extended copy tasks may enable copy managers to handle more tasks if the tasks use fewer resources than might be expected. There is also less risk of the copy manager stalling if more tasks come in than can be concurrently handled. This makes the copy manager more stable and may therefore keep the device in which it is resident in its optimal performance mode. Still other advantages will be apparent to those of skill in the art.
Another embodiment comprises a method in which the maximum number of extended copy commands allowed to be active is limited by performance characteristics of the microprocessor, including, but not limited to, the processor type, clock speed, amount of on-chip cache, manufacturing geometry, and L2 cache.
Numerous additional embodiments are also possible.