The present invention relates to data storage systems and in particular to management of replication (“mirror”) volumes in a data storage system.
A data processing system in an enterprise typically requires a large amount of data storage. Customer data and data generated by users within the enterprise occupy a great portion of this data storage. Any loss or compromise of such data can be catastrophic and severely impact the success of the business. Robust data processing systems provide back-up copies of the data to prevent such loss. To further protect the data, some data processing systems extend the practice of making back-up copies to provide disaster recovery. In disaster recovery systems, a back-up copy of the data is kept at a site remote from the primary storage location (sometimes referred to herein as a “production” storage location, or the like, to reflect the nature of the data, i.e., production data, that is being stored). If a disaster strikes the primary storage location, the data can be recovered from the back-up copies located at the remote site.
A known method of providing disaster protection is to mirror, or shadow, the primary storage data at a remote storage site. Remote dual copy, or remote data duplexing, is one form of this data mirroring solution. In remote dual copy, remote storage devices are provided in the data processing system so that a copy of the primary data is written to a remote storage device. Storage devices are coupled together to form duplex pairs, each duplex pair consisting of a primary storage device and a secondary storage device. When data is written to the production volume (also referred to as the “primary storage device”), the data processing system automatically copies the data to the mirror volume (or “secondary storage device”). The mirror volume contains an exact physical image, or mirror, of the production volume. Typically, the mirror volume has the same physical geometry as the production volume, being configured and formatted in the same way.
It is worth noting that “local” mirroring, in which the mirror volumes are located locally with the primary volume, is also used. Typically, local mirroring is used for backup and recovery, while remote mirroring is used for disaster recovery. Compared with conventional tape backup, local mirroring is much faster but more expensive.
As the data storage capacity of an enterprise grows, storage administrative tasks become more complicated and more critical. Defining (or allocating) a new volume in a storage system is one of the most important storage administration tasks in keeping up with the data demands of the enterprise. As the data storage system grows, so too does the complexity of the data storage manager subsystem that manages the primary (“production”) storage volumes and the backup mirror volumes. However, in a large data storage facility, it can be rather difficult to select a volume to serve as a mirror, because:
The candidate mirror volume must not be in use.
The candidate mirror volume should be selected appropriately, for example:
    (1) The candidate volume should not be on the same physical disks as the production volume being mirrored. If those physical disks fail, both the production volume and the mirror volume would be lost.
    (2) The candidate volume should be on physical disks that have performance and reliability characteristics comparable to those of the physical disks comprising the production volume being mirrored.
    (3) Although (2) is the basic rule and should be applied as a default, the performance and reliability of the candidate volume should be selectable by a user.
    (4) The candidate volume should not be comprised of physical disks that are heavily loaded. For example, if the physical disks contain volumes allocated as production volumes, such disks should not be allocated to serve as a mirror volume.
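The selection rules listed above can be illustrated with a minimal Python sketch. It is not taken from any actual storage manager: the `Volume` record, its `tier` field (standing in for "performance and reliability characteristics"), and the function `candidate_mirrors` are all hypothetical names introduced here for illustration, and "heavily loaded" is approximated, following the example in rule (4), as "the disk also hosts a production volume".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    # All fields are illustrative assumptions, not a real storage API.
    name: str
    disks: frozenset            # identifiers of the physical disks backing the volume
    tier: str                   # hypothetical performance/reliability class
    in_use: bool = False
    is_production: bool = False


def candidate_mirrors(production, volumes, tier=None):
    """Return the volumes that could mirror `production` under rules (1)-(4)."""
    # Rule (2) is the default; rule (3) lets the user override the tier.
    wanted_tier = tier or production.tier

    # Rule (4), approximated: disks that host any production volume count as loaded.
    loaded_disks = set()
    for v in volumes:
        if v.is_production:
            loaded_disks |= v.disks

    result = []
    for v in volumes:
        if v.in_use or v.is_production:
            continue                      # the candidate must not be in use
        if v.disks & production.disks:
            continue                      # rule (1): no shared physical disks
        if v.tier != wanted_tier:
            continue                      # rules (2)/(3): matching characteristics
        if v.disks & loaded_disks:
            continue                      # rule (4): avoid heavily loaded disks
        result.append(v)
    return result
```

A caller would pass the production volume together with the full volume inventory, optionally supplying `tier` to override the default of matching the production volume's own characteristics.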
After a user finds a volume for a mirror, the user has to perform an operation referred to as “creating a mirror,” i.e., to initiate a mirroring operation between the primary volume and the mirror volume. This can be achieved in a command line interface, for example, by entering or otherwise specifying a command like:
    createmirror vol1 vol2
where vol1 is the production volume and vol2 is its mirror volume.
Typing the command above would take time if the user has to create many mirrors. For example, in a real world setting, databases that consume more than several hundred gigabytes of storage are common; image and video databases can consume on the order of terabytes of storage. Typical implementations of such large storage facilities may require many tens to hundreds of production volumes. For example, a configuration with 50 production volumes would require typing in the following 50 commands:
    createmirror vol1 vol2
    createmirror vol3 vol4
    createmirror vol5 vol6
    . . .
    createmirror vol99 vol100
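To make the repetitiveness concrete, the 50 commands can be generated with a few lines of Python. This is only a sketch: `createmirror` is the illustrative command used in this description, and the helper `mirror_commands` is a hypothetical name. The sketch produces the command text only; deciding which volume is a suitable mirror for each production volume is the part that remains difficult.

```python
def mirror_commands(pairs):
    """Build one 'createmirror' command line per (production, mirror) pair."""
    return [f"createmirror {production} {mirror}" for production, mirror in pairs]


# Pair vol1 with vol2, vol3 with vol4, ..., vol99 with vol100 (50 pairs).
pairs = [(f"vol{i}", f"vol{i + 1}") for i in range(1, 100, 2)]
commands = mirror_commands(pairs)

for line in commands:
    print(line)
```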
Applications which have large associated data objects (“application objects,” “data sets,” etc.) can benefit from mirroring techniques. One such class of applications is database applications, where the associated data objects can span multiple volumes in a large database. For example, a data object such as an instance of a database (an Oracle® database, for example) may comprise a multiplicity of data files that can be deployed across many primary volumes. In order to assure data recovery, the primary volumes which collectively store the data object should be mirrored. Since a database application can define many instances of a database, it might be desirable from a system management point of view to be able to mirror only those physical volumes which store a particular instance of a database.
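As a rough sketch of mirroring only the volumes behind one data object, assume a catalog that maps each data file to the primary volume holding it. The catalog contents, the file paths, and the function `volumes_for_object` are hypothetical and serve only to illustrate the idea; a real database would expose this mapping through its own tools.

```python
def volumes_for_object(object_files, file_to_volume):
    """Return the distinct primary volumes that hold the given data files."""
    return sorted({file_to_volume[path] for path in object_files})


# Hypothetical layout: two database instances spread over several volumes.
file_to_volume = {
    "/db/sales/data01.dbf": "vol1",
    "/db/sales/data02.dbf": "vol3",
    "/db/sales/index01.dbf": "vol3",
    "/db/hr/data01.dbf": "vol5",
}

# Select only the files belonging to the "sales" instance.
sales_files = [path for path in file_to_volume if path.startswith("/db/sales/")]
to_mirror = volumes_for_object(sales_files, file_to_volume)
```

Here only the volumes storing the "sales" instance are selected for mirroring, while the volume used solely by the other instance is left alone.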
Another class of applications, sometimes referred to as Logical Volume Managers (LVMs), such as VxVM® produced and sold by Veritas Software Corporation, provides users with a logical view of the underlying, and typically disparate, collection of physical primary volumes. Again, error recovery can be provided by mirroring the primary volumes. Since multiple logical volumes can be defined on the primary volumes, it might be desirable to be able to mirror only those primary volumes which constitute a particular logical volume. Thus, for this type of software, the “data object” would be the logical volume presented to the user.
U.S. Pat. Nos. 5,459,857 and 5,544,347 disclose remote mirroring technology. Two disk systems at separate locations are connected by a remote link. The local disk system copies data on a local disk when pair creation is indicated. When a host updates data on the disk, the local disk system transfers the data to the remote disk system through the remote link. Thus, no host operation is required to maintain a mirror of two volumes.
U.S. Pat. No. 5,933,653 discloses types of data transfer methods between the local and remote disk systems. In a synchronous mode transfer, the local disk system transfers data to the remote disk system before completing a write request from a host. In a semi-sync mode, the local disk system completes a write request and then transfers the write data to the remote disk system. Succeeding write requests are not processed until the previous data transfer has finished. In an adaptive copy mode, data pending transfer to the remote disk system is stored in a memory and transferred to the remote disk system when the local disk system and/or remote links are available for the copy task.
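The difference between the three modes is essentially the order in which a write is completed to the host and transferred to the remote disk system. The toy Python model below records that order. It is a simplified sketch under stated assumptions, not the implementation disclosed in the cited patent: the class `LocalDiskSystem`, its methods, and the mode names are hypothetical, and `link_idle()` simply stands for "the link has become available and all pending data is sent".

```python
from collections import deque


class LocalDiskSystem:
    """Toy model of the three transfer modes; records the order of events."""

    def __init__(self, mode):
        if mode not in ("sync", "semi-sync", "adaptive"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.pending = deque()   # write data not yet sent to the remote system
        self.remote = []         # data that has reached the remote disk system
        self.log = []            # ordered trace of what happened

    def _transfer_one(self):
        data = self.pending.popleft()
        self.remote.append(data)
        self.log.append(f"transfer {data}")

    def write(self, data):
        if self.mode == "semi-sync" and self.pending:
            # The previous transfer must finish before this write is processed.
            self._transfer_one()
        self.pending.append(data)
        if self.mode == "sync":
            # The remote copy is made before the write request is completed.
            self._transfer_one()
        self.log.append(f"complete {data}")

    def link_idle(self):
        """Send all pending data once the link is available."""
        while self.pending:
            self._transfer_one()


def trace(mode):
    system = LocalDiskSystem(mode)
    system.write("A")
    system.write("B")
    system.link_idle()
    return system.log
```

Calling `trace` with each mode shows the ordering: in the synchronous mode every transfer precedes its completion, in the semi-sync mode each completion is followed by its transfer before the next write proceeds, and in the adaptive copy mode both writes complete before any data is transferred.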