The present invention relates to data storage systems and in particular to management of replication (“mirror”) volumes in a data storage system.
A data processing system in an enterprise typically requires a large amount of data storage. Customer data and data generated by users within the enterprise occupy a great portion of this data storage. Any loss or compromise of such data can be catastrophic and can severely impact the success of the business. Robust data processing systems provide back-up copies of the data to prevent such loss. To further protect the data, some data processing systems extend the practice of making back-up copies to provide disaster recovery. In disaster recovery systems, a back-up copy of the data is kept at a site remote from the primary storage location (sometimes referred to herein as a “production” storage location, or the like, to reflect the nature of the data, i.e., production data, that is being stored). If a disaster strikes the primary storage location, the data can be recovered from the back-up copies located at the remote site.
A known method of providing disaster protection is to mirror, or shadow, the primary storage data at a remote storage site. Remote dual copy, or remote data duplexing, is one form of this data mirroring solution. In remote dual copy, remote storage devices are provided in the data processing system so that a copy of the primary data is written to a remote storage device. Storage devices are coupled together to form duplex pairs, each duplex pair consisting of a primary storage device and a secondary storage device. When data is written to the production volume (also referred to as the “primary storage device”), the data processing system automatically copies the data to the mirror volume (or “secondary storage device”). The mirror volume contains an exact physical image, or mirror, of the production volume. Typically, the mirror volume has the same physical geometry as the production volume, being configured and formatted in the same way.
It is worth noting that “local” mirroring is also used for backup and recovery purposes, where the mirror volumes are located at the same site as the primary volume. Typically, local mirroring is used for backup and recovery, while remote mirroring is used for disaster recovery. Compared with conventional tape backup, local mirroring is much faster but more expensive.
As the data storage capacity of an enterprise grows, storage administrative tasks become more complicated and more critical. Defining (or allocating) a new volume in a storage system is one of the most important tasks for storage administration in order to keep up with the data demands of the enterprise. As the data storage system grows, so too does the complexity of the data storage manager subsystem for managing the primary (“production”) storage volumes and the backup mirror volumes. However, in a large data storage facility, it can be rather difficult to select a volume to serve as a mirror, because:
The candidate mirror volume must not be in use.
The candidate mirror volume should be selected appropriately, for example:
(1) The candidate volume should not be on the same physical disks as the production volume being mirrored. If those physical disks fail, then both the production volume and the mirror volume would be lost.
(2) The candidate volume should be on physical disks that have performance and reliability characteristics comparable to those of the physical disks comprising the production volume being mirrored.
(3) Although (2) is the basic rule and should be applied as a default, the performance and reliability of the candidate volume should be selectable by a user.
(4) The candidate volume should not be comprised of physical disks that are heavily loaded. For example, physical disks that already contain volumes allocated as production volumes should not be allocated to serve as a mirror volume.
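The selection criteria listed above can be illustrated with a minimal sketch. It is offered only as an illustration under stated assumptions: the `Volume` record, its `tier` label (standing in for performance and reliability characteristics), and the function `candidate_mirrors` are hypothetical names that do not appear in any actual storage product, and "heavily loaded" is approximated as "already hosting a production volume," following the example given under (4).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    """Hypothetical volume record used only for this illustration."""
    name: str
    disks: frozenset          # identifiers of the physical disks backing the volume
    tier: str                 # assumed label for performance/reliability class
    in_use: bool = False
    is_production: bool = False


def candidate_mirrors(production, volumes, tier=None):
    """Return the volumes that satisfy the listed criteria for `production`.

    `tier` lets a user override the default of matching the production
    volume's own tier (criterion 3).
    """
    wanted_tier = tier if tier is not None else production.tier

    # Assumption: a disk counts as heavily loaded if it hosts a production volume.
    busy_disks = set()
    for volume in volumes:
        if volume.is_production:
            busy_disks |= volume.disks

    result = []
    for volume in volumes:
        if volume.in_use or volume.is_production:
            continue                      # candidate must not be in use
        if volume.disks & production.disks:
            continue                      # criterion (1): no shared physical disks
        if volume.tier != wanted_tier:
            continue                      # criteria (2) and (3): comparable or chosen tier
        if volume.disks & busy_disks:
            continue                      # criterion (4): avoid heavily loaded disks
        result.append(volume)
    return result
```

Even in this simplified form, each candidate must be checked against every criterion, which suggests why manual selection becomes burdensome as the number of volumes grows.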
After a user finds a volume for a mirror, the user has to perform an operation referred to as “creating a mirror,” i.e., to initiate a mirroring operation between the primary volume and the mirror volume. This can be achieved in a command line interface, for example, by entering or otherwise specifying a command like:
createmirror vol1 vol2
where vol1 is the production volume and vol2 is its mirror volume.
Typing such commands is time-consuming if the user has to create many mirrors. For example, in a real world setting, databases that consume more than several hundred gigabytes of storage are common; image and video databases can consume on the order of terabytes of storage. Typical implementations of such large storage facilities may require many tens to hundreds of production volumes. For example, a configuration of 50 production volumes would require typing in the following 50 commands:
createmirror vol1 vol2
createmirror vol3 vol4
createmirror vol5 vol6
...
createmirror vol99 vol100
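By way of illustration only, the repetitive command sequence above could be generated rather than typed. The following minimal sketch assumes the numbering convention of the example, in which each odd-numbered volume is a production volume and the next even-numbered volume is its mirror; the function name `mirror_commands` is hypothetical. The sketch only builds the command strings and does not invoke any command line interface.

```python
def mirror_commands(pair_count):
    """Return one createmirror command string per production/mirror pair.

    Assumes the example's numbering: vol1 is mirrored to vol2,
    vol3 to vol4, and so on.
    """
    commands = []
    for i in range(pair_count):
        production = f"vol{2 * i + 1}"
        mirror = f"vol{2 * i + 2}"
        commands.append(f"createmirror {production} {mirror}")
    return commands
```

Calling `mirror_commands(50)` yields the 50 commands of the example, from `createmirror vol1 vol2` through `createmirror vol99 vol100`. Such a script removes the typing burden but still presumes that suitable mirror volumes have already been identified.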
Applications which have large associated data objects (“application objects,” “data sets,” etc.) can benefit from mirroring techniques. One such class of applications is database applications, where the associated data objects can span multiple volumes in a large database. For example, a data object such as an instance of a database (an Oracle® database, for example) may comprise a multiplicity of data files that can be deployed across many primary volumes. In order to assure data recovery, the primary volumes which collectively store the data object should be mirrored. Since a database application can define many instances of a database, it might be desirable from a system management point of view to be able to mirror only those physical volumes which store a particular instance of a database.
Another class of applications, sometimes referred to as Logical Volume Managers (LVMs), such as VxVM® produced and sold by Veritas Software Corporation, provides users with a logical view of the underlying and typically disparate collection of physical primary volumes. Again, error recovery can be provided by mirroring the primary volumes. Since multiple logical volumes can be defined on the primary volumes, it might be desirable to be able to mirror only those primary volumes which constitute a particular logical volume. Thus for this type of software, the “data object” would be the logical volume presented to the user.
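The notion of mirroring only those primary volumes that store a particular data object can be sketched as a simple lookup. The sketch below is illustrative only and assumes that two mappings are somehow available: one from a data object (a database instance or a logical volume) to its constituent files or extents, and one from each of those to the primary volumes that hold it. The function name and both mappings are hypothetical and are not taken from any database or volume manager interface.

```python
def volumes_for_object(object_name, object_parts, part_volumes):
    """Return the sorted primary volumes that collectively store a data object.

    object_parts: dict mapping a data object name to its files or extents
                  (hypothetical mapping, assumed to be known).
    part_volumes: dict mapping each file or extent to the list of primary
                  volumes holding it (hypothetical mapping, assumed to be known).
    """
    volumes = set()
    for part in object_parts[object_name]:
        volumes.update(part_volumes[part])
    return sorted(volumes)
```

For instance, if a hypothetical object "db_a" comprises two files stored on vol1 and on vol1 and vol3 respectively, the function returns vol1 and vol3, and volumes belonging only to other objects are left out of the mirroring operation.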
U.S. Pat. Nos. 5,459,857 and 5,544,347 disclose remote mirroring technology. Two disk systems at separate locations are connected by a remote link. The local disk system copies the data on a local disk to the remote disk system when pair creation is indicated. When a host updates data on the disk, the local disk system transfers the data to the remote disk system through the remote link. Thus no host operation is required to maintain a mirror of two volumes.
U.S. Pat. No. 5,933,653 discloses types of data transferring methods between the local and remote disk systems. In a synchronous mode transfer, the local disk system transfers data to the remote disk system before completing a write request from a host. In a semi-sync mode, the local disk system completes a write request and then transfers the write data to the remote disk system. Succeeding write requests are not processed until the previous data transfer has finished. In an adaptive copy mode, pending data for the remote disk system is stored in a memory and transferred to the remote disk system when the local disk system and/or remote links are available for the copy task.
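The difference among the three transfer modes described above lies in when the write data reaches the remote disk system relative to completion of the host's write request. The following toy model is a sketch under simplifying assumptions and is not the implementation of the cited patent: the class name, the mode strings, and the `drain` method are hypothetical, the remote disk system is represented by an in-memory list, and a transfer is treated as instantaneous once it is started.

```python
from collections import deque


class LocalDiskSystem:
    """Toy model of a local disk system paired with a remote disk system."""

    MODES = ("sync", "semi-sync", "adaptive")   # hypothetical mode labels

    def __init__(self, mode):
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.local = []          # data written to the local disk
        self.remote = []         # stands in for the remote disk system
        self.pending = deque()   # write data not yet transferred

    def write(self, data):
        """Process one host write request and report its completion."""
        if self.mode == "semi-sync":
            # A succeeding write is not processed until the previous
            # transfer has finished.
            self.drain()
        self.local.append(data)
        if self.mode == "sync":
            # Transfer to the remote system before completing the write.
            self.remote.append(data)
        else:
            # Complete the write first; the transfer happens afterwards.
            self.pending.append(data)
        return "complete"

    def drain(self):
        """Transfer all pending data, e.g. when the remote link is available."""
        while self.pending:
            self.remote.append(self.pending.popleft())
```

In this model, a synchronous write leaves nothing pending, a semi-sync system never has more than one write outstanding, and an adaptive copy system accumulates pending data until `drain` is called.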
In a remote copy architecture, additional factors must be considered when performing mirror site selection, which can further burden the system administrator. For example, in order to create a mirror to a remote volume, there must be a connection between the local and remote storage systems. For disaster recovery purposes, the two storage systems should be separated by a sufficient distance to permit recovery if a disaster should befall the primary volume. On the other hand, the remote copy performance must be balanced against the desired level of reliability during recovery. When a fail-over occurs, the remote storage system which serves as the standby storage system must be accessible by the servers at the failed site. In addition, a suitable operating system must be available in the standby system, the proper suite of application programs must be installed, and so on.
It can be appreciated that there is a need to effectively manage large and widely distributed storage facilities. There is a need to facilitate the system administrator's storage management tasks.