The present disclosure relates generally to multi-target environments and, more specifically, to storage site selection in a multi-target environment using weights.
In general, contemporary multi-target systems provide a continuous availability solution for disks using selection and exchange operations. Selection and exchange operations are designed to eliminate disk failures as a source of application outages by allowing customers to specify a set of storage volumes to be synchronously mirrored. That is, in the event of a permanent input/output (I/O) error on a primary disk, I/O requests are automatically switched to a secondary disk, thereby masking the failure from an application (or system) and eliminating the need to restart the application after the failure. Selection and exchange operations can also be used to initiate a planned swap to the secondary disk so that required maintenance can be performed on the primary disk.
Contemporary designs of selection and exchange operations support swapping between only two storage sites. Thus, if an unplanned selection and exchange trigger occurs due to a permanent I/O error on a primary storage site, applications start using the secondary storage site. Depending on the cause of the I/O error on the primary storage site, delays can be experienced while identifying the cause of the selection and exchange, rectifying the problem, and/or re-establishing a synchronous mirroring relationship between the primary and secondary storage sites. During this time, the customer is exposed to a subsequent disk failure. Some customers further protect their data by creating a three-site solution. In this case, they synchronously mirror from site 1 to site 2, and then asynchronously mirror from site 2 to site 3. However, site 3 cannot receive an exchange because replication to site 3 is asynchronous, and site 3 is therefore not an identical copy of site 1 or site 2. In addition, asynchronous replication exposes the customer to the possibility that some data updates will be lost.
Contemporary multi-target storage replication (contemporary replication) includes maintaining selection and exchange capability after an exchange. For example, the contemporary replication can include the ability to designate a preferred site to switch over to in the event of an unplanned selection and exchange, and the ability to move to the third site in the case that the preferred site is not viable for a selection and exchange. In general, a target site is considered to be viable only if all members of a contemporary sysplex remain capable of accessing all of the target volumes. If even one system in the contemporary sysplex is unable to access even a single volume, the target site is considered non-viable. In this case, selection and exchange will attempt to switch to the less preferred (tertiary) site, if switching to that site will result in all systems in the contemporary sysplex maintaining access to all volumes. Further, contemporary replication systems fail to detail how to select a site to switch to in the case that none of the two (or more) potential targets provides an environment where all volumes are currently accessible by all systems in the contemporary sysplex. In this case, the contemporary replication, based on customer policy, either aborts the selection and exchange or attempts it using the preferred target. As a result, at least one system in the contemporary sysplex is likely to fail or at least have some of its applications fail. The contemporary replication takes only the relative preference of the target storage site into account (e.g., the site that is geographically closest to the servers). It does not take into account the importance of the workload running on the servers.
For example, when swapping to either storage site will likely result in the failure of some server, the contemporary sysplex will switch to the preferred storage target. The contemporary sysplex does not consider that one storage site may allow a server running the most business-critical work to continue, while only a server running discretionary work would fail.
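The contemporary selection behavior described above can be illustrated with a minimal Python sketch. It is offered only as an illustration under stated assumptions and not as the implementation of any actual product: the names `TargetSite`, `is_viable`, `select_target`, and the `abort_if_none_viable` policy flag are hypothetical and do not appear in the disclosure. The sketch shows that the choice depends only on site preference order and viability, and never on the importance of the workload running on any server.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class TargetSite:
    """Hypothetical model of a candidate target storage site."""
    name: str
    volumes: Set[str]
    # Maps each system name to the volumes on this site it can currently access.
    accessible: Dict[str, Set[str]]


def is_viable(site: TargetSite, systems: List[str]) -> bool:
    """A site is viable only if every system can access every target volume."""
    return all(site.volumes <= site.accessible.get(s, set()) for s in systems)


def select_target(
    sites_by_preference: List[TargetSite],
    systems: List[str],
    abort_if_none_viable: bool,
) -> Optional[TargetSite]:
    """Return the most preferred viable site.

    If no site is viable, follow the (assumed) customer policy: either
    abort (return None) or attempt the exchange with the preferred site.
    Workload importance is never consulted.
    """
    for site in sites_by_preference:
        if is_viable(site, systems):
            return site
    if abort_if_none_viable:
        return None
    return sites_by_preference[0]
```

In this sketch, when no site is viable and the policy does not abort, the preferred site is returned even if the only system that loses access under that choice is the one running the most business-critical work.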