An Active/Active continuous availability environment is a network of independent processing nodes, each of which has access to a replicated database, so that every node can access and use a single application. In such an environment, all requests are load-balanced across all available processing capacity, and mission-critical applications, known as ‘workloads’, may be deployed at the same time on two or more sites separated by great distances. A site may be one or more computer systems, typically in a cluster configuration, within a physical region; the scope of the region may be a building or a geographic location (e.g., a city).
The granularity of processing for Active/Active continuous availability is the workload. Workloads are synchronized across sites using replication products in an asynchronous, loosely coupled fashion. Inbound On-Line Transaction Processing (OLTP) work is routed to a particular instance of a workload at one of the sites, and monitoring software both monitors workload health and detects workload outages. Lastly, user policies are prescribed on a per-workload basis. A workload is defined as the set of applications that perform updates against a given set of data objects, using the same network addressability objects. A data object is a logically related set of records, or rows, within a given logical or physical construct. Examples of data objects include files, tables, and database instances.
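As an illustration only, the workload definition above can be sketched as a small data model. The class names, field names, and the `route` helper below are hypothetical and are not taken from any replication product; the routing rule (pick the first site reported healthy) is an assumption chosen for brevity.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataObject:
    """A logically related set of records (hypothetical model)."""
    name: str
    kind: str  # e.g. "file", "table", or "database"


@dataclass
class Workload:
    """Applications that update a given set of data objects through
    the same network addressability objects (hypothetical model)."""
    name: str
    applications: set[str]
    data_objects: set[DataObject]
    network_addresses: set[str]
    # Health of the workload instance at each site, as reported by
    # monitoring software (assumed here to be a simple boolean).
    site_health: dict[str, bool] = field(default_factory=dict)

    def route(self) -> str:
        """Return the first site whose workload instance is healthy."""
        for site, healthy in self.site_health.items():
            if healthy:
                return site
        raise RuntimeError(f"no healthy instance of workload {self.name!r}")


payments = Workload(
    name="payments",
    applications={"billing_app"},
    data_objects={DataObject("INVOICES", "table")},
    network_addresses={"payments.example.internal:5000"},
    site_health={"site_a": False, "site_b": True},
)
chosen_site = payments.route()
```

Keeping health and routing on the workload object mirrors the text's point that monitoring, routing, and policy are all applied per workload rather than per site or per application.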
One of the tasks in preparing for an Active/Active continuous availability deployment is identifying the workloads. In particular, the task is to identify the set of data objects (e.g., data sets, files, tables, databases, etc.) that belong to each workload and, as such, constitute a ‘consistency group’. The term ‘consistency group’ refers to the assumed requirement that updates to all objects within the group are executed such that the target copy is an exact replica of the source copy at a given point in time.
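One way to picture this identification task, offered as a minimal sketch and not as the method of any particular product, is to assume that a map from each application to the data objects it updates is already available. Objects updated by the same application, directly or through a chain of shared objects, are then merged into one candidate consistency group using a union-find structure. The function name `consistency_groups` and the sample application and object names are hypothetical.

```python
from collections import defaultdict


def consistency_groups(updates: dict[str, set[str]]) -> list[set[str]]:
    """Group data objects that are transitively linked by a common updater.

    `updates` maps an application name to the data objects it updates
    (an assumed input; obtaining it is the hard part in practice).
    """
    parent: dict[str, str] = {}

    def find(obj: str) -> str:
        # Locate the group representative, halving paths along the way.
        parent.setdefault(obj, obj)
        while parent[obj] != obj:
            parent[obj] = parent[parent[obj]]
            obj = parent[obj]
        return obj

    def union(a: str, b: str) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for objects in updates.values():
        ordered = sorted(objects)
        for obj in ordered:
            find(obj)  # register every object, even a lone one
        for other in ordered[1:]:
            union(ordered[0], other)

    groups: dict[str, set[str]] = defaultdict(set)
    for obj in list(parent):
        groups[find(obj)].add(obj)
    # Sort groups by their sorted members for a deterministic order.
    return sorted(groups.values(), key=sorted)


example_updates = {
    "billing_app": {"ACCOUNTS", "INVOICES"},
    "ledger_app": {"INVOICES", "LEDGER"},
    "hr_app": {"EMPLOYEES"},
}
example_groups = consistency_groups(example_updates)
```

In this sample, `billing_app` and `ledger_app` both update `INVOICES`, so their objects fall into one group, while `EMPLOYEES` forms a group of its own. The sketch ignores read-only access and referential constraints, either of which could enlarge a real consistency group.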
A difficulty in defining the consistency group (and, by extension, a workload) is that large enterprises struggle to maintain accurate descriptions of their applications' usage and access patterns. As a result, it is common for customers to default to defining a single workload that encompasses all of the data objects in the enterprise.
The problem of workload discovery is not limited to Active/Active continuous availability environments; it arises in most, if not all, software replication scenarios. Defining the smallest set of related objects for the purpose of point-in-time consistency is a need common to heterogeneous replication, homogeneous replication, and many event-publishing replication use cases.