Computer systems may include different resources used by one or more host processors. Resources and host processors in a computer system may be interconnected by one or more communication connections. These resources may include, for example, data storage devices such as those included in the data storage systems manufactured by Dell EMC®. These data storage systems may be coupled to one or more servers or host processors and provide storage services to each host processor. Multiple data storage systems from one or more different vendors may be connected and may provide common data storage for one or more host processors in a computer system.
A host processor may perform a variety of data processing tasks and operations using a data storage system. For example, a host processor may perform basic system input/output (I/O) operations in connection with data requests, such as data read and write operations.
Host systems may store and retrieve data using a storage device containing a plurality of host interface units, disk drives, and disk interface units. The host systems access the storage device through a plurality of channels provided therewith. Host systems provide data and access control information through the channels to the storage device, and the storage device provides data to the host systems through the same channels. The host systems do not address the disk drives of the storage device directly, but rather access what appears to the host systems as a plurality of logical disk units. The logical disk units may or may not correspond to the actual disk drives. Allowing multiple host systems to access a single storage device allows the host systems to share data stored in the device. In order to facilitate sharing of the data on the device, additional software on the data storage systems may also be used.
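The indirection between logical disk units and actual disk drives may be illustrated with a minimal sketch. The names `Extent`, `LogicalUnit`, and `resolve` are hypothetical and are not taken from any particular data storage system; the sketch simply assumes that a logical unit is a concatenation of block ranges on one or more physical drives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """Hypothetical contiguous block range on one physical drive."""
    drive_id: int
    start_block: int
    num_blocks: int


class LogicalUnit:
    """Hypothetical logical disk unit built from an ordered list of extents."""

    def __init__(self, extents):
        self.extents = list(extents)

    def resolve(self, logical_block):
        """Translate a logical block number to (drive_id, physical_block)."""
        if logical_block < 0:
            raise IndexError("logical block out of range")
        offset = logical_block
        for ext in self.extents:
            if offset < ext.num_blocks:
                return ext.drive_id, ext.start_block + offset
            offset -= ext.num_blocks
        raise IndexError("logical block out of range")


# One logical unit spanning parts of two drives (illustrative values only).
lun = LogicalUnit([Extent(drive_id=0, start_block=100, num_blocks=50),
                   Extent(drive_id=2, start_block=0, num_blocks=50)])
```

In this sketch a host would address only logical block numbers, and the translation to a drive and physical block would remain internal to the storage device.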
Data storage systems, hosts, and other components may be interconnected by one or more communication connections such as in a network configuration. The network may support transmissions in accordance with well-known protocols such as TCP/IP (Transmission Control Protocol/Internet Protocol), UDP (User Datagram Protocol), and the like. Networked storage systems, such as data storage arrays, may be used to maintain data on different systems in different locations. For example, in some implementations, a local or source data site may be configured in a partner relationship with a remote or destination data site. The remote data site includes a mirror or copy of the data in the local data site. Such mirroring may be used for a variety of reasons including reducing the likelihood of data loss. Mirroring is a form of replication, in which data on a first storage device is replicated on a second storage device.
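By way of illustration only, mirroring between a local data site and a remote data site may be sketched as follows. The classes `Site` and `MirroredPair` are hypothetical, each site is modeled as an in-memory dictionary rather than a storage array, and the sketch assumes every write is applied to both sites before returning; the description above does not state whether mirroring is synchronous or asynchronous.

```python
class Site:
    """Hypothetical data site modeled as an in-memory block store."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write(self, address, data):
        self.blocks[address] = data

    def read(self, address):
        return self.blocks[address]


class MirroredPair:
    """Hypothetical partner relationship between a local and a remote site."""

    def __init__(self, local, remote):
        self.local = local
        self.remote = remote

    def write(self, address, data):
        # Assumption: the write is applied to both sites before returning.
        self.local.write(address, data)
        self.remote.write(address, data)


local_site = Site("local")
remote_site = Site("remote")
pair = MirroredPair(local_site, remote_site)
pair.write(0, b"payload")
# The remote site now holds a copy of the data written at the local site.
```

If the local site became unavailable in this sketch, the same data could still be read from the remote site, which is the sense in which mirroring reduces the likelihood of data loss.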
The time it takes to perform data replication depends in part on the time it takes to transmit the data being replicated between the local data site and the remote data site. That transmission time depends, in turn, in part on the size of the data. Thus, it may be desirable to reduce the size of the data being replicated (without losing any data) to reduce data replication times.
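The relationship between data size and transmission time may be illustrated with a minimal sketch. The sketch assumes a simple model in which transmission time equals a fixed latency plus size divided by bandwidth, and it uses the general-purpose lossless compressor in Python's standard `zlib` module as a stand-in for any size-reduction technique; the function names, the sample data, and the link parameters are hypothetical.

```python
import zlib


def transfer_seconds(num_bytes, bandwidth_bytes_per_s, latency_s=0.0):
    """Estimate transmission time under a simple latency + size/bandwidth model."""
    return latency_s + num_bytes / bandwidth_bytes_per_s


def prepare_for_replication(data):
    """Losslessly reduce the size of the data before it is transmitted."""
    return zlib.compress(data, 6)


def restore_at_remote(payload):
    """Recover the original data at the remote data site."""
    return zlib.decompress(payload)


# Highly repetitive sample data (illustrative only); real data may compress less.
block = b"ABCD" * 4096
payload = prepare_for_replication(block)

# Lossless: the remote site recovers exactly the bytes held at the local site.
assert restore_at_remote(payload) == block

# Hypothetical link: 1 MB/s bandwidth and 10 ms latency.
uncompressed_time = transfer_seconds(len(block), 1_000_000, 0.010)
compressed_time = transfer_seconds(len(payload), 1_000_000, 0.010)
```

Under this model, any reduction in the number of bytes transmitted reduces the estimated transmission time proportionally, apart from the fixed latency term. The sketch ignores the time spent compressing and decompressing, which a practical system would also need to weigh.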