Storing and safeguarding electronic data is of paramount importance in modern business. Various systems have been employed to protect such electronic data. Data storage systems can fall into a plurality of categories, such as Network Attached Storage (NAS) and Storage Area Networks (SAN). A NAS system can be a stand-alone, network-accessible storage device that can provide file-level access to electronic data. A SAN array can be a dedicated storage system that can connect numerous storage resources to one or more hosts. A SAN can provide block-level access to electronic data through one or more SCSI-based protocols (e.g., Fibre Channel or iSCSI), which can be used by a connected host to provide a file system.
Data storage systems can be employed that contain multiple data storage devices. Data storage systems can provide some level of redundancy by use of mirrored or redundant components (e.g., storage devices, disk drives, disk controllers, power supplies and/or fans), each of which can be hot-swappable to avoid downtime.
One approach has been the development of data storage systems that behave as if they were a single storage device, but are in reality multiple storage devices that collectively operate together. From the perspective of a host that issues commands to read, write, copy, and/or allocate data storage, these devices appear as a single data storage entity. Examples of such data storage devices include disk storage arrays and network storage arrays.
A collection of data storage devices acting in concert can be configured to operate in a way that provides redundancy. For example, multiple disks can be organized into redundant array of inexpensive disks (RAID) groups. RAID groups can provide mirroring or other forms of duplication wherein data written to one disk is also written to another disk as a backup copy. RAID groups can also distribute data across many disks so that if one of the disks fails, there is enough data left on the other disks to reconstruct the missing data. RAID groups can perform a combination of striping and mirroring. Similarly, data storage devices can provide mirroring wherein data written to one data storage device is also written to another data storage device as a backup copy.
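By way of illustration only, the reconstruction of missing data from the remaining disks can be sketched with byte-wise XOR parity, one common technique used by parity-based RAID levels. The text above does not specify a parity scheme, so the choice of XOR, the helper name `xor_blocks`, and the sample disk contents are assumptions made for this sketch.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical contents of three data disks (one stripe each).
data_disks = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is the XOR of all data blocks in the stripe.
parity = xor_blocks(data_disks)

# Simulate the loss of disk 1: XOR the surviving blocks with the parity.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = xor_blocks(survivors)
```

Because XOR is its own inverse, combining the surviving data blocks with the parity block yields the contents of the failed disk.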
A common problem that arises in data storage systems that perform mirroring or other forms of backup or duplication is the delay incurred when transferring data between mirrored data storage devices. Mirroring systems typically have at least two data storage devices, e.g., two disks, two RAID groups, etc., one to store the data and the other to store the backup copy of the data, commonly referred to as the primary and secondary data storage devices, respectively. In order to provide a backup copy of data, any data that is stored on the primary data storage device must also be stored on the secondary data storage device.
In some applications, data can be copied from the primary data storage device to another location on the primary data storage device. Conventional approaches to mirroring perform the copy on the primary data storage device. If the destination of the copied data is mirrored by the secondary data storage device, the copied data is sent to the secondary data storage device. Sending all of the data to be mirrored from the primary data storage device to the secondary data storage device can involve significant transmission times for storage devices joined by slow network connections or separated by large distances. Accordingly, there exists a need for methods, computer readable media, and systems for reducing the amount of data transmitted between mirrored data storage devices.
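The conventional approach described above can be sketched as a toy model that counts the bytes sent over the link between the two devices. The class names `Device` and `MirroredPair`, the `bytes_sent` counter, and the assumption that the entire primary device is mirrored are hypothetical and introduced only for illustration. The sketch does not represent any particular product or protocol.

```python
class Device:
    """A toy storage device backed by an in-memory byte array."""
    def __init__(self, size):
        self.blocks = bytearray(size)

class MirroredPair:
    """Toy model of conventional mirroring (assumes the whole device is mirrored)."""
    def __init__(self, size):
        self.primary = Device(size)
        self.secondary = Device(size)
        self.bytes_sent = 0  # bytes transmitted from primary to secondary

    def _send(self, offset, data):
        # Model the link: every mirrored byte is counted as transmitted.
        self.bytes_sent += len(data)
        self.secondary.blocks[offset:offset + len(data)] = data

    def write(self, offset, data):
        # A host write is stored on the primary and then mirrored.
        self.primary.blocks[offset:offset + len(data)] = data
        self._send(offset, data)

    def copy(self, src, dst, length):
        # Conventional approach: perform the copy on the primary,
        # then send the copied data itself to the secondary.
        data = bytes(self.primary.blocks[src:src + length])
        self.primary.blocks[dst:dst + length] = data
        self._send(dst, data)

pair = MirroredPair(16)
pair.write(0, b"abcd")
sent_before_copy = pair.bytes_sent
pair.copy(src=0, dst=8, length=4)
copy_bytes_sent = pair.bytes_sent - sent_before_copy
```

In this model the link carries one byte for every byte copied, so the transmission cost of a copy grows with the size of the copied region.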