In our modern communication age, commercial enterprises, consumers, and other entities are storing an ever-increasing amount of digitized data. For example, many commercial entities are in the process of digitizing their business records and/or other data. Similarly, web-based service providers generally engage in transactions that are primarily digital in nature. Thus, techniques and mechanisms that facilitate efficient and cost-effective storage of vast amounts of digital data are being implemented.
In some modern data storage systems, data is formatted or organized as a file so that it can be stored within file systems and/or volumes. Since the data is digital or “digitized” (e.g., stored as bits of either 0 or 1), one or more (backup) copies of the data can be made relatively easily. When a copy of a data file is made (e.g., where the data is mirrored or replicated), the original file is at times referred to as the parent or source, while the copy may be referred to as the child or destination, where the child is a lossless (e.g., bit-for-bit) snapshot or “mirror” of the source data taken at a particular point in time. Similarly, in such a scenario, the original file may be regarded as residing on or being stored within a parent or source volume, while the copy may be regarded as residing on or being stored within a child or destination volume, where a volume generally corresponds to an amount of memory allocated for storing the data file.
It can be appreciated that in some situations it may be advantageous and/or otherwise desirable to maintain a copy of the data as it appeared at some point in time (e.g., as depicted in the source volume), while also being able to perform testing and/or other operations upon the data as it appeared at that same point in time (e.g., as depicted in a destination volume). Such testing or other operations may occur at the same and/or one or more later points in time, but do not affect the original or parent data. To facilitate availability of such data in the event of hardware, software, or even site failures (e.g., power outages, sabotage, natural disasters, etc.), entities have developed clustered networks that carry out mirroring techniques for data. These mirroring techniques can be a key component in data protection strategies for business entities.
In one example of a mirroring technique, data on a source volume is copied, or “mirrored,” to one or more destination volumes, thereby creating a mirror of the data across the volumes. In this example, if there is a data access failure (e.g., data is inaccessible due to data loss, or some hardware/software failure) at the source volume, the data can be retrieved from the destination volume, and vice versa.
More particularly, for example, a clustered network, such as may be used by an enterprise for data storage and management, may comprise a source storage system located remotely from a destination storage system (e.g., the source system is located in Los Angeles and the destination system is located in New York). In this example, a client (e.g., which may be located anywhere) attached by a network to the cluster can store data to and retrieve data from the source volume (e.g., through a host, such as a server, in the source storage system). The source storage system can provide for data in the source volume to be replicated (mirrored) on the destination volume, which resides remotely from the source storage system. Periodically, the data on the destination volume can be updated to reflect the data on the source volume, such that the source volume and destination volume store the same data (e.g., mirrored data), albeit at different locations.
Further, in this example, if a natural disaster occurs (e.g., an earthquake in Los Angeles) that causes a hardware failure at the source storage system (e.g., power loss), a data access failure can occur at the source volume. The client can subsequently request access to the destination volume through a host in the destination storage system (e.g., in New York). Because the destination volume stores the same data as the source volume, the client can reliably retrieve the mirrored data even though access to the source volume has failed.
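The mirroring and failover behavior in this example can be illustrated with a minimal sketch. It is not an implementation of any particular storage system: the names `Volume`, `VolumeUnavailable`, `mirror_update`, and `read_with_failover` are hypothetical, an in-memory dictionary stands in for the stored data, and the `online` flag is an assumed stand-in for a site-level failure.

```python
from dataclasses import dataclass, field


class VolumeUnavailable(Exception):
    """Raised when a volume cannot be reached (e.g., power loss at its site)."""


@dataclass
class Volume:
    # Hypothetical in-memory stand-in for a storage volume.
    name: str
    blocks: dict = field(default_factory=dict)
    online: bool = True

    def read(self, key):
        if not self.online:
            raise VolumeUnavailable(self.name)
        return self.blocks[key]

    def write(self, key, data):
        if not self.online:
            raise VolumeUnavailable(self.name)
        self.blocks[key] = data


def mirror_update(source, destination):
    """Periodic update: make the destination hold the same data as the source."""
    destination.blocks = dict(source.blocks)


def read_with_failover(key, source, destination):
    """Read from the source volume; fall back to the mirror on an access failure."""
    try:
        return source.read(key)
    except VolumeUnavailable:
        return destination.read(key)


source = Volume("source (Los Angeles)")
destination = Volume("destination (New York)")

source.write("record-1", b"invoice")   # client stores data through the source host
mirror_update(source, destination)     # periodic mirroring to the remote site

source.online = False                  # simulated failure at the source site
result = read_with_failover("record-1", source, destination)
```

Here the read succeeds from the destination only because the mirror update ran before the simulated failure; data written to the source after the last update would not be present on the destination in this sketch.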
However, retrieving data from such destination volumes after a failure can take an inordinately long time, thereby undesirably tying up computing resources. In particular, the delay is partly due to the fact that, at the time of failure, the host in the destination storage system is not yet connected to the destination volume, or to some subset of that volume. Rather, the destination volume is only connected to the host in the source storage system, from which the data mirroring is initiated. Consequently, when a client learns of a failure and requests access to a destination volume through a host in the destination storage system, the client must wait for the destination host to connect to the destination volume (e.g., one or more logical unit numbers (LUNs) in the destination volume) before the client's desired data can be retrieved in full.
Thus, when recovery involves a large number of destination volumes (or a single destination volume with a large number of LUNs), data retrieval can take an inordinately long time. Access to information is often time critical for commercial enterprises, and computing resources may be limited or impacted by such a delay. It would be desirable to improve the speed at which data can be recovered from one or more mirrored (destination) volumes.
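The way this delay grows with the number of volumes and LUNs can be made concrete with a rough model. The sketch below assumes, purely for illustration, that the destination host connects to LUNs one at a time after the failure and that each connection takes the same fixed time; the function name `failover_wait_seconds` and the timing figures are hypothetical and are not drawn from any measured system.

```python
def failover_wait_seconds(luns_per_volume, connect_seconds_per_lun):
    """Estimate how long a client waits before all mirrored data is reachable.

    Assumes (hypothetically) that the destination host connects to each LUN
    sequentially, only after the failure, at a fixed cost per LUN.
    """
    total_luns = sum(luns_per_volume)
    return total_luns * connect_seconds_per_lun


# Two destination volumes with 4 LUNs each, at an assumed 2 seconds per LUN.
small_recovery = failover_wait_seconds([4, 4], 2.0)

# Ten destination volumes with 100 LUNs each, at the same assumed cost.
large_recovery = failover_wait_seconds([100] * 10, 2.0)
```

Under these assumptions the wait scales linearly with the total LUN count, so the larger recovery takes 125 times as long as the smaller one.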