Data disaster recovery, also known as remote data replication technology, refers to the establishment of a non-local data system that maintains an available replica of local data. When a disaster strikes the local data or the entire application system, at least one available copy of the system's essential service data is preserved at the non-local site.
A typical data disaster recovery system includes a production center and a disaster recovery center. In the production center, hosts and a storage array are deployed for normal operation of services; in the disaster recovery center, hosts and a storage array are deployed to take over services after the production center encounters a disaster. The storage array of either center includes multiple data volumes, where a data volume is logical storage space formed by mapping physical storage space. After data generated by services in the production center is written to the production array, it can be replicated over a disaster recovery link and written to the disaster recovery array. For the data in the disaster recovery center to support service takeover after a disaster, the consistency of the data replicated to the disaster recovery array must be guaranteed. Guaranteeing data consistency is, in essence, preserving the dependency among write data requests. Application programs, operating systems, and databases all inherently rely on this dependency logic to run their services. For example, write data request 2 is not executed until write data request 1 is complete; the order is fixed. That is, the system does not deliver write data request 2 until it is ensured that write data request 1 has returned successfully and completely. In this way, services can be restored by an inherent method when an execution process is interrupted by a failure. Otherwise, it may happen that, when data is read, the data stored by write data request 2 can be read while the data stored by write data request 1 cannot, and as a result services cannot be restored.
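The write-order dependency described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical Volume class; the names are not from any real storage API. The key point is that a dependent write is issued only after the preceding write has been acknowledged as complete.

```python
# Hypothetical sketch of dependency-based write ordering.
class Volume:
    def __init__(self):
        self.blocks = {}   # address -> data
        self.log = []      # order in which writes completed

    def write(self, address, data):
        # A write counts as complete only when this call returns;
        # the caller must not issue a dependent write before then.
        self.blocks[address] = data
        self.log.append(address)
        return True  # acknowledgement of successful completion

vol = Volume()
ok = vol.write(0, b"request 1")   # write data request 1
assert ok                          # wait for success before continuing
vol.write(1, b"request 2")         # write data request 2 issued only now
```

If the process is interrupted between the two writes, the data of request 1 is guaranteed to be present whenever the data of request 2 is, which is exactly the invariant services rely on for restoration.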
In the prior art, a snapshot technology is used to solve this problem. A snapshot is an image of data at a time point (the time point when copying is started). The purpose of a snapshot is to create a state view of a data volume at a specific time point. Through this view, only the data of the data volume as of the creation time can be seen; modifications to the data volume (newly written data) after that time point are not reflected in the snapshot view. By using this snapshot view, replication of data can be implemented. For the production center, snapshot data is "static". Therefore, the production center can replicate the snapshot data to the disaster recovery center after taking a data snapshot at each time point. This implements remote data replication without impacting the execution of subsequent write data requests in the production center. For the disaster recovery center, the requirement of data consistency can also be satisfied. For example, when the data of write data request 2 is replicated to the disaster recovery center successfully but the data of write data request 1 is not, the snapshot data taken before write data request 2 can be used to restore the data in the disaster recovery center to a previous consistent state.
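The snapshot behavior described above can be sketched with a simple copy-on-write scheme. This is an illustrative model, not the method of any particular product: before a block is overwritten, its original content is preserved for every snapshot that has not yet saved a copy of that address, so the snapshot view remains frozen at its creation time.

```python
# Minimal copy-on-write snapshot sketch (illustrative names only).
class SnapVolume:
    def __init__(self):
        self.blocks = {}       # current data: address -> block
        self.snapshots = []    # one dict of preserved blocks per snapshot

    def take_snapshot(self):
        # A new snapshot starts empty; blocks are preserved lazily on write.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, address, data):
        # Preserve the original block in every snapshot that has not
        # yet saved a copy of this address (may be None if unwritten).
        for snap in self.snapshots:
            if address not in snap:
                snap[address] = self.blocks.get(address)
        self.blocks[address] = data

    def read_snapshot(self, snap_id, address):
        # Snapshot view: preserved block if one was saved, else the
        # block was never modified, so current data is still valid.
        snap = self.snapshots[snap_id]
        if address in snap:
            return snap[address]
        return self.blocks.get(address)

vol = SnapVolume()
vol.write(0, b"old")
sid = vol.take_snapshot()
vol.write(0, b"new")   # modification after the snapshot time point
```

After the second write, reading address 0 through the snapshot still returns the data as of the snapshot time, while the current volume holds the new data; the snapshot data is "static" and safe to replicate.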
Because the production center must perform snapshot processing when executing a write data request and store the generated snapshot data in a data volume dedicated to snapshot data, when the production center replicates the snapshot data to the disaster recovery center, it must first read the snapshot data from that data volume into a cache and then send it to the disaster recovery center. However, the data used to generate the snapshot data may still exist in the cache but cannot be utilized properly. Because every replication requires reading the snapshot data from the data volume, data replication takes a long time and efficiency is low.
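The inefficiency described above can be sketched as follows. The model and names are hypothetical; the point is only that the prior-art replication path always reads from the snapshot volume (counted here as disk reads), even when the data that produced the snapshot is still resident in the cache.

```python
# Sketch of the prior-art replication path and its redundant disk reads.
class ProductionArray:
    def __init__(self):
        self.cache = {}            # recently written data, fast to access
        self.snapshot_volume = {}  # snapshot data stored on a dedicated volume
        self.disk_reads = 0        # count of reads from the snapshot volume

    def write(self, address, data):
        self.cache[address] = data
        # Snapshot processing stores a copy in the dedicated volume.
        self.snapshot_volume[address] = data

    def replicate(self, address):
        # Prior-art path: always read from the snapshot volume on disk,
        # ignoring the copy that may still sit in the cache.
        self.disk_reads += 1
        return self.snapshot_volume[address]

array = ProductionArray()
array.write(0, b"payload")
array.replicate(0)
array.replicate(0)
```

Both replications incur a disk read even though the same data never left the cache, which is the source of the long replication time noted above.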