1. Field of the Invention
This invention relates to remote replication data disaster recovery systems and more particularly relates to differencing and data compression in a read-before-write environment of a secondary storage to provide an efficient means for point-in-time disaster recovery of data.
2. Description of the Related Art
As financial, scientific, medical, and other critical data are being integrated with computers and computer networks, the reliability and availability of the data are increasing in importance. Loss of data may have severe negative consequences for users of a computer system.
On-site backup systems are designed to reduce the possibility of data loss. Nevertheless, even with such systems in place, natural disasters such as fire, lightning, and hurricanes, and man-made disasters such as civil unrest, computer hacker attacks, and terrorist attacks, can affect both computer networks and on-site backup systems. Consequently, to preserve critical data, backup systems are often located remotely. Distances from a few miles to thousands of miles are often required to overcome many disaster scenarios.
One type of data disaster recovery system maintains a mirror image of data on a primary data storage system at a remote site on a secondary data storage system. As files on a server are modified or added and then backed up on a primary data storage system, the changed blocks of data are identified and sent at particular time intervals to a secondary data storage system. The one or more data blocks that are identified as having been modified and that are sent together at the end of a time interval are referred to as a “color.”
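By way of illustration only, the grouping of modified data blocks into a color may be sketched as follows. The names ChangeTracker, record_write, and close_interval are hypothetical and do not appear in any described system. The sketch also assumes that when a block is written more than once within an interval, only its latest contents are sent.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Color:
    """Changed blocks sent together at the end of one time interval."""
    interval_id: int
    blocks: Dict[int, bytes]  # block number -> latest contents


@dataclass
class ChangeTracker:
    """Hypothetical primary-side tracker of blocks modified in an interval."""
    interval_id: int = 0
    dirty: Dict[int, bytes] = field(default_factory=dict)

    def record_write(self, block_no: int, data: bytes) -> None:
        # Assumption: a later write to the same block within the
        # interval replaces the earlier one.
        self.dirty[block_no] = data

    def close_interval(self) -> Color:
        # Package every block modified during this interval as one color,
        # then start tracking the next interval.
        color = Color(self.interval_id, dict(self.dirty))
        self.dirty.clear()
        self.interval_id += 1
        return color
```

In this sketch, a color that is closed with no intervening writes simply carries an empty set of blocks. Whether an empty color is actually transmitted is not addressed here.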
At the secondary data storage system, once a color is received, the data blocks from the color are read into random access memory (RAM). In a read-before-write system, the corresponding data blocks in a secondary data storage device are read into RAM. Once the consistency and correctness of the data blocks from the color are verified, the data blocks are sent to the secondary data storage device and are inserted in place of the corresponding data blocks in the secondary data storage device. The data blocks in the color may be processed individually or multiple data blocks may be processed together. Once the data blocks from the color have been successfully processed, the older versions of the data blocks read into RAM are discarded.
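The read-before-write sequence described above may be sketched as follows, under stated assumptions. The secondary data storage device is modeled as an in-memory dictionary, the function name apply_color is hypothetical, and a CRC-32 comparison stands in for the verification step, whose actual method is not specified.

```python
import zlib
from typing import Dict


def apply_color(device: Dict[int, bytes],
                color_blocks: Dict[int, bytes],
                checksums: Dict[int, int]) -> Dict[int, bytes]:
    """Apply one color to the secondary device in read-before-write order.

    Returns the older block versions that were read before the write.
    """
    # The received color is already held in memory as color_blocks.
    # Read the corresponding existing blocks from the device first.
    old_blocks = {n: device[n] for n in color_blocks if n in device}

    # Verify every incoming block before modifying the device.
    # Assumption: CRC-32 stands in for the unspecified verification.
    for n, data in color_blocks.items():
        if zlib.crc32(data) != checksums[n]:
            raise ValueError(f"block {n} failed verification")

    # Insert the new blocks in place of the corresponding old blocks.
    device.update(color_blocks)

    # In the flow described above, the older versions would be discarded
    # at this point; they are returned here only so they can be inspected.
    return old_blocks
```

Because verification completes before any write occurs, a color that fails verification leaves the modeled device unchanged.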
In such a remote replication system in a read-before-write environment, maintaining older data is desirable because it allows a user to recover data to a particular point in time. Maintaining full copies of an entire data structure on a secondary data storage device is problematic due to the vast amount of data storage required. Other methods of providing point-in-time versions of the data save only the changed blocks or files, but remain problematic because of the amount of data storage required and the need for metadata to track the changes and their timing in order to maintain consistency.
From the foregoing discussion, it should be apparent that a need exists for an apparatus, system, and method for efficiently creating point-in-time versions of data in a read-before-write environment. Beneficially, such an apparatus, system, and method would maintain a current version of the data on the primary data storage system together with previously modified data in a compact format that would readily allow disaster recovery of data at a particular point in time.