1. Field of the Invention
The present disclosure relates to performing copy and/or data management operations in a computer network and, in particular, to systems and methods for performing data replication in a storage management system.
2. Description of the Related Art
Computers have become an integral part of business operations such that many banks, insurance companies, brokerage firms, financial service providers, and a variety of other businesses rely on computer networks to store, manipulate, and display information that is constantly subject to change. Oftentimes, the success or failure of an important transaction may turn on the availability of information that is both accurate and current. Accordingly, businesses worldwide recognize the commercial value of their data and seek reliable, cost-effective ways to protect the information stored on their computer networks.
Many approaches to protecting data involve creating a copy of the data, such as backing up and/or replicating data on one or more storage devices. When creating a copy of such data, certain factors are generally considered. First, a copy of data should not contain data files that are corrupt or terminated improperly. Second, a copy of data should be current enough to remain useful should it be needed; if too much time passes between copies, the copied data becomes stale. For certain applications, such as networks that store financial transactions, copies a week old may be useless, and much more frequent copying may be needed.
In an attempt to accommodate such storage requirements, certain systems step through all the files in a computer network, or through a selected set of critical files, and check the time information of each file. If data has been written to the file since the last time the system checked the file's status, then a copy of the file is sent to a storage system. One problem with such systems is that they typically do not work for data kept in very large files. For example, even assuming that a copy could be made of a very large database, the time needed to make copies of such a large database may render data shadowing impractical. Making numerous copies of a large database not only takes a tremendous amount of time, but also requires a tremendous amount of storage space.
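By way of illustration only, the timestamp check described above might be sketched as follows. The function name `copy_changed_files` and its parameters are hypothetical and do not come from any particular system. The sketch assumes that each file's modification time is the "time information" being checked and that the storage system is simply another directory.

```python
import os
import shutil


def copy_changed_files(source_dir, dest_dir, last_check_time):
    """Copy every file under source_dir modified after last_check_time.

    last_check_time is a POSIX timestamp (seconds since the epoch).
    Returns the sorted relative paths of the files that were copied.
    """
    copied = []
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            src = os.path.join(root, name)
            # Assumption: the modification time stands in for the
            # "time information" that the system checks for each file.
            if os.path.getmtime(src) > last_check_time:
                rel = os.path.relpath(src, source_dir)
                dst = os.path.join(dest_dir, rel)
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                # The whole file is copied even if only one byte changed.
                shutil.copy2(src, dst)
                copied.append(rel)
    return sorted(copied)
```

Because the unit of copying is the entire file, a single small write to a very large database file causes the whole file to be copied again, which is the limitation noted above.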
Another approach that has been attempted in order to overcome some of these limitations is a process whereby a time sequence of data is captured and saved. For example, many systems incorporate disk mirroring or duplexing. In disk mirroring or duplexing, changes made to a primary mass storage system are sent to other backup or secondary mass storage systems. In other words, when a data block is written to the primary mass storage system, the same data block is written to a separate secondary mass storage system. By copying each write operation to a second mass storage system, two mass storage systems may be kept synchronized so that they are virtually identical at approximately the same time. Because an entire disk volume is being copied, however, mirroring also requires a tremendous amount of storage space and utilizes a large amount of processing resources.
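As a minimal sketch of the mirroring behavior described above, the following in-memory model writes each data block to both a primary and a secondary store. The class `MirroredVolume` and its methods are hypothetical names chosen for illustration; an actual mirroring system operates on physical or logical disk volumes rather than Python lists.

```python
class MirroredVolume:
    """Toy model of disk mirroring: every block write goes to two stores."""

    def __init__(self, num_blocks, block_size=512):
        self.block_size = block_size
        # Both stores hold the full volume, so storage use is doubled.
        self.primary = [bytes(block_size)] * num_blocks
        self.secondary = [bytes(block_size)] * num_blocks

    def write_block(self, index, data):
        """Write one block to the primary store and mirror it."""
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self.primary[index] = data
        # The same block is written again to the secondary store,
        # which is the extra work that mirroring adds to each write.
        self.secondary[index] = data

    def in_sync(self):
        """Return True when the two stores hold identical blocks."""
        return self.primary == self.secondary
```

For example, after `vol = MirroredVolume(4, block_size=8)` and `vol.write_block(2, b"ABCDEFGH")`, both `vol.primary[2]` and `vol.secondary[2]` hold the same eight bytes. The sketch also makes the cost visible: the secondary store is as large as the primary, and every write is performed twice.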
Furthermore, each of the above-described processes for copying or backing up data can have a significant impact on the source or primary system. For example, processing resources of the source system may be expended in copying data to a destination system rather than being used to process application requests.