1. Field of the Invention
The invention relates generally to a system and method for storing data, and more particularly, to a method and system for backing up data stored on a primary storage system.
2. Description of the Related Art
In many computing environments, large amounts of data are written to and retrieved from storage devices connected to one or more computers. As more data is stored on and accessed from storage devices, it becomes increasingly difficult to reproduce data if the storage devices fail. One way of protecting data is by backing up the data on backup media (e.g., tapes or disks).
One well-known approach to backing up data is to produce the backup copy at or near the site of the primary storage device, and then move the backup media to a safe location, where they are stored. For example, a technique known as mirroring can use this arrangement to generate a backup version of data. Specifically, in accordance with one form of mirroring, referred to as “synchronous mirroring,” a first copy of data received from a client is written directly to an assigned location on the primary system, and a second copy of the data is written directly to the backup system. Typically, in a synchronous mirroring arrangement, additional data processing requests received from the client are not processed until the data is successfully stored on both systems.
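By way of illustration only, the synchronous write path described above may be sketched as follows. The `BlockStore` class and the `synchronous_write` function are hypothetical names introduced for this sketch, and an in-memory dictionary stands in for each storage system; no particular implementation is implied.

```python
class BlockStore:
    """Hypothetical in-memory stand-in for a storage system."""

    def __init__(self):
        self.blocks = {}

    def write(self, address, data):
        self.blocks[address] = data


def synchronous_write(primary, backup, address, data):
    """Acknowledge the client only after both copies are stored."""
    primary.write(address, data)  # first copy, at its assigned location
    backup.write(address, data)   # second copy; the client waits for this too
    return True                   # only now may further requests proceed


primary = BlockStore()
backup = BlockStore()
acknowledged = synchronous_write(primary, backup, 0, b"payroll")
```

Because the acknowledgment is returned only after the second write completes, any delay in reaching the backup system is passed directly to the client.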
Where the primary system and backup system are located at the same site, or are linked by a high-speed communication link, synchronous mirroring techniques can offer near-instantaneous, up-to-date backup service. However, because further requests from the client cannot be processed while data is being stored on the primary system and on the backup system, a synchronous mirroring arrangement can, under some conditions, create significant delays for the client. If, for example, the backup system is located at a remote site, and/or the communication link from the primary system to the backup system is slow or susceptible to interruptions in service, the resultant delay may be unacceptable.
More recently, with the increasing availability of high-speed communication links, and of networking technology, it has become more common to locate a backup storage system at a site that is remote from the primary storage system. Since the advent of such remote storage systems, alternative approaches to mirroring have been developed.
One such method is known as “asynchronous mirroring.” According to this method, data received from a client is inserted into a cache memory, which stores the data temporarily. After the data is inserted into the cache, additional data processing requests received from the client are processed. The data is flushed from the cache and stored on the primary system. Subsequently, when system resources allow, the data is flushed to the backup system and stored.
Because data processing requests, including data write commands, received from the client can be processed before the data is written to both the primary and backup systems, asynchronous mirroring reduces the delay experienced by the client. However, the remote location of the backup system carries with it unique risks. For example, if a problem occurs in the communication link between the primary and backup systems and prevents data from being transmitted from the cache memory to the backup system, the data stored on the backup system may soon become out of sync with the data stored on the primary system.
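By way of illustration only, the asynchronous flow and the resulting risk may be sketched as follows. The `AsyncMirror` class, its method names, and the `link_up` flag are assumptions made for this sketch, as is the use of a second queue to hold data awaiting transfer to the backup system.

```python
from collections import deque


class AsyncMirror:
    """Hypothetical model of asynchronous mirroring with a write cache."""

    def __init__(self):
        self.cache = deque()           # temporarily holds (address, data)
        self.pending_backup = deque()  # assumed queue of data not yet on backup
        self.primary = {}
        self.backup = {}

    def write(self, address, data):
        """Acknowledge the client as soon as the data is cached."""
        self.cache.append((address, data))
        return True

    def flush_to_primary(self):
        while self.cache:
            address, data = self.cache.popleft()
            self.primary[address] = data
            self.pending_backup.append((address, data))

    def flush_to_backup(self, link_up=True):
        """Copy pending data to the backup system if the link is available."""
        if not link_up:
            return False  # nothing is transferred; the backup falls behind
        while self.pending_backup:
            address, data = self.pending_backup.popleft()
            self.backup[address] = data
        return True


mirror = AsyncMirror()
mirror.write(0, b"a")
mirror.write(1, b"b")
mirror.flush_to_primary()
link_ok = mirror.flush_to_backup(link_up=False)  # simulated link problem
```

After the simulated link problem, the primary system holds both blocks while the backup system holds neither, which is the out-of-sync condition described above.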
A third method for backing up data is known as the delta replication method. Delta replication is not a form of mirroring. Using this approach, when data is received from a client, it is written to its assigned location on the primary system, but is not sent to the backup system. Instead, a record is kept of the data blocks in the primary system that are overwritten. From time to time, a “delta replication” is performed, which includes copying to the backup system only those data blocks that have been overwritten.
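By way of illustration only, the record keeping behind delta replication may be sketched as follows. The `DeltaReplicator` class and its method names are hypothetical, and the record of overwritten blocks is assumed here to be a set of block addresses that is cleared after each replication.

```python
class DeltaReplicator:
    """Hypothetical model of delta replication using a changed-block record."""

    def __init__(self):
        self.primary = {}
        self.backup = {}
        self.changed = set()  # addresses overwritten since the last replication

    def write(self, address, data):
        """Write to the primary system only and record the changed block."""
        self.primary[address] = data
        self.changed.add(address)

    def replicate(self):
        """Copy only the recorded blocks to the backup system."""
        copied = sorted(self.changed)
        for address in copied:
            self.backup[address] = self.primary[address]
        self.changed.clear()
        return copied


replicator = DeltaReplicator()
for address in (0, 1, 2):
    replicator.write(address, b"v1")
first_pass = replicator.replicate()

replicator.write(1, b"v2")  # only block 1 changes afterwards
second_pass = replicator.replicate()
```

In this sketch the second replication transfers a single block, since only block 1 was overwritten after the first replication.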
Several disadvantages are associated with the delta replication method. If a problem occurs in the communication link between the primary system and the backup system during a delta replication, the backup copy stored on the backup system may become corrupted. To mitigate this risk, a snapshot of the backup disk is often generated immediately prior to performing each delta replication. Similarly, if the data stored on the primary system becomes corrupted during a delta replication (e.g., due to a hardware failure), then the only up-to-date version of the data may be lost. For this reason, a snapshot of the data stored on the primary system is often taken immediately prior to each delta replication.
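By way of illustration only, the snapshot safeguards described above may be sketched as follows. The `replicate_with_snapshots` function and the `send_block` callback are hypothetical, and `copy.deepcopy` is used merely as a stand-in for a snapshot facility. Reading the changed blocks from the primary snapshot and restoring the backup from its snapshot on failure are design choices assumed for this sketch.

```python
import copy


def replicate_with_snapshots(primary, backup, changed, send_block):
    """Take both snapshots, then copy changed blocks; roll back on link failure."""
    primary_snapshot = copy.deepcopy(primary)  # preserves the up-to-date data
    backup_snapshot = copy.deepcopy(backup)    # preserves the last good backup
    try:
        for address in sorted(changed):
            send_block(backup, address, primary_snapshot[address])
    except ConnectionError:
        # Assumed recovery step: restore the backup to its pre-replication state.
        backup.clear()
        backup.update(backup_snapshot)
        return False
    changed.clear()
    return True


def flaky_send(target, address, data):
    """Hypothetical link that fails partway through the transfer."""
    if address == 2:
        raise ConnectionError("link dropped")
    target[address] = data


primary = {1: b"new-1", 2: b"new-2"}
backup = {1: b"old-1", 2: b"old-2"}
changed = {1, 2}
ok = replicate_with_snapshots(primary, backup, changed, flaky_send)
```

In this sketch the interrupted replication leaves the backup in its prior consistent state, and the record of changed blocks is retained so that the replication can be retried.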