Generating a snapshot of data is a method commonly used to preserve and protect data stored in a storage system. A snapshot is a record of data stored in a storage system at a selected moment in time. A snapshot may be used, for example, to recover an earlier version of the data in the event a current version becomes corrupted, or may be copied to another storage device to provide a backup copy of the data.
In many storage systems, performing a snapshot is a relatively simple operation. However, if a cache is used in the storage system, performing the snapshot often becomes more challenging. Typically, a cache functions as a buffer, i.e., data sent by the client server to disk is recorded first in the cache, and subsequently is flushed to the disk. A cache is typically constructed of a type of memory that can be accessed more rapidly than the primary storage devices in the system. Accordingly, a storage system can increase its efficiency by temporarily storing in the cache selected items of data that are likely to be requested by a client. Many storage systems, operating on the principle that data recently sent to disk have a high probability of being accessed again, uniformly transmit data to a cache before sending it to disk. As used herein, the term “disk” has the same meaning as “storage device;” accordingly, “stored on disk” means stored on a storage device, “sent to disk” means sent to a storage device, etc. “Client” and “client server” are used herein interchangeably.
In many storage systems, a cache processes data sequentially to preserve data integrity. Changes made by a client server to a data file, for example, are recorded in the cache in the order they were performed and flushed to disk in the same order. Where a cache receives data from multiple clients, the cache typically processes data from a given client sequentially; however, the cache may assign a higher priority to data received from one client over data received from other clients. In such a case, the cache may not adhere to a strict first-in-first-out mode of operation.
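By way of illustration only, the buffering and sequential flushing behavior described above may be sketched as follows. The class name `WriteBackCache`, its methods, and the use of a dictionary to stand in for the storage device are assumptions made for this example and do not describe any particular storage system.

```python
from collections import deque


class WriteBackCache:
    """Hypothetical toy cache: writes are queued, then flushed to disk in arrival order."""

    def __init__(self):
        self.pending = deque()  # (block, data) pairs accepted but not yet on disk
        self.disk = {}          # stands in for the storage device

    def write(self, block, data):
        # Data sent "to disk" by the client is recorded first in the cache.
        self.pending.append((block, data))

    def read(self, block):
        # The most recent pending write for a block takes precedence over the disk copy.
        for pending_block, data in reversed(self.pending):
            if pending_block == block:
                return data
        return self.disk.get(block)

    def flush_one(self):
        # Flush the oldest pending write, preserving the order in which writes arrived.
        if self.pending:
            block, data = self.pending.popleft()
            self.disk[block] = data

    def flush_all(self):
        while self.pending:
            self.flush_one()
```

In this sketch, two successive writes to the same block reach the disk in the order they were made, so the disk never holds a newer version that is later overwritten by an older one. A cache that prioritizes one client over another would need one such queue per client, which this example does not model.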
To preserve data integrity, a snapshot must capture and preserve all data sent to disk by the client server up to the precise moment the snapshot is requested. However, where a cache is used, some data sent to disk by a client before the snapshot request may remain in the cache, not yet having been flushed to disk, at the moment the snapshot request is made. In such case, performing a snapshot of data stored on disk at the moment the request is made would produce an inaccurate record because it would not capture the data remaining in the cache at that moment. To produce an accurate snapshot, it is necessary to incorporate the data from the cache into the snapshot.
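The discrepancy described above may be illustrated with a minimal sketch. The function names `naive_snapshot` and `cache_aware_snapshot`, and the representation of the disk as a dictionary and the cache as an ordered list of pending writes, are hypothetical and chosen only for this example.

```python
def naive_snapshot(disk):
    # Copies only what has already reached the disk.
    return dict(disk)


def cache_aware_snapshot(disk, pending_writes):
    # Starts from the on-disk state, then applies the unflushed writes in order.
    snapshot = dict(disk)
    for block, data in pending_writes:
        snapshot[block] = data
    return snapshot


# State at the moment the snapshot is requested (illustrative values).
disk = {0: "old"}
pending_writes = [(0, "new"), (1, "added")]  # sent before the request, not yet flushed

stale = naive_snapshot(disk)
accurate = cache_aware_snapshot(disk, pending_writes)
```

Here `stale` omits both pending writes, whereas `accurate` reflects every write the client sent before the request. The sketch leaves the disk itself unmodified and is not intended to suggest how any given system incorporates cached data into a snapshot.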
One solution to this problem requires first directing the client server to suspend transmission of data to the storage system. All data in the cache is then flushed to disk, and finally the snapshot is performed. This method may be adequate if very little data is present in the cache at the moment the snapshot is requested. Otherwise, this method is often undesirable because the client server is required to wait for the data flush to finish before it can resume normal operations. This can represent a substantial inconvenience to the client server and compromise the efficiency of the data storage provider.
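The suspend, flush, and snapshot sequence described above may be sketched as follows, under the assumption that the client has already been directed to stop sending data. The function name `quiesce_flush_snapshot` and the counter of flushed writes are hypothetical and serve only to make the waiting cost visible.

```python
def quiesce_flush_snapshot(disk, pending_writes):
    """Flush every pending write, then snapshot the disk.

    Assumes the client has been suspended before this call.
    Returns the snapshot and the number of writes flushed while the client waited.
    """
    flushed = 0
    while pending_writes:
        block, data = pending_writes.pop(0)  # oldest write first
        disk[block] = data
        flushed += 1
    # Only now is the on-disk state complete, so the snapshot can be taken.
    return dict(disk), flushed
```

The number of flushed writes grows with the amount of data in the cache at the moment of the request, which is why this approach may be acceptable for a nearly empty cache and burdensome otherwise.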
It is often useful to generate a snapshot of data before performing a data processing task that poses a risk of corrupting data in a storage system. For example, in many systems simply maintaining a backup copy of a primary disk can occasionally pose substantial risks. If, for example, the system employs asynchronous mirroring, i.e., a cache is used to temporarily store data written to the primary disk before writing to the mirroring disk, an interruption in the communication between the cache and the mirroring disk can cause data to be lost and the backup copy to become corrupted. Generally, in such case it is necessary to synchronize the mirroring disk with the primary disk, i.e., simply copy data sector-by-sector from the primary disk to the mirroring disk. However, if the primary disk becomes corrupted before the copying procedure is completed, then there may be no uncorrupted version of the primary disk left. Moreover, in such case, the data on the mirroring disk is often corrupted by an incomplete copy procedure. It is therefore often preferable to generate a snapshot of the mirroring disk immediately before attempting to synchronize the mirroring disk with the primary disk.
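The value of generating a snapshot of the mirroring disk before synchronization may be illustrated with a minimal sketch. The function `resync_mirror`, the `fail_at` parameter used to simulate a failure of the primary disk during copying, and the representation of each disk as a list of sectors are all assumptions made for this example.

```python
def resync_mirror(primary, mirror, fail_at=None):
    """Copy primary to mirror sector by sector, restoring the mirror on failure.

    fail_at is a hypothetical knob that simulates the primary disk failing
    when the given sector is about to be copied.
    """
    snapshot = list(mirror)  # preserve the mirror's state before synchronization
    try:
        for sector, data in enumerate(primary):
            if sector == fail_at:
                raise OSError("primary disk failed during copy")
            mirror[sector] = data
    except OSError:
        # The mirror now holds a mix of old and new sectors; restore the snapshot.
        mirror[:] = snapshot
        return False
    return True
```

Without the snapshot, a failure partway through the copy would leave the mirroring disk holding some new sectors and some old ones, with no consistent version of the data remaining. With it, the mirroring disk can at least be returned to its state immediately before synchronization began.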
Many existing storage systems fail to determine a suitable moment for taking a snapshot. This is partly due to the fact that, in many networks, a client server (such as a data server) manages the data processing and storage functions, and storage systems merely process requests received from the client server. Therefore, the client server, rather than the storage system, determines an appropriate time for a snapshot. However, in prior art systems, the client server does not have the capability to direct the storage system to perform a snapshot at a selected moment. Instead, many storage systems are configured simply to perform snapshots at predetermined intervals, e.g., every 30 minutes.