A storage system, such as a storage array system or a storage area network of storage devices, can be used to store a relatively large amount of data on behalf of an enterprise (e.g., company, business, government agency, etc.). Data and software associated with users and applications can be stored in such a storage system, such that reduced local storage resources can be provided at user terminals.
Protection of data stored in a storage system is desirable. One aspect of data protection is enabling recovery of data after faults or corruption due to hardware and/or software failures, or after data corruption or loss caused by malware attacks (e.g., attacks by a virus or other malicious code designed to damage stored data).
One type of technique that has been used to protect data stored in a storage system is to create point-in-time copies of data as such data is modified by write operations. Point-in-time copies of data are also referred to as “snapshots.” A snapshot can be created when a write occurs. In a snapshot-based storage system, original data can be kept in a source volume of data. Prior to modification of data in the source volume, a snapshot of the data to be modified can be taken. Many snapshots can be taken over time as writes are received at the storage system. If recovery of data is desired for any reason, one or more of the snapshots can be used to recover data back to a prior state, such as before a point in time when corruption or data loss occurred.
There are different algorithms for performing snapshots of data. A first type of snapshot algorithm is referred to as a “copy-on-write” (CoW) snapshot algorithm, in which a write of data causes the storage system to copy the original data from the source volume to a snapshot volume before proceeding with the write. With the copy-on-write snapshot algorithm, the original version of the data is kept in the snapshot volume, whereas the modified version of the data is kept in the source volume.
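The copy-on-write behavior described above can be illustrated with a minimal sketch. It assumes a toy in-memory model in which a volume is a dictionary mapping block numbers to bytes and the set of blocks is fixed at creation. The class and method names (`CowVolume`, `write`, `read_snapshot`, `restore`) are hypothetical and chosen only for illustration; they do not correspond to any particular storage system.

```python
class CowVolume:
    """Toy copy-on-write volume (illustrative only): block number -> bytes."""

    def __init__(self, blocks):
        self.source = dict(blocks)  # source volume: holds the current data
        self.snapshot = {}          # snapshot volume: holds original versions

    def write(self, block, data):
        # Copy penalty: before the first overwrite of a block, its original
        # version is copied from the source volume to the snapshot volume.
        if block not in self.snapshot:
            self.snapshot[block] = self.source[block]
        # Only then does the write proceed against the source volume.
        self.source[block] = data

    def read_snapshot(self, block):
        # Blocks not modified since the snapshot are still served by the source.
        return self.snapshot.get(block, self.source[block])

    def restore(self):
        # Recover the prior state by copying the original versions back.
        self.source.update(self.snapshot)
        self.snapshot.clear()


vol = CowVolume({0: b"A", 1: b"B"})
vol.write(0, b"A2")           # original b"A" is copied to the snapshot first
print(vol.source[0])          # modified version lives in the source volume
print(vol.read_snapshot(0))   # original version lives in the snapshot volume
```

In this sketch only the first write to a block incurs the extra copy; later writes to the same block overwrite the source directly, because the original version is already preserved.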
A second type of snapshot algorithm is a “redirect-on-write” (RoW) snapshot algorithm, in which the write data is redirected to another location (“redirect-on-write location”) that is set aside for a snapshot, while the source volume maintains an original version of the data. The redirect-on-write snapshot algorithm effectively defers the taking of a snapshot until a later point in time. At that later point, snapshots of the original versions of data present in the source volume are taken, and the modified versions of the data are moved from the redirect-on-write location to the source volume.
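The redirect-on-write flow described above can be sketched in the same spirit. This is a simplified model under stated assumptions: volumes are dictionaries mapping block numbers to bytes, the set of blocks is fixed at creation, and only one snapshot generation is tracked. The names `RowVolume`, `write`, `read`, and `reconcile` are hypothetical and used only for illustration.

```python
class RowVolume:
    """Toy redirect-on-write volume (illustrative only): block number -> bytes."""

    def __init__(self, blocks):
        self.source = dict(blocks)  # source volume: keeps original versions
        self.redirect = {}          # redirect-on-write location: modified data
        self.snapshot = {}          # snapshot volume: filled at reconciliation

    def write(self, block, data):
        # No copy of the original on the write path: the new data is simply
        # redirected, and the source volume is left untouched.
        self.redirect[block] = data

    def read(self, block):
        # Current data comes from the redirect location if the block was
        # modified, otherwise from the source volume. This lookup is part of
        # the extra tracking that redirect-on-write requires.
        return self.redirect.get(block, self.source[block])

    def reconcile(self):
        # Deferred step: snapshot the original versions held in the source
        # volume, then move the modified versions into the source volume.
        for block, data in self.redirect.items():
            self.snapshot[block] = self.source[block]
            self.source[block] = data
        self.redirect.clear()


vol = RowVolume({0: b"A", 1: b"B"})
vol.write(0, b"A2")
print(vol.source[0])   # source still holds the original version
print(vol.read(0))     # reads see the redirected, modified version
vol.reconcile()
print(vol.source[0])   # modified version has been moved to the source
print(vol.snapshot)    # original version is now held in the snapshot
```

The write path here does no copying at all; the cost moves to `reconcile`, which must visit every redirected block.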
Typically, a storage system uses just one type of snapshot algorithm (e.g., the copy-on-write snapshot algorithm or the redirect-on-write snapshot algorithm) to create snapshots in response to writes to data in the storage system. Under certain conditions, use of just a single snapshot algorithm in creating snapshots can result in reduced performance of the storage system. For example, with the copy-on-write snapshot algorithm, a copy penalty is associated with each data write, since the original version of the data must first be copied to the snapshot volume before the data in the source volume is modified. On the other hand, although the redirect-on-write snapshot algorithm avoids the copy penalty immediately after a write occurs, tracking of data and the later reconciliation of the redirect-on-write location with the source volume can be more complex.