A single logical operation (transaction) performed by an application may include several write operations (also referred to as writes) directed towards a disk layer of a storage system. For example, a filesystem that is hosted by an application layer of a storage system (or is hosted by a host computer that is coupled to the storage system) can perform file operations that involve writing data as well as writing metadata in the corresponding inode (inode being a metadata structure that includes metadata about the file). This may include writing new data to the file, adding pointers to the newly added data in a mapping data structure of the file, updating the size and timestamps of the file in its inode and optionally updating a timestamp of a parent directory.
The writes of a single transaction may be directed to a single logical volume or to a group of logical volumes (that form a consistency group). For example, the data of the file may be written in one logical volume while the metadata of the file may be written to a different logical volume of the same consistency group.
Storage systems may use various methods for keeping the integrity of the information they store. One method includes taking, at different points in time, snapshots of the same consistency group. The consistency group may include one or more logical volumes.
Snapshots of the same consistency group taken at different points in time are also referred to as different snapshot versions. The term snapshot version refers to an identifier that uniquely identifies a specific snapshot, at least among all the snapshots of the snapshot family of the same consistency group, and that is indicative of the time or order of creation relative to other snapshots in the family. For example, a newer snapshot is associated with a snapshot version that is larger than the snapshot version of an earlier created snapshot. The snapshot version may include, for example, a running index. Thus, indexes 1, 2, . . . , n may represent the first through n-th snapshots (snapshot versions) SN(1), SN(2), . . . , SN(n) taken at points in time T(1), T(2), . . . , T(n).
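The running-index scheme can be illustrated with a minimal sketch. The class name `SnapshotFamily`, its fields, and the timestamps below are hypothetical and serve only to show that each new snapshot receives a larger version than every earlier snapshot of the same consistency group.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class SnapshotFamily:
    """Hypothetical snapshot family of one consistency group."""
    group_name: str
    # Running index: yields 1, 2, 3, ... in order of snapshot creation.
    _next_version: count = field(default_factory=lambda: count(1))
    # Maps snapshot version -> point in time at which it was taken.
    snapshots: dict = field(default_factory=dict)

    def take_snapshot(self, taken_at: float) -> int:
        """Record a snapshot and return its version SN(k)."""
        version = next(self._next_version)
        self.snapshots[version] = taken_at
        return version


family = SnapshotFamily("cg0")            # illustrative group name
v1 = family.take_snapshot(taken_at=100.0)  # SN(1) at T(1)
v2 = family.take_snapshot(taken_at=160.0)  # SN(2) at T(2)
v3 = family.take_snapshot(taken_at=220.0)  # SN(3) at T(3)
```

Comparing two versions of the same family is then enough to tell which snapshot was created earlier.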
If a snapshot is taken between the writes of a transaction, the snapshot contains a partially executed (incomplete) transaction and therefore reflects an inconsistent state of the data stored by the application. A recovery based on such a snapshot can be erroneous. The inconsistency may exist whether the snapshot is of a single volume or of a group of volumes (that form a consistency group) involved in the transaction.
FIG. 1 is a timing diagram 1 that illustrates a transaction inconsistent snapshot version (n+1), SN(n+1), that includes data written by only the first through third writes of a transaction 8.
The version n snapshot (also referred to as SN(n)) was taken at time T(n) 10. SN(n+1) was taken at time T(n+1) 20. SN(n+2) was taken at time T(n+2) 30.
Transaction 8 starts at point in time Tstart 11 and ends at Tend 23. Transaction 8 includes five writes: the first through third writes occur at Tw1 12, Tw2 13 and Tw3 14 (all between T(n) 10 and T(n+1) 20), and the fourth and fifth writes occur at Tw4 21 and Tw5 22 (between T(n+1) 20 and T(n+2) 30).
SN(n+1) includes only the data corresponding to the first three writes of transaction 8 and is therefore a transaction inconsistent snapshot.
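The timing relationship of FIG. 1 can be checked with a short sketch. The function name is hypothetical, and the numeric times are arbitrary values chosen only to preserve the ordering shown in the figure; the figure itself gives no time values.

```python
def is_transaction_consistent(snapshot_time, write_times):
    """Return True if the snapshot captures none or all of the writes.

    A snapshot that captures only some of a transaction's writes
    reflects a partially executed transaction.
    """
    captured = [t for t in write_times if t < snapshot_time]
    return len(captured) in (0, len(write_times))


# Arbitrary illustrative times that keep the ordering of FIG. 1.
t_n, t_n1, t_n2 = 10, 20, 30              # snapshot times of SN(n), SN(n+1), SN(n+2)
transaction_writes = [12, 13, 14, 21, 22]  # five writes of the transaction

sn_n_ok = is_transaction_consistent(t_n, transaction_writes)
sn_n1_ok = is_transaction_consistent(t_n1, transaction_writes)
sn_n2_ok = is_transaction_consistent(t_n2, transaction_writes)
```

Under these assumed times, only the middle snapshot falls between writes of the transaction: it captures three of the five writes.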
FIG. 2 is a timing diagram 2 that illustrates the creation of a transaction consistent snapshot at the expense of delaying the start of snapshot creation. SN(n+1) should have been taken at T(n+1) 20, but it is delayed (by delay 50) until transaction 8 is completed, and is therefore actually taken at point in time T(n+1)′ 20′. In addition, new transactions (not shown) that are initiated between T(n+1) 20 and T(n+1)′ 20′ are withheld until after SN(n+1) is taken at T(n+1)′ 20′.
FIG. 2 may provide a partial representation of two known solutions for generating transaction consistent snapshots at the expense of delaying the start of snapshot creation and delaying the start of new transactions.
One known solution is called “Block and Drain”: the application blocks new transactions from users before a snapshot is taken and drains all pending writes; when all pending writes are completed, a snapshot is taken (SN(n+1), taken at T(n+1)′ 20′ after being delayed by delay 50) and the blocking of new writes is removed. This causes a pause of up to a few seconds in the workflow.
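The block, drain, snapshot and unblock sequence can be sketched as follows. This is only an illustration under stated assumptions: the class `BlockAndDrainGate`, its method names, and the `take_snapshot` callback are hypothetical, and the application is assumed to bracket every transaction with `begin_transaction` and `end_transaction`.

```python
import threading


class BlockAndDrainGate:
    """Hypothetical gate an application places around its transactions."""

    def __init__(self):
        self._cond = threading.Condition()
        self._snapshot_lock = threading.Lock()  # one snapshot at a time
        self._blocked = False   # True while a snapshot is pending
        self._in_flight = 0     # transactions started but not finished

    def begin_transaction(self):
        with self._cond:
            # Block: new transactions wait while a snapshot is pending.
            while self._blocked:
                self._cond.wait()
            self._in_flight += 1

    def end_transaction(self):
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def snapshot(self, take_snapshot):
        """Block new transactions, drain pending ones, then snapshot."""
        with self._snapshot_lock, self._cond:
            self._blocked = True
            try:
                # Drain: wait() releases the condition's lock so that
                # in-flight transactions can finish.
                while self._in_flight > 0:
                    self._cond.wait()
                return take_snapshot()
            finally:
                # Unblock: let the withheld transactions proceed.
                self._blocked = False
                self._cond.notify_all()
```

The time a caller spends inside `snapshot` waiting for the drain corresponds to delay 50, and callers waiting in `begin_transaction` during that period experience the pause described above.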
Another solution temporarily stores all new incoming write requests in a temporary location while pending writes are drained and, once all pending writes are completed, a snapshot is taken (SN(n+1), taken at T(n+1)′ 20′ after being delayed by delay 50). After the draining is completed and the snapshot is taken, the temporarily stored incoming write requests are executed. While the users are not restrained from sending write requests, there is still a pause in the normal workflow, and the approach adds implementation complexity.
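The temporary-storage approach can be modeled with a simplified, single-threaded sketch. The class `BufferingSnapshotter`, its queues, and the dictionary used as a volume are hypothetical stand-ins chosen for illustration; a real storage system would handle concurrency and persistence of the temporary location.

```python
from collections import deque


class BufferingSnapshotter:
    """Hypothetical single-threaded model of the buffering approach."""

    def __init__(self):
        self.volume = {}        # address -> data (stands in for a volume)
        self.pending = deque()  # writes accepted before the snapshot request
        self.held = deque()     # writes parked in the temporary location
        self.draining = False

    def submit(self, address, data):
        # Users are never refused; writes arriving during the drain
        # are parked instead of being applied to the volume.
        target = self.held if self.draining else self.pending
        target.append((address, data))

    def _apply(self, queue):
        while queue:
            address, data = queue.popleft()
            self.volume[address] = data

    def request_snapshot(self):
        self.draining = True

    def complete_snapshot(self):
        self._apply(self.pending)    # drain the pending writes
        snap = dict(self.volume)     # take the snapshot
        self.draining = False
        self._apply(self.held)       # execute the parked writes
        return snap


s = BufferingSnapshotter()
s.submit("a", 1)             # pending before the snapshot request
s.request_snapshot()
s.submit("b", 2)             # arrives while draining, so it is parked
snap = s.complete_snapshot()
```

In this model the parked write to address "b" is accepted immediately but reaches the volume only after the snapshot is taken, which is the pause in the normal workflow noted above.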
There is a growing need to provide a system, method and a computer readable medium for providing transaction consistent snapshots.