The present invention, in some embodiments thereof, relates to a snapshot mechanism for computer memory and, more particularly, but not exclusively, to a snapshot mechanism that provides an image of all or part of a computer memory frozen at a given time.
Computer memory organization is typically split into a physical organization (physical disks, memory cards, etc.) on the one hand and a logical organization (volumes or logical partitions) on the other hand. Snapshots provide instantaneous copies of a volume.
Snapshots, or zero time copies, are used for various purposes. These uses include taking copies for backups across different points in time, as well as using multiple copies of a database for different uses, for example production, development, testing, different copies for different users, copies for users of virtual machines, disaster recovery, etc. Writable snapshots, sometimes known as clones, are used in virtualization infrastructure as multiple images of virtual machines. Snapshots may also be used as the internal mechanism to implement asynchronous replication between storage arrays.
Snapshot implementations are varied, and can be roughly categorized according to the following parameters:
    Read-Only vs. Writable
    Data Space Efficiency
    Metadata Space Efficiency
    Hierarchies & Restrictions
    Performance
The following briefly considers each of the above-listed parameters and various implementations thereof.
Read-only snapshots are typically used for purposes such as backup. They cannot be written to and thus do not change, so there is no point in taking snapshots of read-only snapshots. Writable snapshots, however, can be read from and written to just like ordinary volumes, and in some cases it makes sense to take a snapshot of a writable snapshot. Aside from backups, writable snapshots are the usual choice.
From the point of view of data space efficiency, two types of solutions exist. The first is not data space efficient, which means that the snapshot uses up space according to the address space of the volume it was taken from, regardless of the number of writes performed to it or to its ancestor volume. The second is data space efficient, which means that only subsequent writes to the snapshot or its ancestor volume may take up space. However, if the granularity is, for example, 1 MByte, each subsequent write to a new 1 MByte address range will use up space equal to at least 1 MByte, even if only 4 KByte were actually written. Thus, data space efficient snapshots can be further divided into coarse granularity and fine granularity variants.
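The effect of granularity on space consumption can be illustrated with a small calculation. The following is a minimal sketch only, not part of any described embodiment; the function name `space_consumed` and the chosen offsets are assumptions made for illustration.

```python
KB = 1024
MB = 1024 * KB


def space_consumed(writes, granularity):
    """Return the bytes allocated for a list of (offset, length) writes.

    Every granularity-sized extent touched by a write is allocated in full;
    an extent that is already allocated is not counted a second time.
    """
    touched = set()
    for offset, length in writes:
        first = offset // granularity
        last = (offset + length - 1) // granularity
        touched.update(range(first, last + 1))
    return len(touched) * granularity


# A single 4 KByte write, placed at an arbitrary illustrative offset.
one_small_write = [(5 * MB + 8 * KB, 4 * KB)]

coarse = space_consumed(one_small_write, 1 * MB)  # coarse granularity variant
fine = space_consumed(one_small_write, 4 * KB)    # fine granularity variant
```

Under these assumptions the coarse variant allocates a full 1 MByte for the 4 KByte write, while the fine variant allocates only 4 KByte.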
Regarding metadata space efficiency, where metadata is to be interpreted as auxiliary structures (e.g. mapping tables) used by the snapshot mechanism, the question is again how much metadata is needed per snapshot. It is particularly important to be metadata space efficient when discussing short lived snapshots, since relatively few writes are made to these snapshots and to their ancestor volume during their lifetime. Thus, the metadata may often take more space than the data itself. This is also true in thinly provisioned memory systems and in systems which employ a data deduplication mechanism. Metadata space efficiency has several possible levels. Most systems consume metadata in accordance with the size of the address space of the original volume. Thinly provisioned systems may consume metadata in accordance with the amount of data previously written to the original volume, which is still quite wasteful because both the original volume and the snapshot share this data until pieces of it are changed in one of them.
The next parameter is hierarchies and restrictions. This refers to various restrictions such as the number of snapshots, the times at which they can be taken or removed, and the volumes or snapshots they can be taken from. The most restrictive form allows each volume to have a single snapshot taken at any given point in time. More lenient forms allow several snapshots (up to some limit) to be taken on the same volume only after various time intervals have elapsed from the last snapshot taken. Even more lenient forms allow snapshots to be taken on writable snapshots. Most systems have limits on the number of snapshots and the amount of space snapshots may take up.
The last parameter is performance. It can be further divided into read/write performance and create/delete performance. Read performance usually depends on the number of reads from disk needed to retrieve a piece of data. Write performance depends on the number of operations which must be performed per user write. Popular approaches include “copy on write”, which requires two writes and one read per user write, and “redirect on write”, which requires only one write. Again, “redirect on write” depends on the granularity of the mechanism. For example, if the granularity of the mapping table is 1 MByte, meaning each megabyte has a single pointer to a physical data location, a 4 KByte user write will still require extra writes and reads to be performed in order to copy the megabyte of data encompassing the 4 KByte block to the newly chosen location.
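The difference in operation counts between the two approaches can be sketched as follows. This is a simplified illustration under stated assumptions: block-granular mappings, a first write to a block still shared with a snapshot, and hypothetical names (`CountingDisk`, `copy_on_write`, `redirect_on_write`, `spare_loc`) that do not come from any particular system.

```python
class CountingDisk:
    """In-memory stand-in for a disk that counts reads and writes."""

    def __init__(self):
        self.blocks = {}
        self.reads = 0
        self.writes = 0

    def read(self, loc):
        self.reads += 1
        return self.blocks.get(loc)

    def write(self, loc, data):
        self.writes += 1
        self.blocks[loc] = data


def copy_on_write(disk, volume_map, snapshot_map, addr, data, spare_loc):
    """Copy the old block aside for the snapshot, then overwrite in place."""
    loc = volume_map[addr]
    old = disk.read(loc)           # one read of the old data
    disk.write(spare_loc, old)     # first write: preserve old data
    snapshot_map[addr] = spare_loc
    disk.write(loc, data)          # second write: new data in place


def redirect_on_write(disk, volume_map, addr, data, spare_loc):
    """Write new data elsewhere and repoint the volume mapping."""
    disk.write(spare_loc, data)    # the only write
    volume_map[addr] = spare_loc   # the snapshot keeps pointing at the old block


# Copy on write: volume and snapshot initially share physical location 0.
cow_disk = CountingDisk()
cow_disk.blocks[0] = b"old"
cow_volume, cow_snapshot = {7: 0}, {7: 0}
copy_on_write(cow_disk, cow_volume, cow_snapshot, addr=7, data=b"new", spare_loc=1)

# Redirect on write: same starting state.
row_disk = CountingDisk()
row_disk.blocks[0] = b"old"
row_volume, row_snapshot = {7: 0}, {7: 0}
redirect_on_write(row_disk, row_volume, addr=7, data=b"new", spare_loc=1)
```

In this sketch the copy on write path performs one read and two writes, and the redirect on write path performs a single write. In both cases the snapshot still resolves to the old data and the volume to the new data. The sketch does not model the coarse granularity case, in which redirect on write would additionally have to copy the surrounding extent.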
Creation and deletion of snapshots can also be performed at various efficiency levels. Creation of snapshots must be instant, but may involve prolonged asynchronous procedures which depend on the address space of the original volume or on the amount of data previously written to the original volume. The cost of snapshot deletion usually depends on the data and metadata efficiencies, since all data and metadata must be deleted. In most cases, this requires going over metadata in accordance with the address space of the snapshot, or even the aggregated address space of all snapshots of this ancestor volume. Often, a prolonged snapshot deletion process delays the creation of new snapshots, due to a limit on the number of snapshots or a limit on the amount of space available. This delay may significantly increase the RPO (Recovery Point Objective) when using snapshots to implement asynchronous replication.
Traditionally, memory is stored and accessed by assigning addresses to fixed size (e.g. 512B or 4 KB) chunks of data, as shown in FIG. 1. Each such address is in fact a number in some range, and the range indicates the amount of available memory to be written and read.
Thin Provisioning
Reference is now made to FIG. 2, which illustrates an enhancement of the above traditional storage known as thin provisioning. Under thinly provisioned systems, the range of addresses can be much larger than the actual amount of available memory. In this case, the address to data mapping array can be viewed as a sparse array, where many addresses do not have a value. The mapping can then be viewed as being from logical user address to internal physical location.
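By way of illustration only, the sparse mapping from logical user address to internal physical location may be modeled with a dictionary. The class `ThinVolume` and its methods are hypothetical names chosen for this sketch, and the allocation policy (first free location) is an assumption.

```python
class ThinVolume:
    """Sketch of a thinly provisioned volume backed by a sparse mapping."""

    def __init__(self, logical_blocks, physical_blocks):
        self.logical_blocks = logical_blocks       # advertised address range
        self.free = list(range(physical_blocks))   # actual available locations
        self.mapping = {}                          # logical address -> physical location
        self.store = {}                            # physical location -> data

    def write(self, addr, data):
        if not 0 <= addr < self.logical_blocks:
            raise IndexError("address outside the logical range")
        if addr not in self.mapping:
            if not self.free:
                raise RuntimeError("out of physical space")
            # Physical space is allocated only on the first write to an address.
            self.mapping[addr] = self.free.pop(0)
        self.store[self.mapping[addr]] = data

    def read(self, addr):
        loc = self.mapping.get(addr)
        # An address that was never written has no value in the sparse array.
        return None if loc is None else self.store[loc]


# A very large logical range backed by only four physical blocks.
thin = ThinVolume(logical_blocks=10**12, physical_blocks=4)
thin.write(999_999_999, b"payload")
```

Only the single written address occupies an entry in the mapping, regardless of the size of the logical range.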
Data Deduplication
Reference is now made to FIG. 3, which is a simplified diagram showing an enhancement of the thin provisioning of FIG. 2 to provide data deduplication. In some cases, identical chunks of data may appear under different addresses. To avoid holding two or more duplicate copies of the same data chunk (deduplication), another layer of indirection may be introduced. Instead of mapping multiple addresses to the same chunk of data, one can map addresses to much shorter hash digests of typically a few bytes, representing the 4 KB data chunks, and further map these hash digests to the internal physical location according to the actual data they represent.
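The two levels of mapping may be sketched as follows. This is a minimal illustration, not the described implementation: the use of SHA-1 as the digest function is an assumption made for convenience, the names `DedupStore`, `addr_to_digest` and `digest_to_loc` are hypothetical, and reference counting and reclamation of unused chunks are omitted.

```python
import hashlib


class DedupStore:
    """Sketch of address -> digest -> physical location indirection."""

    def __init__(self):
        self.addr_to_digest = {}   # logical address -> hash digest
        self.digest_to_loc = {}    # hash digest -> physical location
        self.physical = []         # physical locations holding unique chunks

    def write(self, addr, chunk):
        digest = hashlib.sha1(chunk).digest()  # assumed digest function
        if digest not in self.digest_to_loc:
            # First time this content is seen: store it exactly once.
            self.physical.append(chunk)
            self.digest_to_loc[digest] = len(self.physical) - 1
        self.addr_to_digest[addr] = digest

    def read(self, addr):
        digest = self.addr_to_digest[addr]
        return self.physical[self.digest_to_loc[digest]]


dedup = DedupStore()
chunk = b"\x00" * 4096
dedup.write(10, chunk)
dedup.write(20, chunk)         # identical data under a different address
dedup.write(30, b"\x01" * 4096)
```

Addresses 10 and 20 resolve to the same digest and hence to a single stored chunk, so three logical writes occupy only two physical locations.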
Snapshots
Reference is now made to FIG. 4 which illustrates thin provisioning as per FIG. 3 but with a snapshot made of the thinly provisioned memory source.
It is noted that the snapshot of FIG. 4 is in fact a duplication of the original address range, whether the original address range is fully provisioned or only thinly provisioned. A snapshot is taken of a specific address range at a certain point in time, and inherits all the mappings of that address range. Furthermore, it may overwrite mappings and diverge from the source range (or volume) it was taken from.
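The inheritance and subsequent divergence of mappings may be illustrated with the simplest possible model, in which taking a snapshot duplicates the whole mapping. This is a sketch only: the function `take_snapshot` and the location labels are hypothetical, and a full copy of the mapping consumes metadata in proportion to the data previously written, which is the kind of cost discussed above rather than a recommended design.

```python
def take_snapshot(volume_map):
    """Return a snapshot that inherits every mapping of the source range."""
    return dict(volume_map)


# Source address range: logical address -> physical location label.
source = {0: "loc_a", 1: "loc_b"}

snap = take_snapshot(source)

# A writable snapshot may overwrite a mapping and diverge from its source.
snap[1] = "loc_c"

# Later writes to the source do not affect the snapshot.
source[0] = "loc_d"
```

After these writes the snapshot still maps address 0 to the location it inherited, while address 1 has diverged in the snapshot and address 0 has diverged in the source.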