Enterprise storage systems currently available are proprietary storage appliances that integrate the storage controller functions and the storage media into the same physical unit. This centralized model makes it harder to independently scale the storage systems' capacity, performance and cost. Users can get tied to one expensive appliance without the flexibility of adapting it to different application requirements that may change over time. For small and medium scale enterprises, this may require a huge upfront capital cost. For larger enterprise datacenters, new storage appliances are added as the storage capacity and performance requirements increase. These appliances operate in silos and impose significant management overhead.
Enterprise storage systems can support snapshots. A snapshot is a read-only copy of a file at a given point in time. Snapshots have a variety of uses: recovering accidentally deleted files, reverting to a known good state of the file-system after corruption, data mining, backup, and more. Clones, or writeable snapshots, are an extension of the snapshot concept in which the snapshot can also be overwritten with new data. A clone can be used to create a point-in-time copy of an existing volume and to try out experimental software. In a virtualized environment (e.g., VMWARE, VSPHERE), the whole virtual machine can be snapshotted by dumping the entire virtual machine state (memory, CPU, disks, etc.) to a set of files.
Snapshots are supported by many file-systems. Clones are a more recent user requirement that is being widely adopted, especially in virtualized environments (e.g., virtual desktop infrastructure (VDI)). Continuous Data Protection (CDP) is a general name for storage systems that have the ability to keep the entire history of a volume. CDP systems support two main operations: (1) going back in time to any point in history in read-only mode; and (2) reverting to a previous point in time and continuing updates from there. CDP systems differ in the granularity of the history they keep. Some systems provide a granularity of every input/output (I/O) operation, others per second, and still others per hour. Supporting a large number of clones is one method of implementing coarse-granularity CDP.
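The two CDP operations can be illustrated with a minimal sketch at per-I/O granularity. The class `CdpVolume` and its methods are hypothetical names chosen for illustration, and the choice to discard later history on revert is an assumption; a real system might instead branch the history.

```python
from typing import List, Optional, Tuple


class CdpVolume:
    """Hypothetical per-I/O history log: every write is kept with a sequence number."""

    def __init__(self) -> None:
        # Each entry is (sequence number, block number, data), in write order.
        self.log: List[Tuple[int, int, bytes]] = []
        self._seq = 0

    def write(self, block_no: int, data: bytes) -> int:
        """Append a write to the history and return its point-in-time marker."""
        self._seq += 1
        self.log.append((self._seq, block_no, data))
        return self._seq

    def read_at(self, block_no: int, seq: int) -> Optional[bytes]:
        """Operation (1): read-only view of a block as of point `seq`."""
        result = None
        for entry_seq, entry_block, entry_data in self.log:
            if entry_seq > seq:
                break
            if entry_block == block_no:
                result = entry_data
        return result

    def revert_to(self, seq: int) -> None:
        """Operation (2): drop history after `seq` (an assumption of this
        sketch) so that new writes continue from that point."""
        self.log = [entry for entry in self.log if entry[0] <= seq]


volume = CdpVolume()
t1 = volume.write(0, b"a")
t2 = volume.write(0, b"b")
old_view = volume.read_at(0, t1)   # state of block 0 as of t1
volume.revert_to(t1)
t3 = volume.write(0, b"c")         # updates continue from the reverted state
```

A coarser granularity (per second or per hour) would keep fewer markers rather than one per write.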
There are many different known approaches for performing snapshots (e.g., redo logs, full snapshots/clones, linked clones, and refcounting). For example, redo logging involves creating a differencing disk for each snapshot taken. With redo logging, all new updates are logged into the differencing disk and the current disk is served in read-only mode. This approach has several limitations. First, for every read operation, the storage system has to look up whether the differencing disk has the data; if not, it looks for the data in its parent, and so on. Each of these lookup operations is an on-disk read in the worst case, which means that read performance degrades as the number of snapshots increases. Second, redo logging leads to an artificial limit on the number of snapshots that can be created, in order to make sure that performance does not degrade any further. Third, deletion of a snapshot is expensive because it involves consolidating the differencing disk into its parent, which is a costly operation and requires more space, at least temporarily. The consolidation is done in order to avoid the read performance degradation mentioned above. Finally, redo logging involves writing the same data again, which takes more space.
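The read path described above can be sketched as a walk along a chain of differencing disks. This is an illustrative in-memory model only; the names `Disk`, `take_snapshot`, `write_block`, and `read_block` are hypothetical, and each dictionary lookup stands in for what would be an on-disk read in the worst case.

```python
from typing import Dict, Optional, Tuple


class Disk:
    """A base disk or a differencing disk; `parent` is None for the base disk."""

    def __init__(self, parent: Optional["Disk"] = None) -> None:
        self.parent = parent
        self.blocks: Dict[int, bytes] = {}  # blocks written at this level only
        self.read_only = False


def take_snapshot(current: Disk) -> Disk:
    """Freeze the current disk and return a new differencing disk on top of it."""
    current.read_only = True
    return Disk(parent=current)


def write_block(disk: Disk, block_no: int, data: bytes) -> None:
    """New updates are only accepted by the newest (writable) disk."""
    if disk.read_only:
        raise PermissionError("disk is frozen by a snapshot")
    disk.blocks[block_no] = data


def read_block(disk: Disk, block_no: int) -> Tuple[Optional[bytes], int]:
    """Return (data, lookups), searching from the newest disk toward the base."""
    lookups = 0
    level: Optional[Disk] = disk
    while level is not None:
        lookups += 1
        if block_no in level.blocks:
            return level.blocks[block_no], lookups
        level = level.parent
    return None, lookups


base = Disk()
write_block(base, 0, b"original")
current = base
for _ in range(3):
    current = take_snapshot(current)

# Block 0 was never rewritten, so the read visits every disk in the chain.
data, lookups = read_block(current, 0)
```

In this model the number of lookups for an unmodified block grows linearly with the number of snapshots, which is the degradation the paragraph describes.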
Another way to support snapshots is to create a complete new copy of the file/volume. This is expensive and consumes a lot of space. Compression and deduplication could still save space, but the creation time is the main bottleneck for this approach.
Using linked clones is another mechanism for supporting clones of virtual machines, especially in VDI environments. Linked clones are supported on top of the redo-log-based snapshots described above. Linked clones therefore have the same set of problems as redo logging, except for the consolidation problem. In addition, as the differencing disk grows with new data, all the space saved by using linked clones is lost. They are mostly used in VDI environments.
Finally, refcounting-based snapshots involve bumping the refcount of each block in the system at snapshot creation (with the optimization as mentioned in the paper). When a block of the snapshotted file is modified, the system does a copy-on-write or a redirect-on-write, depending upon the implementation and layout. Such an implementation does not work well as is for hybrid distributed storage, as it would involve syncing the whole write log to disk before taking the snapshot or clone. Also, some of the solutions only provide volume-level snapshots or clones.
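A minimal sketch of the refcounting idea follows, assuming whole-block writes and a naive per-block increment at snapshot time (without the optimization referred to above). All names here (`BlockStore`, `File`, `snapshot`, `write`, `read`) are hypothetical and not taken from any particular system.

```python
from typing import Dict, List


class BlockStore:
    """Hypothetical block store that tracks how many files reference each block."""

    def __init__(self) -> None:
        self.data: Dict[int, bytes] = {}
        self.refcount: Dict[int, int] = {}
        self._next_id = 0

    def alloc(self, payload: bytes) -> int:
        block_id = self._next_id
        self._next_id += 1
        self.data[block_id] = payload
        self.refcount[block_id] = 1
        return block_id

    def decref(self, block_id: int) -> None:
        self.refcount[block_id] -= 1
        if self.refcount[block_id] == 0:
            # No file references the block any more, so reclaim it.
            del self.data[block_id]
            del self.refcount[block_id]


class File:
    def __init__(self, blocks: List[int]) -> None:
        self.blocks = blocks  # ordered list of block ids


def snapshot(store: BlockStore, source: File) -> File:
    """Share every block with the snapshot by bumping its refcount."""
    for block_id in source.blocks:
        store.refcount[block_id] += 1
    return File(list(source.blocks))


def write(store: BlockStore, target: File, index: int, payload: bytes) -> None:
    """Write a whole block; a shared block is replaced rather than overwritten."""
    old = target.blocks[index]
    if store.refcount[old] > 1:
        target.blocks[index] = store.alloc(payload)
        store.decref(old)
    else:
        store.data[old] = payload  # sole owner, safe to update in place


def read(store: BlockStore, source: File, index: int) -> bytes:
    return store.data[source.blocks[index]]


store = BlockStore()
live = File([store.alloc(b"a"), store.alloc(b"b")])
snap = snapshot(store, live)
write(store, live, 0, b"x")  # the snapshot keeps the old block 0
```

Snapshot creation in this sketch touches every block of the file, which is why creation cost grows with file size unless the refcount updates are optimized.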
Other solutions that do offer file-level snapshots/clones limit the number of snapshots or clones that can be created. In addition, in these solutions creation or deletion is an expensive operation that imposes additional overhead on the storage system.