The present application relates generally to an improved data processing apparatus and method and more specifically to mechanisms for enhanced snapshot performance, improved storage efficiency, and dynamic snapshot policy in an object storage environment that supports erasure coding.
In computer systems, a snapshot is the state of a system at a particular point in time. A snapshot can refer to an actual copy of the state of a system or to a capability provided by certain systems. A full backup of a large data set may take a long time to complete. On multi-tasking or multi-user systems, there may be writes to that data while it is being backed up. This prevents the backup from being atomic and introduces a version skew that may result in data corruption. For example, if a user moves a file into a directory that has already been backed up, then that file would be completely missing on the backup media, since the backup operation had already taken place before the addition of the file. Version skew may also cause corruption with files which change their size or contents underfoot while being read.
One approach to safely backing up live data is to temporarily disable write access to data during the backup, either by stopping the accessing applications or by using the locking application programming interface (API) provided by the operating system to enforce exclusive read access. This is tolerable for low-availability systems, e.g., on desktop computers and small workgroup servers, on which regular downtime is acceptable. High-availability 24/7 systems, however, cannot bear service stoppages.
To avoid downtime, high-availability systems may instead perform the backup on a snapshot, i.e., a read-only copy of the data set frozen at a point in time, and allow applications to continue writing to their data. Most snapshot implementations are efficient such that the time and I/O needed to create the snapshot do not increase with the size of the data set. By contrast, the time and I/O required for a direct backup are proportional to the size of the data set. In some systems, once the initial snapshot is taken of a data set, subsequent snapshots copy only the changed data and use a system of pointers to reference the initial snapshot. This method of pointer-based snapshots consumes less disk capacity than if the data set were repeatedly cloned.
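The pointer-based scheme described above may be illustrated with a minimal sketch. The class `SnapshotStore` and its methods are hypothetical names introduced only for this illustration and are not taken from any particular system. The sketch assumes block-level change tracking, and it models only the incremental behavior of subsequent snapshots, not the constant-time creation of the initial snapshot.

```python
class SnapshotStore:
    """Toy pointer-based snapshot store (hypothetical, for illustration only)."""

    def __init__(self, blocks):
        # Live data set: block index -> block contents.
        self.live = dict(enumerate(blocks))
        # Blocks changed since the last snapshot (initially all of them).
        self.dirty = set(self.live)
        # Each snapshot holds only its changed blocks plus a parent pointer.
        self.snapshots = []

    def write(self, index, data):
        # Applications keep writing to the live data set.
        self.live[index] = data
        self.dirty.add(index)

    def snapshot(self):
        # Copy only the blocks changed since the previous snapshot.
        parent = len(self.snapshots) - 1 if self.snapshots else None
        changed = {i: self.live[i] for i in self.dirty}
        self.snapshots.append({"parent": parent, "blocks": changed})
        self.dirty.clear()
        return len(self.snapshots) - 1

    def read(self, snap_id, index):
        # Follow parent pointers until a snapshot that holds the block is found.
        while snap_id is not None:
            snap = self.snapshots[snap_id]
            if index in snap["blocks"]:
                return snap["blocks"][index]
            snap_id = snap["parent"]
        raise KeyError(index)


store = SnapshotStore(["a", "b", "c"])
s0 = store.snapshot()   # initial snapshot holds all three blocks
store.write(1, "B")     # live data changes after the snapshot
s1 = store.snapshot()   # second snapshot holds only the changed block
```

In this sketch the second snapshot stores a single block and resolves every other block through its pointer to the initial snapshot, which is why it consumes less capacity than a full clone.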
Erasure coding (EC) is a method of data protection in which data is broken into fragments, expanded and encoded with redundant data pieces and stored across a set of different locations or storage media. The goal of erasure coding is to enable data that becomes corrupted at some point in the disk storage process to be reconstructed by using information about the data that's stored elsewhere in the array. Erasure codes are often used instead of traditional redundant array of independent disks (RAID) because of their ability to reduce the time and overhead required to reconstruct data. The drawback of erasure coding is that it can be more processor-intensive, and that can translate into increased latency. Erasure coding can be useful with large quantities of data and any applications or systems that need to tolerate failures, such as disk array systems, data grids, distributed storage applications, object stores and archival storage. One common current use case for erasure coding is object-based cloud storage.
Erasure coding creates a mathematical function to describe a set of numbers so they can be checked for accuracy and recovered if one is lost. Referred to as polynomial interpolation or oversampling, this is the key concept behind erasure codes. In mathematical terms, the protection offered by erasure coding can be represented in simple form by the following equation: n=k+m. The variable “k” is the original amount of data or symbols. The variable “m” stands for the extra or redundant symbols that are added to provide protection from failures. The variable “n” is the total number of symbols created after the erasure coding process. For instance, in a 10 of 16 configuration, or EC 10/16, six extra symbols (m) would be added to the 10 base symbols (k). The 16 data fragments (n) would be spread across 16 drives, nodes or geographic locations. The original data could be reconstructed from any 10 verified fragments.
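The EC 10/16 example above may be illustrated with a minimal sketch of polynomial interpolation. The function names `interpolate`, `encode`, and `decode` and the prime modulus 257 are assumptions chosen for readability. Production erasure codes typically use Reed-Solomon arithmetic over GF(2^8) rather than a prime field, so this is a toy illustration and not an implementation of any particular product.

```python
P = 257  # prime modulus (an assumption of this sketch); data symbols are bytes 0..255


def interpolate(points, x):
    """Evaluate at x, modulo P, the unique polynomial of degree < len(points)
    passing through the given (x, y) points (Lagrange interpolation)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8 or later).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, n):
    """Return n fragments (x, y): the k data symbols at x = 1..k followed by
    m = n - k redundant symbols at x = k+1..n on the same polynomial."""
    k = len(data)
    points = list(enumerate(data, start=1))
    # Redundant symbols may take the value 256, which does not fit in a byte;
    # this is one reason real codes work over GF(2^8) instead.
    return points + [(x, interpolate(points, x)) for x in range(k + 1, n + 1)]


def decode(fragments, k):
    """Rebuild the k original symbols from any k surviving fragments."""
    points = fragments[:k]
    return [interpolate(points, x) for x in range(1, k + 1)]


data = list(b"snapshot10")         # k = 10 base symbols
fragments = encode(data, 16)       # n = 16 fragments, m = 6 redundant
survivors = fragments[6:]          # lose the first six fragments
recovered = decode(survivors, 10)  # rebuild from the remaining ten
```

Here six of the sixteen fragments are discarded, including six of the original data symbols, and the remaining ten are sufficient to rebuild the data, consistent with n=k+m for k=10 and m=6.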