Many techniques are used today to improve storage utilization by reducing the number of bytes that are stored. Received data is usually compared to the data that already exists in storage in order to identify duplicate copies. This may be done in real time, which is typically resource intensive, since each comparison must be made against the entire body of stored data. Alternatively, data is first stored and then, through a process known as de-duplication (“dedup”), duplicate copies are removed so that a single copy is maintained and referenced wherever the data is used. Another way to reduce the size of data is to apply lossless compression techniques.
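As a non-limiting illustration of the two reduction techniques described above, the following minimal Python sketch keeps a single losslessly compressed copy of each unique piece of data and lets multiple names reference it. The class `DedupStore`, its methods, and the choice of a SHA-256 digest index with zlib compression are assumptions made for illustration only; in particular, looking up a digest in an index is a substitute for comparing received data against all stored data, and is not a technique described in the text above.

```python
import hashlib
import zlib


class DedupStore:
    """Hypothetical store: one compressed copy per unique content, many references."""

    def __init__(self):
        self.blocks = {}  # content digest -> losslessly compressed bytes
        self.refs = {}    # logical name -> content digest

    def put(self, name: str, data: bytes) -> str:
        """Store data under name; duplicate content is stored only once."""
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blocks:
            # First time this content is seen: keep a compressed copy.
            self.blocks[digest] = zlib.compress(data)
        # Duplicate or not, the name simply references the single copy.
        self.refs[name] = digest
        return digest

    def get(self, name: str) -> bytes:
        """Return the original bytes referenced by name."""
        return zlib.decompress(self.blocks[self.refs[name]])
```

In this sketch, storing the same bytes under two different names leaves only one entry in `blocks`, while both names remain retrievable through `get`.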
While the challenge of storage is significant, the difficulty is compounded in storage systems that need to allow a user to “go back in time” and retrieve previously stored data that has subsequently been changed. Because it is inefficient to store every change to the data at every point in time (e.g., every 10 seconds), data snapshots are typically taken instead.
A data snapshot is taken at a particular point in time and saved in storage. On one hand, snapshots typically require less storage than continuously storing all of the data. On the other hand, snapshots provide only discrete points of return, each of which must be actively taken or scheduled. That is, any data that existed only between two snapshot points cannot be retrieved. As a non-limiting example, suppose a snapshot of a computer file system is taken daily at 08:00 and at 20:00. The file system can then be recovered only to these points in time on any particular day. If a failure occurs at 17:00, the most recent snapshot that can be utilized is the one taken at 08:00. As a result, any data saved to the file system between 08:00 and 17:00 would be lost.
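The limitation in the example above can be sketched in a few lines of Python. The function names `latest_snapshot_before` and `unrecoverable_window` are hypothetical and chosen for illustration only; the sketch assumes the snapshot times are provided in ascending order.

```python
from bisect import bisect_right
from datetime import datetime, timedelta
from typing import List, Optional


def latest_snapshot_before(snapshots: List[datetime],
                           failure_time: datetime) -> Optional[datetime]:
    """Return the most recent snapshot taken at or before failure_time.

    Assumes snapshots is sorted in ascending order. Returns None when
    no snapshot precedes the failure.
    """
    i = bisect_right(snapshots, failure_time)
    return snapshots[i - 1] if i else None


def unrecoverable_window(snapshots: List[datetime],
                         failure_time: datetime) -> Optional[timedelta]:
    """Return how much time of saved data is lost when recovering at failure_time."""
    snap = latest_snapshot_before(snapshots, failure_time)
    return None if snap is None else failure_time - snap
```

With snapshots at 08:00 and 20:00 on a given day and a failure at 17:00 on that day, `latest_snapshot_before` returns the 08:00 snapshot and `unrecoverable_window` returns nine hours, which matches the loss described in the example.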
It would therefore be advantageous to provide a solution for efficient time continuum data retrieval.