Entities often generate and use data that is important in some way to their operations. This data can include, for example, business data, financial data, and personnel data. If this data were lost or compromised, the entity could suffer significant financial and other adverse consequences. Accordingly, many entities have chosen to back up some or all of their data so that in the event of a natural disaster, unauthorized access, or other event, the entity can recover any data that was compromised or lost, and then restore that data to one or more locations, machines, and/or environments.
Increasingly, entities have chosen to back up their important data using cloud-based storage. The cloud-based approach to backup has proven attractive because it can reduce, or eliminate, the need for the entity to purchase and maintain its own backup hardware. Cloud-based storage is also flexible in that it can enable users anywhere in the world to access the data stored in the cloud datacenter. In addition, the user data is protected from a disaster at the user location because the user data is stored in the cloud datacenter, rather than on backup hardware at the user location.
While advantageous in certain regards, the use of cloud-based storage can present some problems. Some of these problems are related to the way in which data is stored. To illustrate, relatively large files are often backed up in cloud-based storage. Because it is typically not feasible to back up an entire new version of the file each time the file is changed, incremental backups can be employed after the baseline backup of the file is performed. The incremental backups reflect only the changed portions of the file. Such incremental backups tend to accumulate over time because the large size of the baseline file is a disincentive to performing a new full backup that incorporates all of the changes.
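By way of illustration only, the relationship between a baseline backup and an incremental backup can be sketched as follows. The fixed block size, the dictionary layout, and the names `split_blocks` and `make_incremental` are assumptions made for this example and do not describe any particular backup product.

```python
# Illustrative sketch: an incremental backup records only the blocks that
# differ from the previous version. BLOCK_SIZE is a hypothetical value,
# kept tiny so the example is easy to follow.
BLOCK_SIZE = 4


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Split data into consecutive blocks of at most block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def make_incremental(previous: bytes, current: bytes,
                     block_size: int = BLOCK_SIZE) -> dict:
    """Return the new file length and only the blocks that changed."""
    old_blocks = split_blocks(previous, block_size)
    new_blocks = split_blocks(current, block_size)
    changed = {
        index: block
        for index, block in enumerate(new_blocks)
        # A block is "changed" if it is new or differs from the old block.
        if index >= len(old_blocks) or old_blocks[index] != block
    }
    return {"length": len(current), "blocks": changed}


baseline = b"AAAABBBBCCCCDDDD"
version_1 = b"AAAAXXXXCCCCDDDD"   # only the second block was modified
incremental_1 = make_incremental(baseline, version_1)
```

In this sketch the incremental holds a single four-byte block rather than the full sixteen-byte file, which is why incrementals are inexpensive to create and why they tend to accumulate.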
If a locally stored version of the file experiences problems, an earlier version of the file can be restored locally using the original backup version and the accumulated incremental backups. While relatively straightforward in principle, this approach to restoration is problematic as a practical matter.
In particular, performance of a full local restore would first require local restoration of the baseline file that was initially backed up. Depending upon the size of the file and the capacity of the communication line connecting the user with the datacenter, this process can be unacceptably long. For example, it can take a significant amount of time, and communication bandwidth, to restore large files such as a database, mailbox, or virtual machine disk file. Once the baseline backup is fully restored, the various incrementals would then have to be applied to that backup, in order, to locally obtain a recent version of the file. This process can also be quite lengthy. Depending upon the number and size of the incrementals, which could span a period of months or longer, application of the incrementals to the restored baseline may be quite time consuming.
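The conventional restore sequence described above, together with a rough estimate of the baseline transfer time, can be sketched as follows. This is a minimal illustration under stated assumptions: the incremental layout (a new length plus a map of changed blocks), the function names, and the example file size and line speed are all hypothetical, and the time estimate ignores protocol overhead and latency.

```python
# Illustrative sketch of a conventional full restore: fetch the whole
# baseline, then apply every incremental in chronological order.

def estimate_transfer_seconds(size_bytes: int, bits_per_second: float) -> float:
    """Idealized time to move size_bytes over a link of the given speed."""
    return size_bytes * 8 / bits_per_second


def apply_incremental(data: bytes, incremental: dict, block_size: int) -> bytes:
    """Apply one incremental (new length + changed blocks) to data."""
    buf = bytearray(data)
    length = incremental["length"]
    if len(buf) < length:
        buf.extend(b"\x00" * (length - len(buf)))   # file grew
    else:
        del buf[length:]                            # file shrank or unchanged
    for index, block in incremental["blocks"].items():
        start = index * block_size
        buf[start:start + len(block)] = block       # overwrite changed block
    return bytes(buf)


def restore_version(baseline: bytes, incrementals: list[dict],
                    block_size: int) -> bytes:
    """Rebuild a version by replaying the incremental chain on the baseline."""
    data = baseline
    for incremental in incrementals:                # order matters
        data = apply_incremental(data, incremental, block_size)
    return data


# Hypothetical figures: a 100 GB baseline over a 100 Mbit/s line.
seconds = estimate_transfer_seconds(100 * 10**9, 100 * 10**6)   # 8000.0

chain = [
    {"length": 16, "blocks": {1: b"XXXX"}},
    {"length": 20, "blocks": {3: b"YYYY", 4: b"EEEE"}},
]
restored = restore_version(b"AAAABBBBCCCCDDDD", chain, block_size=4)
# restored == b"AAAAXXXXCCCCYYYYEEEE"
```

Under these assumed figures, the baseline transfer alone takes more than two hours before any incremental is applied, and the replay loop grows with the length of the accumulated chain.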
In light of problems and shortcomings such as those noted above, it would be useful to be able to locally restore a file without the necessity of transmitting and restoring the entire baseline backup of the file. As well, it would be desirable to be able to locally restore a particular version of the file. Finally, it would be useful to be able to locally restore a file using information that is based on the incremental backups of that file.