Entities often generate and use data that is important to their operations. This data can include, for example, business data, financial data, and personnel data. If this data were lost or compromised, the entity could suffer significant financial and other adverse consequences. Accordingly, many entities have chosen to back up some or all of their data so that, in the event of a natural disaster, unauthorized access, or other event, the entity can recover any data that was compromised or lost and then restore that data to one or more locations, machines, and/or environments.
Increasingly, entities have chosen to back up their important data using cloud-based storage. The cloud-based approach to backup has proven attractive because it can reduce, or eliminate, the need for the entity to purchase and maintain its own backup hardware. Cloud-based storage is also flexible in that it can enable users anywhere in the world to access the data stored in the cloud datacenter. In addition, the user data is protected from a disaster at the user location because the data is stored in the cloud datacenter rather than on backup hardware at the user location.
While advantageous in certain regards, the use of cloud-based storage has introduced some new problems. For example, some cloud-based storage systems and services require that a user download an entire file from the datacenter to the local user machine before the user can fully access that file. Depending upon the size of the file and the capacity of the communication line connecting the user with the datacenter, this process can take an unacceptably long time. For example, restoring a database, mailbox, or virtual machine disk file in its entirety can take a significant amount of time.
Moreover, there may be no need to restore the entire file to the local user machine. This circumstance can arise where, for example, a dataset that is only a subset of a larger dataset is adequate for the purposes of the user. To illustrate, a user may need to restore only a particular email, and not the entire mailbox that includes the email.
In light of problems and shortcomings such as those noted above, it would be useful to be able to store a dataset in such a way that individual portions of the dataset are independent of each other. It would also be useful to be able to map and track changes associated with the configuration and location of those individual portions so that one or more selected portions can be retrieved, on an individual basis if called for. Further, it would be useful for a requestor to be able to specify which portion or portions of a stored dataset the requestor wishes to retrieve. Finally, it would be useful to be able to provide these functions, among others, in a variety of scenarios and use cases, examples of which include disaster recovery, and live access to databases, email repositories such as mailboxes, and other data sources of various sizes and types.
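By way of illustration only, the desired functions noted above might be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular implementation: the `PortionStore` class, its `put` and `get` methods, and the portion names are all hypothetical, and an in-memory dictionary stands in for the cloud datacenter. Each portion is stored independently, a map tracks where each named portion currently resides, and a requestor names only the portions to be retrieved.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Iterable


@dataclass
class PortionStore:
    """Hypothetical in-memory stand-in for storage at a cloud datacenter."""

    # Independently stored portions, keyed by a content-derived identifier.
    blobs: Dict[str, bytes] = field(default_factory=dict)
    # Map from a portion's logical name to the identifier of its current content.
    portion_map: Dict[str, str] = field(default_factory=dict)

    def put(self, name: str, data: bytes) -> str:
        """Store one portion on its own and record its location in the map."""
        portion_id = hashlib.sha256(data).hexdigest()
        self.blobs[portion_id] = data
        # Re-storing a name with changed data repoints the map to the new content.
        self.portion_map[name] = portion_id
        return portion_id

    def get(self, names: Iterable[str]) -> Dict[str, bytes]:
        """Retrieve only the portions the requestor specifies."""
        return {name: self.blobs[self.portion_map[name]] for name in names}


# Assumed example: a mailbox stored as two independent portions (emails).
store = PortionStore()
store.put("mailbox/inbox/msg-001", b"Subject: Q3 budget")
store.put("mailbox/inbox/msg-002", b"Subject: Lunch?")

# The requestor restores a single email rather than the entire mailbox.
restored = store.get(["mailbox/inbox/msg-002"])
```

In this sketch, the amount of data returned depends on the portions requested rather than on the size of the dataset as a whole, which is the property that makes restoring a single email from a large mailbox practical.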