The present invention relates generally to data storage systems, and more particularly to systems and methods for improving storage efficiency, compactness, performance, reliability, and compatibility. Many data storage systems are tasked with handling enormous amounts of data. To protect their data, many organizations use backup systems to store multiple copies of important data on-site and/or off-site. Backup systems can create multiple backup data sets and/or snapshots, enabling organizations to maintain copies of their data as it existed at different times. Backup systems can also create incremental backups, which record the changes to an organization's data made after a previous full or incremental backup data set.
Organizations often prefer to store at least some backup data sets at a different location than their primary data center. This protects the organization's data from accidents or disasters occurring at the primary data center. However, maintaining computers and data storage at multiple locations is expensive and time-consuming. As an alternative, many organizations rely on a third party to provide off-site data storage. Cloud storage services are one type of off-site data storage. Cloud storage services are data storage services available via a wide-area network, and they provide storage to users in the form of a virtualized storage device available via the Internet. In general, users access cloud storage to store and retrieve data using web services protocols, such as REST or SOAP.
Cloud storage service providers manage the operation and maintenance of the physical data storage devices. Users of cloud storage can avoid the initial and ongoing costs associated with buying and maintaining storage devices. Cloud storage services typically charge users for consumption of storage resources, such as storage space and/or transfer bandwidth, on a marginal or subscription basis, with little or no upfront cost. In addition to the cost and administrative advantages, cloud storage services often provide dynamically scalable capacity to meet their users' changing needs.
Despite the cost and administrative advantages of cloud storage services, integrating cloud storage services with backup systems can be challenging. First, many backup systems require a specialized backup server at each backup site to store and maintain backup data sets. However, cloud storage services often provide only a virtualized data storage device to their users. Adding and maintaining a backup server at the cloud storage site, for example as a physical server or within a virtual machine, increases the cost and complexity of using the cloud storage service. Second, because a wide-area network typically has much lower bandwidth and higher latency than a local-area network, access to backup data sets in the cloud storage service is much slower than access to backup data sets stored locally. This can make some operations too slow to be practical. For example, creating a synthetic backup, which is a complete backup data set created by copying data from two or more previous backup data sets, including at least one incremental backup data set, is extremely slow when performed against a cloud storage service because of the limited performance of the wide-area network.
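By way of illustration only, the creation of a synthetic backup described above may be sketched as follows. The sketch assumes a simplified, hypothetical model in which each backup data set is an in-memory mapping from file path to file contents and in which an incremental backup data set marks a deleted file with a value of None. The function name synthesize_full_backup and this representation are assumptions made for the example and do not describe any particular backup system.

```python
from typing import Dict, Iterable, Optional

# Hypothetical model (an assumption for this sketch): a backup data set
# maps a file path to that file's contents. In an incremental data set,
# a value of None marks a file deleted since the previous backup.
FullBackup = Dict[str, bytes]
IncrementalBackup = Dict[str, Optional[bytes]]


def synthesize_full_backup(
    base: FullBackup,
    incrementals: Iterable[IncrementalBackup],
) -> FullBackup:
    """Build a complete backup data set from a previous full backup and
    one or more later incremental backups, applied oldest first."""
    # Copy every entry of the previous full backup data set.
    synthetic: FullBackup = dict(base)
    for incremental in incrementals:
        for path, data in incremental.items():
            if data is None:
                # The file was deleted after the previous backup.
                synthetic.pop(path, None)
            else:
                # The file was added or changed after the previous backup.
                synthetic[path] = data
    return synthetic
```

In this sketch, every entry of the resulting data set is copied from one of the previous backup data sets. If those data sets are held in a cloud storage service, each such copy is an access over the wide-area network, which is consistent with the slowness noted above.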