Cloud data storage (CDS) is data storage made available as a service to a user over a network. A typical CDS system comprises storage nodes, such as a cluster of interconnected storage servers, made available to a client via a network (such as the Internet). In general, the design of CDS systems is governed by three basic considerations or tradeoffs: reliability, locality, and redundancy. First, the system should reliably store the data in a recoverable form such that no data is lost when up to a threshold number (“bounded number” or “bounds”) of storage nodes or machines in the CDS system data center fail or otherwise become unavailable. Second, for any combination of CDS system failures within the bounds, the data stored in the CDS system should be readily available and recoverable by accessing only a small number of other machines in the system (“locality”). Third, the system should optimize the overall size (and cost) of storage resources by minimizing the storage of redundant data.
Designing CDS systems that perform well with respect to all three competing considerations poses a substantial challenge. Conventional CDS systems employ a solution based on either replication or Reed-Solomon encoding (RSE). In the replication approach, each file is replicated and the copies are stored on different machines, which yields good reliability and locality but does little to minimize redundancy (thus leading to high costs). The RSE approach, on the other hand, groups pieces of data together into blocks that are encoded using an optimal erasure code (known as the Reed-Solomon code or RSC). This yields good reliability and redundancy but provides poor locality, since any data recovery necessarily involves a large number of machines.
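The tradeoff between the two conventional approaches can be made concrete with a little arithmetic. The following is a minimal sketch, not part of any described system: the names `Scheme`, `replication`, and `reed_solomon`, and the example parameters (three copies, six data blocks plus three parity blocks), are assumptions chosen for illustration. It relies only on the standard properties that a Reed-Solomon code with k data blocks and m parity blocks tolerates any m lost blocks and that conventional repair of one lost block reads k surviving blocks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    """Illustrative summary of a storage scheme (hypothetical helper type)."""
    name: str
    data_blocks: int         # blocks of original data per group
    stored_blocks: int       # blocks actually stored per group
    failures_tolerated: int  # lost blocks that can always be recovered
    repair_reads: int        # blocks read to rebuild one lost block (locality)

    @property
    def overhead(self) -> float:
        # Storage consumed per unit of original data.
        return self.stored_blocks / self.data_blocks


def replication(copies: int) -> Scheme:
    # Each block is stored `copies` times; any one surviving copy repairs a loss.
    return Scheme(f"{copies}x replication", 1, copies, copies - 1, 1)


def reed_solomon(k: int, m: int) -> Scheme:
    # k data blocks plus m parity blocks; conventional repair reads k blocks.
    return Scheme(f"RS({k}+{m})", k, k + m, m, k)


if __name__ == "__main__":
    for s in (replication(3), reed_solomon(6, 3)):
        print(f"{s.name}: overhead {s.overhead:.2f}x, "
              f"tolerates {s.failures_tolerated} failures, "
              f"reads {s.repair_reads} block(s) per repair")
```

Under these assumed parameters, replication stores three times the original data but repairs a lost block from a single machine, whereas the Reed-Solomon configuration stores only one and a half times the original data but must contact six machines for every repair.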
In addition, the nodes or machines of a CDS system are typically organized into clusters that constitute upgrade domains, where software and hardware upgrades are applied to all machines in a single domain at the same time, effectively rendering all data stored within that domain temporarily unavailable. For upgrade efficiency, optimal design considerations also require that the number of upgrade domains be relatively small. Consequently, a significant challenge for a CDS system is placing data (system data and encoded redundant data) onto a small number of upgrade domains in a manner that keeps data available when certain machines are inaccessible due to failures, even while an entire domain is inaccessible due to an upgrade.
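The placement constraint can be illustrated with a small check. This is a sketch under stated assumptions rather than a method taken from the text: it assumes a code that can recover from the loss of any `parity_blocks` blocks (as a Reed-Solomon code can), that each block of a group sits on a distinct machine, and that the function name `survives_upgrade` and its parameters are hypothetical. In the worst case, the upgraded domain is the one holding the most blocks of the group, and each additional machine failure removes one more block.

```python
from collections import Counter
from typing import Sequence


def survives_upgrade(placement: Sequence[int],
                     parity_blocks: int,
                     extra_failures: int) -> bool:
    """Return True if a block group stays recoverable in the worst case.

    placement[i] is the upgrade domain holding block i of the group.
    Assumes any `parity_blocks` lost blocks can be recovered, and that
    each block of the group is on its own machine.
    """
    # Worst case: the domain holding the most blocks is the one being upgraded.
    worst_domain_loss = max(Counter(placement).values())
    # Each extra machine failure costs at most one more block of this group.
    return worst_domain_loss + extra_failures <= parity_blocks


if __name__ == "__main__":
    # Nine blocks (six data, three parity) spread evenly over three domains.
    three_domains = [0, 0, 0, 1, 1, 1, 2, 2, 2]
    # The same nine blocks spread over nine domains, one block per domain.
    nine_domains = list(range(9))

    print(survives_upgrade(three_domains, parity_blocks=3, extra_failures=0))
    print(survives_upgrade(three_domains, parity_blocks=3, extra_failures=1))
    print(survives_upgrade(nine_domains, parity_blocks=3, extra_failures=2))
```

With only three domains, an upgrade alone consumes the entire failure budget of the assumed code, so a single additional machine failure makes data unavailable; spreading the same blocks over nine domains leaves room for two further failures but conflicts with the goal of keeping the number of upgrade domains small.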