Cloud storage generally refers to a family of increasingly popular on-line services for archiving, backup, and even primary storage of files. Amazon S3 (Simple Storage Service) from amazon.com is a well-known example. Cloud storage providers offer users clean and simple file-system interfaces, abstracting away the complexities of direct hardware management. At the same time, though, such services eliminate the direct oversight of component reliability and security that enterprises and other users with high service-level requirements have traditionally expected.
A number of different approaches to verification of file availability and integrity have been developed in order to restore security assurances eroded by cloud environments. One such approach uses proofs of retrievability (PORs), described in A. Juels et al., “PORs: Proofs of retrievability for large files,” ACM CCS, pages 584-597, 2007. A POR is a challenge-response protocol that enables a prover (e.g., a cloud storage provider) to demonstrate to a verifier (e.g., a client or other user) that a file F is retrievable, i.e., recoverable without any loss or corruption. A POR uses file redundancy within a server for verification. The benefit of a POR over simple transmission of F is efficiency. The response can be highly compact (e.g., tens of bytes), and the verifier can complete the proof using a small fraction of F.
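The challenge-response flow described above can be illustrated with a minimal sketch. It is not the POR construction of Juels et al.; it is a simplified spot check that assumes the verifier has tagged each file block with a keyed MAC before upload. All names (`split_blocks`, `tag_block`, `Prover`, `make_challenge`, `verify`) and the block size are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

import hashlib
import hmac
import secrets

BLOCK_SIZE = 16  # bytes per block; an illustrative value, not from the source


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut the file F into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def tag_block(key: bytes, index: int, block: bytes) -> bytes:
    """MAC a block together with its index so blocks cannot be swapped."""
    message = index.to_bytes(8, "big") + block
    return hmac.new(key, message, hashlib.sha256).digest()


class Prover:
    """Stands in for the storage provider: holds blocks and their tags."""

    def __init__(self, blocks: list[bytes], tags: list[bytes]) -> None:
        self.blocks = list(blocks)
        self.tags = list(tags)

    def respond(self, indices: list[int]) -> list[tuple[bytes, bytes]]:
        """Return the (block, tag) pair for each challenged index."""
        return [(self.blocks[i], self.tags[i]) for i in indices]


def make_challenge(num_blocks: int, sample_size: int) -> list[int]:
    """Verifier picks distinct block indices at random."""
    rng = secrets.SystemRandom()
    return rng.sample(range(num_blocks), sample_size)


def verify(key: bytes, indices: list[int],
           response: list[tuple[bytes, bytes]]) -> bool:
    """Accept only if every returned block matches its recomputed MAC."""
    if len(response) != len(indices):
        return False
    return all(
        hmac.compare_digest(tag_block(key, i, block), tag)
        for i, (block, tag) in zip(indices, response)
    )
```

In this sketch the verifier keeps only the key and reads a small sample of blocks per challenge. If a fraction f of the blocks is corrupted, a challenge of c blocks misses the corruption with probability of roughly (1 - f)^c, which is why sampling a small part of F can still give useful assurance.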
Another approach is based on proofs of data possession (PDPs), described in G. Ateniese et al., “Provable data possession at untrusted stores,” ACM CCS, pages 598-609, 2007. A typical PDP detects a large fraction of file corruption, but does not guarantee file retrievability. Roughly speaking, a PDP provides weaker assurances than a POR, but potentially greater efficiency.
As standalone tools for testing file retrievability against a single server, though, PORs and PDPs are of limited value. Detecting that a file is corrupted is of little help if the file cannot be recovered from any other source, because the client then has no recourse.
Thus PORs and PDPs are mainly useful in environments where F is distributed across multiple systems, such as independent storage services. In such environments, F is stored in redundant form across multiple servers. A verifier can test the availability of F on individual servers via a POR or PDP. If it detects corruption within a given server, it can appeal to the other servers for file recovery.
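The detect-then-recover pattern in a multi-server setting can be sketched as follows. This is a minimal illustration under strong simplifying assumptions: each server holds a full replica of F, and a full-file SHA-256 comparison stands in for the POR or PDP, which in practice would test a server without downloading the whole file. The names `digest` and `check_and_repair` and the dictionary of servers are hypothetical.

```python
from __future__ import annotations

import hashlib


def digest(data: bytes) -> bytes:
    """Fingerprint of a file copy; stands in for a POR/PDP check."""
    return hashlib.sha256(data).digest()


def check_and_repair(servers: dict[str, bytes], expected: bytes) -> list[str]:
    """Test every server's copy and restore corrupted copies from a good one.

    Returns the names of the servers that were repaired, in iteration order.
    Raises RuntimeError if no server holds an intact copy.
    """
    healthy = [name for name, copy in servers.items()
               if digest(copy) == expected]
    if not healthy:
        raise RuntimeError("no intact copy remains on any server")

    good_copy = servers[healthy[0]]
    repaired = []
    for name in servers:
        if name not in healthy:
            servers[name] = good_copy  # appeal to a healthy server
            repaired.append(name)
    return repaired
```

The sketch also shows the limit of this approach: recovery works only while at least one server still holds an intact copy at the time of the check.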
Other conventional approaches provide distributed protocols that rely on queries across servers to check file availability. See, for example, M. Lillibridge et al., “A cooperative Internet backup scheme,” USENIX Annual Technical Conference, General Track 2003, pages 29-41, 2003, and T. Schwarz et al., “Store, forget, and check: Using algebraic signatures to check remotely administered storage,” International Conference on Distributed Computing Systems (ICDCS), 2006. In the Lillibridge et al. approach, blocks of a file F are dispersed across n servers using an (n,m)-erasure code (i.e., any m out of the n fragments are sufficient to recover the file). Servers spot-check the integrity of one another's fragments using message authentication codes (MACs). The Schwarz et al. approach ensures file integrity through distribution across multiple servers, using error-correcting codes and block-level file integrity checks. This approach employs keyed algebraic encoding and stream-cipher encryption to detect file corruption.
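The (n,m)-erasure property can be illustrated with the simplest code of this kind, a single XOR parity fragment, which gives n = m + 1. This is a sketch for intuition only and is not the code used by Lillibridge et al. or Schwarz et al.; the function names `xor_bytes`, `encode`, and `recover` are hypothetical.

```python
from __future__ import annotations


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, m: int) -> list[bytes]:
    """Split data into m data fragments plus one parity fragment (n = m + 1)."""
    frag_len = -(-len(data) // m)  # ceiling division
    padded = data.ljust(frag_len * m, b"\0")
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(m)]
    parity = bytes(frag_len)  # all-zero starting value
    for fragment in fragments:
        parity = xor_bytes(parity, fragment)
    return fragments + [parity]


def recover(fragments: list[bytes | None], length: int) -> bytes:
    """Rebuild the original data when at most one fragment is missing (None)."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity tolerates only one lost fragment")

    restored = list(fragments)
    if missing:
        present = [f for f in restored if f is not None]
        rebuilt = bytes(len(present[0]))
        for fragment in present:
            rebuilt = xor_bytes(rebuilt, fragment)  # XOR of the rest = lost one
        restored[missing[0]] = rebuilt

    # Drop the parity fragment and the zero padding.
    return b"".join(restored[:-1])[:length]
```

With m = 3, for example, `encode` produces four fragments that could be placed on four servers, and `recover` rebuilds the file from any three of them. Codes that tolerate more than one lost fragment, such as Reed-Solomon codes, follow the same any-m-of-n idea with more parity fragments.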
However, the various known approaches noted above are deficient in important respects. For example, none of these approaches adequately addresses the case of what we refer to herein as a “mobile adversary,” that is, one that is capable of progressively attacking storage providers and, in principle, ultimately corrupting all providers at different times.