The disclosed subject matter relates to techniques for distributed storage, including a local storage layer, a distributed storage layer, and a cloud storage layer.
Certain conventional secure data storage solutions can be difficult to use and difficult to maintain. IT hardware and staff can be expensive and fail regularly, and such failures can result in the loss of data. In connection with certain data storage solutions, data can be difficult or impossible to access remotely.
Conventional data storage products can be categorized into “Local Storage,” “Network Storage Servers,” “Web-Based Storage Services,” and “Distributed Storage Systems.” Each category can have relative benefits and drawbacks with regard to providing reliable, scalable, secure, and fault-tolerant storage for small- to medium-sized office environments. Local hard disk drives in desktop computers are a common place to store a file. Local storage can provide high-performance read/write times and a sense of tangible security, but if a disk fails or is destroyed, or if data is accidentally or maliciously overwritten, the data no longer exists. Moreover, when hard drives fill up, users may attempt to manually manage storage space, deleting files or attempting to transfer them to another machine, a process requiring both time and expertise. Manually sharing files with colleagues can create multiple incoherent versions, and emailing files can be insecure.
Network storage servers can be used to provide shared storage space for users in an organization. Commercially available network storage servers range from low- to mid-range ‘box in the closet’ approaches (called ‘Network Attached Storage’ or NAS) to high-end, Fibre Channel Storage Area Networks (SANs) packed with blade servers and redundant disk arrays (e.g., RAID). Network storage servers can provide high-capacity storage for users to share, but nevertheless can suffer from a number of well-known problems of centralized storage: servers can be broken, tampered with, hacked, and stolen; they can be destroyed in a fire or ruined by coffee; users can still overwrite or delete files by accident; and all data is saved in the same way as on a desktop's hard drive.
While certain techniques are known for ameliorating these problems, including replicating data in a remote location, utilizing redundant disk arrays, and encryption, such techniques can still include various drawbacks, including increased locations from which unauthorized access may occur, increased expense and complication, and reduced speed and convenience. Additionally, although network storage servers provide file sharing and high-capacity storage, they can be expensive to maintain and administer.
Web-based data storage services provide an inexpensive means of backing up and storing data over the Internet. Amazon S3, Apple iDisk, EMC Mozy, and Pro SoftNet iDrive are examples of such services. Some users may, however, be wary of routinely sending their sensitive information over the Internet, for example due to perceived weaker protection from digital search and seizure for data stored with a third party. While web-based storage can generally serve as a reliable backup service, it can require a constant, fast Internet connection, and can be too slow to be considered a realistic alternative for day-to-day file access and storage.
Distributed storage techniques can include storing a file multiple times on multiple machines, which can spread the burden and the risk of data storage: the more copies of a file exist, the less likely the file is to be lost. More copies, however, mean more places to steal from, so encryption systems can be required for sensitive data or environments. Moreover, certain existing distributed storage systems can provide low levels of reliability and performance.
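By way of illustration only, the relationship between the number of copies and the likelihood of loss can be sketched as follows. The sketch assumes that each machine fails independently with the same probability, which is a simplifying assumption not drawn from any particular system; the function name `loss_probability` and the example failure probability are hypothetical.

```python
def loss_probability(p_fail: float, copies: int) -> float:
    """Probability that every copy of a file is lost.

    Assumes (hypothetically) that each machine holding a copy fails
    independently with the same probability p_fail.
    """
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The file is lost only if all copies are lost: p_fail ** copies.
    return p_fail ** copies


if __name__ == "__main__":
    # Illustrative per-machine failure probability (an assumed value).
    p = 0.05
    for n in (1, 2, 3):
        # Each additional copy lowers the loss probability, but also adds
        # one more location from which the data could be stolen.
        print(f"copies={n} loss_probability={loss_probability(p, n):.6f} "
              f"locations_to_protect={n}")
```

Under these assumptions, each additional copy reduces the probability of loss by a factor of the per-machine failure probability, while the number of locations to be protected grows linearly with the number of copies.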
Accordingly, there is a need for enhanced techniques for distributed storage.