As the amount and the emotional value of data, such as digital media, stored by consumers grow, so does their need to store this data reliably over extended periods of time. One way to accomplish this is to store the data across a set of distributed storage resources. For example, users may store their data locally on one or more separate devices and/or remotely using one or more hosted storage facilities or services. The advantage of storing data in more than one location is that the data is more likely to be available at the time that it is requested.
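The availability benefit of storing data in more than one location can be illustrated with a short calculation. The following is a minimal sketch only, under the simplifying assumption (not stated above) that each storage location is reachable independently with the same probability; the function name `availability` and the 0.95 figure are hypothetical and chosen purely for illustration.

```python
def availability(per_location: float, copies: int) -> float:
    """Probability that at least one of `copies` locations is reachable.

    Assumes (for illustration only) that locations fail independently and
    that each one is reachable with probability `per_location`.
    """
    if not 0.0 <= per_location <= 1.0:
        raise ValueError("per_location must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The data is unavailable only if every copy is unreachable at once.
    return 1.0 - (1.0 - per_location) ** copies


if __name__ == "__main__":
    # Hypothetical per-location availability of 95%.
    for n in (1, 2, 3):
        print(n, round(availability(0.95, n), 6))
```

Under these assumptions, each additional copy multiplies the probability of the data being unavailable by the per-location failure probability. Real devices and services do not fail independently, so the figures are indicative rather than predictive.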
Various services exist to synchronize and back up data across multiple storage resources. For example, a distributed virtual disk system has been conceived, known as Petal, that provides for virtualization at the block level and allows a single logical volume to span multiple physical storage devices. However, while this approach would allow implementation of scalable storage by growing the logical volume when new disks are added, it has a number of serious drawbacks. For instance, because file system metadata is not replicated throughout the system, the removal or crash of any physical disk is likely to cause some irrecoverable damage to, or loss of, the metadata. This may cause the entire logical volume to be unusable. Also, a single software error (e.g. a bug in the file system) may suddenly corrupt the entire logical volume. In addition, Petal is not designed to allow a user to selectively specify a desired reliability target at the individual object level (e.g. file or directory), since the virtualization layer of Petal views data in terms of blocks but not files or directories. Finally, if a physical disk is removed and mounted on another system in isolation, the disk will not have a self-contained file system, and thus the information stored on the disk will be unreadable to consumers.
Another distributed file system, known as Farsite, uses multiple physical disks attached to machines interconnected by a local area network (LAN). However, the implementation of Farsite is highly complex; a complete re-implementation of the file system is required. Achieving a high level of reliability for a new file system typically takes several years. In addition, individual physical disks in Farsite may not have a consumer-readable self-contained file system and thus may not be readable in isolation, e.g. when plugged into another machine.
Yet another way to create a distributed file system is to construct a single namespace out of multiple disjoint file system volumes using one of the existing mechanisms in operating systems such as the Microsoft WINDOWS line of operating systems. However, partitioning of the single namespace across the multiple physical disks may be too coarse and may make certain operations difficult and inefficient. For instance, if one of the disks becomes full, then the system would need to move an entire directory sub-tree to another disk. This may consume an inordinate amount of disk bandwidth. It is also difficult to selectively specify the desired reliability target at the individual object level (e.g. the file or directory), given that all files and directories placed on a physical disk typically share the same reliability settings. In addition, this approach requires a volume-level mirroring solution (RAID), with all the well-known drawbacks that such an approach has for consumer scenarios.
Accordingly, new solutions are needed to provide consumers with a way to easily, reliably, efficiently, and flexibly maintain their data.