Networks and distributed storage allow data and storage space to be shared between devices located anywhere a connection is available. These implementations may range from a single machine offering a shared drive over a home network to an enterprise-class cloud storage array with multiple copies of data distributed throughout the world. Larger implementations may incorporate Network Attached Storage (NAS) devices, Storage Area Network (SAN) devices, and other configurations of storage elements and controllers in order to provide data and manage its flow. Improvements in distributed storage have given rise to a cycle where applications demand increasing amounts of data delivered with reduced latency, greater reliability, and greater throughput.
To provide this capacity, data storage systems evolved into increasingly complex systems. For example, some storage systems began to utilize one or more layers of indirection that allow connected systems to access data without concern for how the data is distributed among the storage devices. The connected systems issue transactions directed to a virtual address space that appears to be a single, contiguous device regardless of how many physical storage devices are incorporated into the virtual address space. It is left to the storage system to translate the virtual addresses into physical addresses and provide the physical address alone or in combination with the virtual address to the storage devices. RAID (Redundant Array of Independent/Inexpensive Disks) is one example of a technique for grouping storage devices into a virtual address space, although there are many others. In these applications and others, indirection hides the underlying complexity of the storage system from the connected systems and their applications.
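As an illustration of the translation step described above, the following is a minimal sketch of one possible mapping: simple round-robin striping across a group of devices. The names `PhysicalAddress`, `translate`, and `stripe_blocks` are hypothetical and chosen only for this example; actual storage systems may use very different layouts and translation tables.

```python
from typing import NamedTuple


class PhysicalAddress(NamedTuple):
    device: int  # index of the storage device within the group (assumed numbering)
    block: int   # block offset on that device


def translate(virtual_block: int, num_devices: int, stripe_blocks: int = 1) -> PhysicalAddress:
    """Map a block of the contiguous virtual address space to a device and block.

    Assumes plain striping: consecutive stripes of `stripe_blocks` blocks are
    placed on consecutive devices in round-robin order.
    """
    if virtual_block < 0:
        raise ValueError("virtual_block must be non-negative")
    if num_devices < 1 or stripe_blocks < 1:
        raise ValueError("num_devices and stripe_blocks must be positive")

    # Which stripe the block falls in, and where it sits inside that stripe.
    stripe_index, offset_in_stripe = divmod(virtual_block, stripe_blocks)
    # Which pass over the devices (row) the stripe belongs to, and which device holds it.
    row, device = divmod(stripe_index, num_devices)
    return PhysicalAddress(device, row * stripe_blocks + offset_in_stripe)


if __name__ == "__main__":
    # The connected system sees blocks 0..5 as one contiguous device.
    for vb in range(6):
        print(vb, translate(vb, num_devices=2, stripe_blocks=2))
```

In this sketch the connected system only ever supplies `virtual_block`; the number of devices and the stripe size remain internal to the storage system, which is the sense in which indirection hides the layout.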
RAID and other indirection techniques implement different recovery mechanisms for the event that a storage device fails. In one example, data in the RAID system is stored in multiple copies on different storage devices. In this way, if one storage device fails, the data can still be recovered from the copies stored on the other storage devices. In another example, the RAID system stores a parity function for the data on a storage device. The parity function is typically stored on a storage device other than the storage devices that store the data recoverable with the parity function. When a storage device fails, the parity function can be combined with the data stored on the functioning storage devices to recover the data on the storage device that failed. Such approaches, however, do not work when the storage devices that store the multiple copies of the data fail, or when the storage devices that store the data and the parity function fail.
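The parity-based recovery described above can be sketched as follows, assuming the parity function is a byte-wise XOR over equally sized blocks (a common choice, though not the only one). The helper names `xor_blocks`, `compute_parity`, and `recover` are hypothetical and used only for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a sequence of equally sized blocks."""
    blocks = list(blocks)
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of bytes per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def compute_parity(data_blocks):
    """Parity block, assumed to be stored on a device separate from the data."""
    return xor_blocks(data_blocks)


def recover(surviving_blocks, parity):
    """Rebuild a single missing data block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


if __name__ == "__main__":
    data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]  # one block per data device
    parity = compute_parity(data)

    # The device holding data[1] fails: combine the survivors with the parity.
    rebuilt = recover([data[0], data[2]], parity)
    print(rebuilt == data[1])
```

With a single XOR parity block, only one missing block can be rebuilt. If two data devices fail, the same computation yields only the XOR of the two lost blocks, not either block itself, which corresponds to the limitation noted above.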