1. Field of the Invention
The invention relates generally to maintaining data recovery. In particular, the present invention relates to data recovery in a distributed redundancy data storage system.
2. Background
In information technology (IT) systems, data is often stored on storage devices with redundancy to protect against component failures that would otherwise result in loss of data. Such data redundancy can be provided by simple data mirroring techniques or via erasure coding techniques. Erasure codes are the means by which storage systems are made reliable. In erasure coding, data redundancy is enabled by computing functions of user data, such as parity (exclusive OR) or more complex functions such as Reed-Solomon encoding. A Redundant Array of Inexpensive Disks (RAID) stripe configuration effectively groups capacity from all but one of the disk drives in a disk array and writes the parity (XOR) of that capacity on the remaining disk drive (or across multiple drives). When there is a failure, the data located on the failed drive is recovered by reconstruction using data from the remaining drives.
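The parity-based reconstruction described above can be illustrated with a minimal sketch in Python. It assumes a single-parity stripe held in memory as equal-length byte strings; the function names (xor_blocks, make_stripe, rebuild) are hypothetical and chosen only for illustration, and the sketch does not represent any particular RAID implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equal-length byte strings."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must be non-empty in number and equal in length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Build a stripe: the data blocks followed by one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe, failed_index):
    """Reconstruct the block at failed_index from the surviving blocks.

    Because parity is the XOR of all data blocks, the XOR of every
    surviving block (data and parity) equals the missing block.
    """
    survivors = [blk for i, blk in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)


# Hypothetical example data: three data "drives" of four bytes each.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
stripe = make_stripe(data)

# Simulate losing drive 1 and rebuilding it from the remaining drives.
recovered = rebuild(stripe, 1)
```

In this sketch the same rebuild routine also recovers a lost parity block, since XOR-ing the surviving data blocks simply recomputes the parity.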
In monolithic systems (e.g., a controller with two redundant processors where all the storage disks are accessible to both processors), data reconstruction can be more easily managed by one of the processors with full knowledge of events during the process, which simplifies recovery from error or interruption. However, in a distributed redundancy data storage system including a collection of loosely coupled processing nodes that do not share the same disks, there are many more components, less shared knowledge, and many more failure states and events. Here, “distributed” means that the system is a collection of nodes, and “redundant” means that the system must have erasure coding.
Most distributed storage systems perform recovery at a centrally designated node, or at a host device or client device. Recovery at a host or client enables recovery and coordination in a manner similar to the centralized approach. Such approaches do not fully leverage the processing power of all the nodes.