1. Field of the Invention
The invention relates generally to maintaining data reliability. In particular, the present invention relates to parity coherency in data storage.
2. Background
In information technology (IT) systems, data is often stored with redundancy to protect against data loss caused by component failures. Such data redundancy can be provided by simple data mirroring techniques or via erasure coding techniques. Erasure codes are the means by which storage systems are made reliable. In erasure coding, data redundancy is enabled by computing functions of user data, such as parity (exclusive OR) or more complex functions such as Reed-Solomon encoding. A Redundant Array of Inexpensive Disks (RAID) stripe configuration effectively groups capacity from all but one of the disk drives in a disk array and writes the parity (XOR) of that capacity on the remaining disk drive (or across multiple drives). When a drive fails, the data located on the failed drive is reconstructed using data from the remaining drives.
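By way of illustration only, the XOR parity and reconstruction described above can be sketched as follows. This is a minimal sketch, assuming three data drives and one parity drive, with each drive's strip modeled as a short byte string; the helper name `xor_blocks` and the sample values are hypothetical and do not appear in the original text.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical strips held on three data drives of one stripe.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]

# The parity strip is the XOR of all data strips.
parity = xor_blocks(data)

# Simulate losing drive 1: XOR the surviving data strips with the parity strip.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

assert rebuilt == data[1]
```

Because XOR is its own inverse, the same function serves both to compute parity and to rebuild any single missing strip.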
When data is updated by a host device, the redundancy data (parity) must also be updated atomically to maintain consistency of data and parity for data reconstruction or recovery as needed. In most cases, such updates can be time consuming, as they usually involve many storage device accesses. To mitigate this effect, a redundancy system may employ a write-back or “fast write” capability, wherein one or more copies of new host write data (i.e., host data and one or more copies thereof) are written to independent cache components of the system. The write is acknowledged as complete to the host, and the parity updates are delayed to a more suitable time (e.g., at de-stage time of new write data).
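By way of illustration only, the need for an atomic update of data and parity can be sketched as follows. This is a minimal sketch, assuming single XOR parity over two data strips and the commonly used read-modify-write relation (new parity = old parity XOR old data XOR new data); the names `xor_bytes` and `updated_parity` and the sample values are hypothetical and do not appear in the original text.

```python
def xor_bytes(a, b):
    """Return the bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def updated_parity(old_parity, old_data, new_data):
    """Read-modify-write: remove the old data from parity, then add the new data."""
    return xor_bytes(old_parity, xor_bytes(old_data, new_data))

# Hypothetical stripe with two data strips and one parity strip.
d0_old, d1 = b"\x0f\x0f", b"\xf0\x01"
parity_old = xor_bytes(d0_old, d1)

# The host overwrites strip d0.
d0_new = b"\xaa\x55"
parity_new = updated_parity(parity_old, d0_old, d0_new)

# Data and parity both updated: d1 can still be reconstructed.
assert xor_bytes(d0_new, parity_new) == d1

# Data updated but parity left stale: reconstruction of d1 yields wrong data.
assert xor_bytes(d0_new, parity_old) != d1
```

The second assertion shows the hazard: if an interruption occurs after the data is written but before the parity is updated, a later reconstruction silently returns incorrect data.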
In monolithic systems (e.g., a controller with two redundant processors where all the storage disks are accessible to both processors), atomic parity update can be more easily managed by one of the processors with full knowledge of events during the process. Recovery from error or interruption is simplified. However, in a distributed redundancy data storage system including a collection of loosely coupled processing nodes that do not share the same disks, there are many more components, less shared knowledge, and many more failure states and events. Consequently, achieving atomic parity update is more difficult. Here, “distributed” means that the system is a collection of nodes, and “redundant” means that the system must employ erasure coding. In a write-through system (without fast write), if a parity update fails prior to acknowledgement of the write to the host, then the write fails and recovery is driven by the host. However, with a distributed redundancy storage system employing fast write, the host data is committed by the distributed redundancy storage system and must be reliably available at any future time. Consequently, the atomic parity update must be managed internally within the distributed redundancy storage system.