While redundant array of independent disks (RAID) systems provide data protection against disk failure, direct attached storage (DAS) RAID controllers are vulnerable to server failure. Since a DAS RAID controller is typically embedded inside a respective server, the controller inevitably fails or is disabled when the server fails. Multiple-node or multiple-server high availability (HA) DAS RAID configurations can be used to provide additional protection against server failure.
In multiple-node data storage systems, when one node or server fails, another server takes over the virtual volume that was being served by the failed server. However, the new server typically lacks information about whether the failed server successfully completed its last write operation. If that write operation was not completed, an inconsistency (sometimes referred to as a “write hole”) occurs because the data and parity for the operation are only partially updated (e.g., new data with old parity). Data corruption can result if the new server starts processing new input/output (IO) requests while the array is in this inconsistent state.
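The write hole can be illustrated with a minimal sketch, assuming a single-parity XOR stripe (RAID 5 style) with two data blocks. The RAID level, block contents, and all names below are hypothetical choices made for illustration and do not come from the description above.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Return the bytewise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# A consistent stripe: two data blocks and their XOR parity (assumed layout).
d0 = b"\x0f\x0f"
d1 = b"\xf0\x01"
parity = xor_blocks(d0, d1)

# The server fails mid-write: the new d0 reaches disk, the parity update does not.
new_d0 = b"\xaa\x55"

# The stripe now holds new data with old parity, so it is inconsistent.
stripe_consistent = xor_blocks(new_d0, d1) == parity

# If the disk holding d1 is later lost, rebuilding it from the stale parity
# yields a block that differs from the d1 that was actually stored.
rebuilt_d1 = xor_blocks(new_d0, parity)

print("stripe consistent:", stripe_consistent)
print("rebuilt d1 matches original:", rebuilt_d1 == d1)
```

In this sketch the block that was never written (d1) is the one that ends up corrupted on rebuild, which is why a takeover server that cannot tell whether the last write completed cannot safely trust the parity.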