A conventional data storage cluster includes multiple data storage nodes and an array of magnetic disk drives, e.g., a redundant array of independent disks (RAID) group. Each data storage node writes host data into the array and reads host data from the array on behalf of one or more host computers. Some arrays include additional magnetic disk drives to hold error correcting checksums, e.g., for RAID 5, RAID 6 or Reed-Solomon RAID.
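As a rough illustration of the checksum idea, the following minimal sketch computes a single bytewise XOR parity block, which is the scheme commonly associated with RAID 5, and uses it to rebuild one missing block. The function names `xor_parity` and `rebuild_missing` are hypothetical and are not taken from any particular product. RAID 6 and Reed-Solomon arrangements use more elaborate codes that this sketch does not attempt to model.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the bytewise XOR of a list of equal-length byte blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors and the parity."""
    # XOR-ing the parity with every surviving block cancels them out,
    # leaving only the contents of the missing block.
    return xor_parity(list(surviving_blocks) + [parity])


if __name__ == "__main__":
    data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
    parity = xor_parity(data)
    # Pretend the drive holding data[1] has failed.
    recovered = rebuild_missing([data[0], data[2]], parity)
    print(recovered == data[1])
```

A single parity block of this kind tolerates the loss of only one block per stripe.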
Each data storage node typically includes a processor and cache memory. When a data storage node receives, from a host computer, a write command and host data to be written to the array, the processor of that node stores the host data in the array and may temporarily buffer the host data in the cache memory.
If the cluster is configured for write-back caching, the node's processor buffers the host data in the node's cache memory and acknowledges completion of the write command prior to flushing the host data from the node's cache memory to the array of magnetic disk drives. The node's processor may provide a full copy of the host data to a second data storage node so that, if the first data storage node were to fail prior to flushing the host data from its cache memory to the array of magnetic disk drives, the full copy of the host data can be retrieved from the cache memory of the second data storage node, thus preventing loss of the host data. The cluster can be further provisioned to withstand a second node failure (e.g., to prevent loss of the host data if the second data storage node were to additionally fail) by configuring the node's processor to provide another full copy of the host data to the cache memory of yet another data storage node (e.g., another full copy of the host data to a third data storage node), and so on.
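The write-back behavior described above can be sketched as a small in-memory simulation. This is only an illustrative model under stated assumptions: the class names `StorageNode` and `Cluster`, the `mirror_count` parameter, and the choice of the next nodes in ring order as mirror targets are all hypothetical, and a plain dictionary stands in for the array of magnetic disk drives.

```python
from dataclasses import dataclass, field


@dataclass
class StorageNode:
    """Hypothetical data storage node with a volatile cache memory."""
    name: str
    cache: dict = field(default_factory=dict)
    alive: bool = True


class Cluster:
    """Toy write-back cluster that mirrors full copies to peer caches."""

    def __init__(self, node_names, mirror_count=1):
        if not 0 <= mirror_count < len(node_names):
            raise ValueError("mirror_count must be smaller than the node count")
        self.nodes = [StorageNode(name) for name in node_names]
        self.mirror_count = mirror_count
        self.array = {}  # stands in for the array of magnetic disk drives

    def write(self, owner_index, block, data):
        """Buffer the data, mirror full copies, then acknowledge before flushing."""
        self.nodes[owner_index].cache[block] = data
        # Assumption: mirror targets are simply the next nodes in ring order.
        for step in range(1, self.mirror_count + 1):
            peer = self.nodes[(owner_index + step) % len(self.nodes)]
            peer.cache[block] = data
        return "ack"  # acknowledged although nothing has reached the array yet

    def fail(self, index):
        """Simulate a node failure that loses the node's cache contents."""
        self.nodes[index].alive = False
        self.nodes[index].cache.clear()

    def flush(self, block):
        """Write the block to the array from any surviving cached copy."""
        for node in self.nodes:
            if node.alive and block in node.cache:
                self.array[block] = node.cache[block]
                break
        else:
            raise RuntimeError(f"block {block} lost: no surviving cached copy")
        # Once the data is in the array, the cached copies can be released.
        for node in self.nodes:
            node.cache.pop(block, None)


if __name__ == "__main__":
    cluster = Cluster(["A", "B", "C"], mirror_count=2)
    cluster.write(0, 7, b"host data")
    cluster.fail(0)  # first node fails before flushing
    cluster.fail(1)  # second node additionally fails
    cluster.flush(7)  # the third node still holds a full copy
    print(cluster.array[7] == b"host data")
```

In this model, each additional failure to be tolerated costs one more full copy of the host data in another node's cache memory, which mirrors the trade-off described in the paragraph above.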