Conventional systems address potential double fault conditions in a variety of ways. A conventional controller enters an NVSRAM interrupted write mode condition, and the owning controller is rebooted by the test. As a result of the controller reboot, (i) the controller firmware regenerates parity for the data stripe involved in the write, and (ii) a forced transfer to the surviving controller takes place.
The two conditions above cause the next N writes to be implemented using the old and new data to generate new parity. While these tasks are being performed for the next N write cycles, a data drive in the volume group can fail unexpectedly before the host retries the write. In such a condition, the controller does not know whether the write completed to the data drive, to the parity drive, or to both. If the write completes to the data drive but not to the parity drive, or vice versa, potential data corruption will be detected because the data and the parity are inconsistent.
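The parity update from old and new data, and the failure window described above, can be illustrated with a minimal sketch. It assumes byte-wise XOR parity across a stripe of three data drives (a RAID 5 style layout, which the text does not specify); the function names `xor_blocks`, `rmw_parity`, and `reconstruct` and all block values are hypothetical and chosen only for illustration.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks (assumed parity function)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rmw_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """New parity from old parity, old data, and new data (read-modify-write)."""
    return xor_blocks(old_parity, old_data, new_data)


def reconstruct(parity: bytes, surviving: list) -> bytes:
    """Rebuild the block of a failed drive from parity and the surviving blocks."""
    return xor_blocks(parity, *surviving)


# Hypothetical stripe: three data blocks and their parity.
d0, d1, d2 = b"\x11\x11", b"\x22\x22", b"\x44\x44"
parity = xor_blocks(d0, d1, d2)

# A write replaces d0; parity is derived from the old and new data.
new_d0 = b"\x99\x99"
new_parity = rmw_parity(parity, d0, new_d0)

# Interrupted write: the new data reached its drive, the new parity did not.
stale_parity = parity

# The drive holding d1 then fails and is rebuilt from what is on disk.
rebuilt_ok = reconstruct(new_parity, [new_d0, d2])     # consistent stripe
rebuilt_bad = reconstruct(stale_parity, [new_d0, d2])  # inconsistent stripe

print("consistent rebuild matches d1:  ", rebuilt_ok == d1)
print("inconsistent rebuild matches d1:", rebuilt_bad == d1)
```

In this sketch the rebuild from the stale parity returns a block that differs from the original `d1`, which is the data/parity inconsistency the paragraph describes: a drive that was not part of the interrupted write is the one whose contents are lost.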
It would be desirable to implement a method and/or apparatus for handling interrupted writes using multiple cores that avoids data corruption.