One conventional data storage system includes two storage processors and an array of disk drives. Each storage processor includes, among other things, a local write cache. The local write caches mirror each other.
During operation, the storage processors perform read and write operations on behalf of one or more external host computers (or simply external hosts). Since the contents of the local write caches are mirrored, the storage processors initially attend to write operations in a write-back manner. That is, the write policy employed by the storage processors involves acknowledging host write operations once the write data is stored in both write caches. By the time the external hosts receive such acknowledgement, the storage processors may not yet have evicted the write data from the write caches to the array of disk drives.
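As a minimal sketch of the write-back policy just described, the following Python model acknowledges a host write once both mirrored caches hold the write data, and moves the data to the array of disk drives only later. The names `MirroredWriteBack`, `host_write`, and `destage` are hypothetical, and plain dictionaries stand in for the caches and the disk array; this is an illustration under those assumptions, not an actual storage processor implementation.

```python
class MirroredWriteBack:
    """Toy model (hypothetical): two mirrored write caches in front of a disk array."""

    def __init__(self):
        self.cache_a = {}     # local write cache of the first storage processor
        self.cache_b = {}     # local write cache of the second storage processor
        self.disk_array = {}  # stands in for the array of disk drives

    def host_write(self, block, data):
        # Store the write data in both caches (mirroring) ...
        self.cache_a[block] = data
        self.cache_b[block] = data
        # ... then acknowledge, before anything reaches the disk array.
        return "ACK"

    def destage(self):
        # Later, evict the cached write data to the disk array.
        self.disk_array.update(self.cache_a)
        self.cache_a.clear()
        self.cache_b.clear()


system = MirroredWriteBack()
ack = system.host_write(7, b"payload")
on_disk_at_ack = 7 in system.disk_array          # data is only cached at this point
system.destage()
on_disk_after_destage = 7 in system.disk_array   # data has now reached the array
```

In this model the host sees the acknowledgement while the data exists only in the two caches, which is why mirroring matters for the write-back policy.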
If one of the storage processors fails during operation of the data storage system (e.g., due to a hardware failure, a software failure, or a loss of power to that storage processor), the remaining storage processor vaults the contents of its local write cache to one or more magnetic disk drives, and then disables its local write cache. The remaining storage processor then flushes the vaulted write cache contents (which are now stored on the one or more magnetic disk drives) to the array of disk drives, i.e., the remaining storage processor empties the vault by storing the vaulted write data to the array of disk drives.
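The vault-then-flush sequence above can be sketched as follows. This is a simplified model under stated assumptions: the class `SurvivingProcessor` and its methods `on_peer_failure` and `flush_vault` are hypothetical names, and dictionaries stand in for the write cache, the vault on the magnetic disk drive, and the array of disk drives.

```python
class SurvivingProcessor:
    """Toy model (hypothetical) of the remaining storage processor."""

    def __init__(self, cached_writes):
        self.write_cache = dict(cached_writes)  # local write cache contents
        self.cache_enabled = True
        self.vault = {}       # stands in for the vault on the magnetic disk drive
        self.disk_array = {}  # stands in for the array of disk drives

    def on_peer_failure(self):
        # Step 1: vault the write cache contents to the magnetic disk drive.
        self.vault = dict(self.write_cache)
        # Step 2: disable the local write cache.
        self.cache_enabled = False

    def flush_vault(self):
        # Step 3: empty the vault by storing each vaulted block to the array.
        while self.vault:
            block, data = self.vault.popitem()
            self.disk_array[block] = data


sp = SurvivingProcessor({1: b"a", 2: b"b"})
sp.on_peer_failure()
vaulted = dict(sp.vault)  # snapshot of what was vaulted
sp.flush_vault()
```

After `flush_vault` returns, the vault is empty and every previously cached block resides on the modeled disk array.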
It should be understood that the remaining storage processor is capable of performing host read and write operations both while it flushes the vaulted write cache contents and after such flushing is complete. For example, the remaining storage processor now carries out write operations in a write-through manner, in which the remaining storage processor stores new write data from an external host to the array of disk drives before acknowledging that the write operation is complete.
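To contrast the two write policies, the following sketch shows a single hypothetical `host_write` function that acknowledges from the cache when the write cache is enabled and stores to the disk array first when it is disabled. The function name, its parameters, and the acknowledgement strings are illustrative assumptions; mirroring and any ordering against a concurrent vault flush are deliberately left out.

```python
def host_write(block, data, cache_enabled, write_cache, disk_array):
    """Toy write path (hypothetical): the policy depends on whether the cache is enabled."""
    if cache_enabled:
        # Write-back: acknowledge once the data is cached (mirroring omitted here).
        write_cache[block] = data
        return "ACK (write-back)"
    # Write-through: store to the disk array first, then acknowledge.
    disk_array[block] = data
    return "ACK (write-through)"


cache, array = {}, {}
normal = host_write(1, b"x", True, cache, array)     # before the failure
degraded = host_write(2, b"y", False, cache, array)  # after the cache is disabled
```

In the degraded case the acknowledgement is returned only after the block is present in `array`, so nothing depends on the disabled cache.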
It should be further understood that the remaining storage processor vaults the contents of its write cache to the magnetic disk drive and disables its write cache so that a second failure will not result in loss of the cached write data. For example, suppose that the remaining storage processor subsequently encounters a software failure after vaulting the write cache to the magnetic disk drive. When the remaining storage processor recovers from the software failure (i.e., performs a soft reset), the remaining storage processor overwrites its local write cache. In particular, Basic Input/Output System (BIOS) firmware directs the remaining storage processor to clear and test its local write cache. Additionally, the remaining storage processor uses at least a portion of its local write cache for temporarily holding Power-On Self Test (POST) code for running a Power-On Self Test. Although the contents of the local write cache have been overwritten, no write data is lost, since the previously cached write data was immediately vaulted to the magnetic disk drive and since all subsequently received write data is processed in a write-through manner.
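The second-failure scenario above can be walked through with a small simulation. It assumes a hypothetical `Survivor` class in which the first failure has already been handled (cache contents vaulted, cache disabled); `soft_reset` models the BIOS clearing the cache and POST code occupying part of it, and `durable_data` is an illustrative helper that gathers everything held outside the cache.

```python
class Survivor:
    """Toy model (hypothetical): state of the remaining processor after the first failure."""

    def __init__(self, cached):
        self.write_cache = dict(cached)  # stale copy still sitting in cache memory
        self.vault = dict(cached)        # same data, already vaulted to magnetic disk
        self.disk_array = {}             # stands in for the array of disk drives

    def host_write(self, block, data):
        # The write cache is disabled, so new writes go through to the array.
        self.disk_array[block] = data
        return "ACK"

    def soft_reset(self):
        # BIOS clears and tests the cache; POST code then occupies part of it.
        self.write_cache = {"scratch": b"POST code"}

    def durable_data(self):
        # Everything that survives the reset: vaulted data plus written-through data.
        merged = dict(self.vault)
        merged.update(self.disk_array)
        return merged


s = Survivor({1: b"old"})
s.host_write(2, b"new")
s.soft_reset()
survived = s.durable_data()
```

Even though `write_cache` no longer holds any host data after `soft_reset`, both the previously cached block and the newly written block remain available from the vault and the disk array in this model.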