FIG. 1 shows a block diagram of a typical multiple-controller RAID system 100 (RAID is an acronym for "Redundant Array of Independent Disks"). Each host computer 102 is connected to a respective RAID controller 104 through either a Fibre Channel or SCSI bus 106 via a host bus adapter (HBA). Each RAID controller 104 coordinates read and write requests from a respective host 102 directed to a shared set of storage devices 108, to which the RAID controllers 104 are connected via a backend Fibre Channel or SCSI disk bus 110. The controllers 104 use the same storage devices 108 so that each host computer 102 can access the same data. FIG. 1 shows only two controllers; however, the illustrated architecture is extendable to systems of N controllers (where N is an integer greater than 2). The controllers 104 have cache memories 112 in which they temporarily store the data most recently read and written by the respective hosts 102. The operation of these cache memories 112 is now described with reference to FIG. 2.
FIG. 2 shows a block diagram of the caches 112, which include a read cache 114, a write cache 116 and a write mirror cache 118. A controller 104i (where "i" represents any integer) places write data 103 (FIG. 1) from the host 102 into its write cache 116i and data 105 (FIG. 1) read from the controller 104 by the host 102 into its read cache 114i. Each write mirror cache 118i duplicates the contents 107j of another controller's write cache 116j. The controller 104j writes to the write mirror cache 118i around the time it initiates a write operation. The write mirror caches 118 allow a duplicate copy of the write data 107 to be stored in a second controller so that a failure of either controller 104 will not result in the loss of data.
Data 107 for the write mirror caches 118 is transferred between the controllers through the backend SCSI or Fibre Channel disk busses 110. The data in a mirrored cache 118 is used only if a controller 104 involved in a write fails, in which case the mirrored data is transferred to the disks 108 for storage.
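The mirroring arrangement described above can be illustrated with a minimal sketch. The class and method names (Controller, host_write, flush, recover_partner_writes, make_pair) are hypothetical and do not appear in the figures; a dictionary stands in for the storage devices 108, and a direct method call stands in for the transfer over the disk bus 110. Discarding a mirror entry once the data reaches the disks is likewise an assumption made for the illustration.

```python
class Controller:
    """Hypothetical model of one RAID controller 104 and its caches 112."""

    def __init__(self, disks):
        self.disks = disks             # shared storage devices 108 (block -> data)
        self.read_cache = {}           # read cache 114
        self.write_cache = {}          # write cache 116
        self.write_mirror_cache = {}   # write mirror cache 118 (partner's writes)
        self.partner = None            # the other controller of the pair

    def host_write(self, block, data):
        # Hold the write locally and duplicate it in the partner's mirror cache.
        self.write_cache[block] = data
        self.partner.write_mirror_cache[block] = data

    def flush(self):
        # Commit cached writes to the disks. Assumption: the partner's
        # mirror copy is discarded once the data is safely on disk.
        for block, data in self.write_cache.items():
            self.disks[block] = data
            self.partner.write_mirror_cache.pop(block, None)
        self.write_cache.clear()

    def recover_partner_writes(self):
        # Used only when the partner has failed: transfer the mirrored
        # data to the disks for storage.
        self.disks.update(self.write_mirror_cache)
        self.write_mirror_cache.clear()


def make_pair(disks):
    """Build two controllers that share the same disks and mirror each other."""
    a, b = Controller(disks), Controller(disks)
    a.partner, b.partner = b, a
    return a, b


if __name__ == "__main__":
    disks = {}
    c1, c2 = make_pair(disks)
    c1.host_write(7, "new data")   # mirrored into c2 before reaching the disks
    # Suppose c1 fails here, before it can flush its write cache.
    c2.recover_partner_writes()
    print(disks)
```

In this sketch the write survives the failure of the first controller because the second controller holds a duplicate copy; the mirror cache plays no role during normal reads.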
The problem with this method is that the caches may not be synchronized, which can cause the hosts to receive inconsistent data following read operations. For example, if the first controller 104-1 performs a write to a disk device 108 and the second host system 102-2 then attempts to read the same data, a copy of which is already in the read cache 114-2 of the second controller 104-2, the second host would receive stale data, because the read caches are not updated across controllers. Further, copying all read data across the controllers would severely compromise performance. This problem will become increasingly important as clustering environments increase in popularity.
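The inconsistency can be reproduced with a minimal sketch. The names used (Controller, host_read, host_write) are hypothetical, a dictionary stands in for the storage devices 108, and writes are shown going straight to the disks for brevity, which simplifies the write caching described above.

```python
class Controller:
    """Hypothetical controller 104 with a private, unsynchronized read cache 114."""

    def __init__(self, disks):
        self.disks = disks       # shared storage devices 108 (block -> data)
        self.read_cache = {}     # read cache 114, local to this controller

    def host_read(self, block):
        # Serve from the local read cache when possible; otherwise go to disk.
        if block not in self.read_cache:
            self.read_cache[block] = self.disks[block]
        return self.read_cache[block]

    def host_write(self, block, data):
        # Simplification: write directly to the shared disks. Only this
        # controller's own read cache is invalidated; nothing is sent to
        # the other controller.
        self.disks[block] = data
        self.read_cache.pop(block, None)


if __name__ == "__main__":
    disks = {7: "old"}
    c1, c2 = Controller(disks), Controller(disks)

    c2.host_read(7)               # second controller caches block 7
    c1.host_write(7, "new")       # first controller updates the disk
    print(disks[7])               # contents now on the disks
    print(c2.host_read(7))        # second host still sees its cached copy
```

The second read returns the value cached before the write, even though the disks already hold the updated data, which is the stale-read condition described above.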