A variety of network-attached and SAN (Storage Area Network) based storage systems exist for allowing data to be stored on an Ethernet or other IP-based network. Typically, these systems include one or more storage controllers, each of which controls and provides network-based access to a respective array of disk drives. Each storage controller typically includes a buffer or cache memory that is used to temporarily store data as it is transferred between the network and that controller's disk drives. For example, incoming data packets containing I/O (input/output) write data may be maintained in a buffer of the storage controller until the data is successfully written to the appropriate disk drives.
Some storage systems implement a storage controller failover mechanism to protect against the possible failure of a storage controller. For example, in some systems, two storage controllers may be paired for purposes of providing controller redundancy. When one of these paired storage controllers detects a failure by the other, the non-failing storage controller may take control of the failing controller's disk drives, allowing these disk drives to be accessed via the network while the failing storage controller is replaced.
To provide such redundancy, one storage controller may maintain or have access to a mirrored copy of the other storage controller's cache and configuration data. This allows the non-failing storage controller to effectively pick up the workload of the failing controller where the failing controller left off. Upon replacement of the failing controller, a synchronization or “rebind” operation may be performed between the non-failing and new storage controllers to copy over the cache and configuration data needed to bring the new storage controller online.
One significant problem with existing storage system designs is that the mechanism used to provide storage controller redundancy typically degrades or limits the performance of the storage system. For example, in some designs, the mechanism used to maintain a redundant copy of a storage controller's cache data limits the rate at which the storage controller can process network traffic and perform input/output operations. In one such design, described in U.S. Pat. No. 5,928,367, the respective memories of two separate controllers are updated synchronously (in lock step); as a result, if a write operation to one of these memories cannot immediately be performed, the corresponding write operation to the other memory generally must also be postponed.
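The lock-step limitation described above can be illustrated with a minimal sketch. It is not drawn from the referenced patent or from any particular product; the names ControllerMemory, busy, and lockstep_write are hypothetical and serve only to show how a stall in one memory postpones the write to both.

```python
from dataclasses import dataclass, field


@dataclass
class ControllerMemory:
    """Hypothetical stand-in for one controller's cache memory."""
    name: str
    busy: bool = False  # True when a write cannot be performed right now
    data: dict = field(default_factory=dict)


def lockstep_write(primary, mirror, address, value):
    """Apply a write to both memories together, or to neither.

    Returns True if the write was applied to both memories, and False if
    it had to be postponed because either memory could not accept it.
    """
    if primary.busy or mirror.busy:
        return False  # one stalled memory postpones the write on both
    primary.data[address] = value
    mirror.data[address] = value
    return True


a = ControllerMemory("A")
b = ControllerMemory("B")

first = lockstep_write(a, b, 0x10, b"block-1")   # both memories idle
b.busy = True                                    # memory B stalls
second = lockstep_write(a, b, 0x20, b"block-2")  # postponed on A as well
a_written_during_stall = 0x20 in a.data
b.busy = False                                   # memory B recovers
third = lockstep_write(a, b, 0x20, b"block-2")   # retry now succeeds
```

In this sketch, memory A is idle during the second call yet still cannot accept the write, which is the throughput penalty attributed above to synchronous mirroring.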
In addition, in many designs, some or all of the system's disk drives cannot be accessed while a rebind operation is being performed between the non-failing and new storage controllers. The present invention seeks to address these and other limitations in existing designs.