1. Field of the Invention
The present invention relates to the field of synchronizing operating code and, in particular, to methods and apparatus for synchronizing operating code in redundant disk storage subsystem controllers in a disk storage subsystem.
2. Discussion of Related Art
Typically, a computer stores data on devices such as hard disk drives, floppy drives, tape drives, and compact disks. These devices are known as storage devices. If a large amount of data requires storage, multiple storage devices are connected to the computer and used to store the data. A computer typically does not need to know how many storage devices are being used to store the data because another device, the storage subsystem controller, controls the transfer of data between the computer and the storage devices. The storage subsystem controller and the storage devices together are typically called a storage subsystem, and the computer is usually called the host because it initiates the requests for data from the storage devices.
The operating code in a storage subsystem controller (hereinafter referred to as the controller) is the software that manages the transfer of data to and from disk drives in a disk storage subsystem. A disk storage subsystem often has a redundant controller to increase the reliability of the storage subsystem, such that if the primary controller fails, the redundant controller manages the storage and transfer of data to and from the storage subsystem. The primary controller and redundant controller can operate in a dual controller configuration.
The dual controller configuration can operate in active/passive mode, in which both controllers are online but one controller functions as the primary controller, receiving read and write requests from the host computer, while the other controller functions as a redundant controller (e.g., a hot spare). Alternatively, the dual controller configuration can operate in active/active mode, in which both controllers are online and each controller functions as both a primary controller and a redundant controller. In either mode, when a primary or redundant controller fails, the new controller swapped into the dual controller configuration is known as the foreign controller, and the surviving controller is known as the native controller.
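The two operating modes and the native/foreign terminology can be illustrated with a minimal sketch. All names here (Mode, controllers_serving_host_io, label_after_swap) are hypothetical and introduced only for illustration. The sketch assumes that in active/active mode both controllers receive host requests, which is one reading of the description above.

```python
from enum import Enum


class Mode(Enum):
    ACTIVE_PASSIVE = "active/passive"
    ACTIVE_ACTIVE = "active/active"


def controllers_serving_host_io(mode, primary, redundant):
    """Return the controllers that receive host read and write requests."""
    if mode is Mode.ACTIVE_PASSIVE:
        # Only the primary receives host requests; the redundant one stands by.
        return [primary]
    # Assumption: in active/active mode both controllers serve host requests.
    return [primary, redundant]


def label_after_swap(controllers, failed, replacement):
    """Label the surviving controller 'native' and the swapped-in one 'foreign'."""
    labels = {c: "native" for c in controllers if c != failed}
    labels[replacement] = "foreign"
    return labels


print(controllers_serving_host_io(Mode.ACTIVE_PASSIVE, "A", "B"))
print(label_after_swap(["A", "B"], failed="A", replacement="C"))
```

In this sketch, controller B survives the failure of A and becomes the native controller, while the replacement C is the foreign controller regardless of the operating mode.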
In a dual controller configuration, a difficulty arises when one controller fails. The failed controller may be removed and replaced with a foreign controller (e.g., a spare controller) whose firmware contains an incompatible software revision and incompatible configuration parameters. The software revision and configuration parameters of the foreign (spare) controller may be older or newer than those of the native (surviving) controller currently operating in the dual controller configuration. Dual controllers (e.g., primary and redundant controllers) operating in a dual controller configuration must be compatible with each other to ensure that the foreign controller does not destroy subsystem configurations that are being used by the native controller. Incompatibility results when, because of software and configuration parameter updates, the foreign controller runs a version of software and configuration parameters that is incompatible with the version on the native controller.
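The comparison implied by this paragraph can be sketched as follows. The FirmwareInfo record, its fields, and the revision format are hypothetical. The rule that only an exact match counts as compatible is an assumption made for the sketch, since the text does not define a compatibility rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FirmwareInfo:
    """Hypothetical summary of what a controller's firmware holds."""
    software_revision: tuple  # e.g. (3, 1, 0); this format is an assumption
    config_version: int       # hypothetical version of the configuration parameters


def relative_age(foreign, native):
    """Report whether the foreign controller's software is older, newer, or the same."""
    if foreign.software_revision < native.software_revision:
        return "older"
    if foreign.software_revision > native.software_revision:
        return "newer"
    return "same"


def is_compatible(foreign, native):
    """Assumption: only an exact match of revision and parameters is compatible."""
    return foreign == native


native = FirmwareInfo(software_revision=(3, 1, 0), config_version=7)
spare = FirmwareInfo(software_revision=(2, 9, 4), config_version=5)
print(relative_age(spare, native), is_compatible(spare, native))  # older False
```

A real controller might tolerate some revision differences. The strict equality test is used here only because it is the simplest rule consistent with the description.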
It may not be possible to determine the software revision level within the firmware of the spare controller before placing the spare controller in the disk array storage subsystem. Furthermore, it may not be possible to download software code and configuration parameters that are compatible with the native controller's software revision to the spare controller's firmware before the spare controller is swapped into the disk array storage subsystem.
It is known in the art for a user to manually modify the spare controller's operating code after the controller has been initialized and brought online. If the spare controller is running incompatible code, it is known for the user to take the controller offline and manually load its program memory with operating software and configuration parameters compatible with the surviving controller. However, this interrupts host I/O requests because the spare controller is taken offline, and host I/O requests may be lost if they are not rerouted through the redundant disk array configuration software.
The user runs application software that loads operating code matching that of the native controller into the program memory of the spare controller. Once the program memory modification is complete, but before the spare controller is online, a hardwired reset line between the dual controllers provides an interrupt signal to the spare controller. The spare controller then begins to execute the newly downloaded software. Thus, the reset signal provides a means to resynchronize the configuration information between the spare and surviving controllers. A problem arises with this solution because many newer controllers do not have the hardware to accept interrupts from the hardwired reset line.
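The known manual procedure described above can be modeled with a short sketch. The SpareController class, its attributes, and the manual_update function are hypothetical names. The sketch only mirrors the order of steps given in the text and is not an actual controller interface.

```python
class SpareController:
    """Hypothetical model of a spare controller during a manual code update."""

    def __init__(self, operating_code, has_reset_line_hardware=True):
        self.program_memory = operating_code
        self.running_code = operating_code
        self.online = True
        self.has_reset_line_hardware = has_reset_line_hardware

    def reset_interrupt(self):
        """Model the interrupt delivered over the hardwired reset line."""
        if not self.has_reset_line_hardware:
            raise RuntimeError("controller cannot accept the reset-line interrupt")
        # After the interrupt, the controller executes whatever was downloaded.
        self.running_code = self.program_memory


def manual_update(spare, native_code):
    """Follow the manual steps in order: offline, load, reset, online."""
    spare.online = False                # host I/O to the spare is interrupted here
    spare.program_memory = native_code  # application software loads matching code
    spare.reset_interrupt()             # spare begins executing the new code
    spare.online = True
    return spare


spare = manual_update(SpareController("rev-2"), native_code="rev-3")
print(spare.running_code, spare.online)  # rev-3 True
```

Constructing the spare with has_reset_line_hardware=False makes manual_update raise an error and leaves the spare offline, which mirrors the limitation noted for newer controllers.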
A need exists to automatically synchronize the operating code between a native controller and a foreign controller when a spare controller is initially swapped into the dual controller configuration, thereby eliminating the need to manually modify the operating code. A need also exists to reduce the interruption and loss of host I/O requests. Similarly, a need exists to eliminate the reliance on a hardwired reset line to force an interrupt on the spare controller's processor.