A storage area network (“SAN”) environment often includes numerous storage devices that are operated by using a dual controller model. In many cases such storage devices can include at least one array of disks, which can be classified as a redundant array of independent disks (“RAID”). Under such dual controller model arrangements, where the controllers are often referred to as High Availability (“HA”) pairs, individual controllers can be assigned to operate as primary controllers or “owners” of various volumes or arrays of storage devices. These controllers can also take over volumes of storage devices from their alternate or paired controllers in the case of failures or other reasons for replacing controllers.
The replacement or swapping out of system controllers in HA pairs is generally well known, and typically involves the replacement of controller heads, NVRAM cards, and/or the entire controller in some instances. Such procedures are sometimes referred to as “headswap,” and often result in significant disruption to the overall operation of at least the HA pair and the RAIDs assigned thereto, if not the larger SAN to which the HA pair and RAIDs may belong. For example, a common approach to headswap involves booting the controller affected by the swap into a maintenance mode and running a disk reassign operation. While effective, this approach is disruptive in that the storage owned by the affected controller is generally unavailable during the process.
Other approaches to headswap can result in less disruption. For example, a headswap on a controller of an HA pair can involve a takeover of the replaced controller's storage devices by the system controller that is not being replaced. In this manner, the storage volumes and devices owned by the affected controller are taken over by the remaining controller and remain available during the headswap process. A number of steps are then performed manually to swap out the outgoing controller for a newly installed controller. After the old controller has been swapped out, a manual disk reassign operation is performed, and the storage devices are given back from the remaining system controller to the newly installed controller.
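The takeover, swap, reassign and giveback sequence described above can be illustrated with a minimal sketch. This is a toy model only: the `Controller` class, the function names, and the representation of disks as strings are all hypothetical and are not drawn from any actual controller software.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    """Hypothetical model of one controller in an HA pair."""
    system_id: str
    owned_disks: set = field(default_factory=set)


def takeover(partner: Controller, outgoing: Controller) -> set:
    """The remaining controller takes over the outgoing controller's disks."""
    taken = set(outgoing.owned_disks)
    outgoing.owned_disks.clear()
    partner.owned_disks |= taken
    return taken


def giveback(partner: Controller, incoming: Controller, disks: set) -> None:
    """The remaining controller returns the taken-over disks to the new controller."""
    partner.owned_disks -= disks
    incoming.owned_disks |= disks


def headswap_with_takeover(partner: Controller, outgoing: Controller, new_system_id: str):
    # Step 1: takeover keeps the storage available during the swap.
    taken = takeover(partner, outgoing)
    # Step 2: the outgoing controller is physically replaced (a manual step,
    # modeled here simply by constructing a new object).
    incoming = Controller(system_id=new_system_id)
    # Step 3: manual disk reassign, modeled as recording the new owner
    # identifier for each affected disk.
    reassigned = {disk: incoming.system_id for disk in taken}
    # Step 4: giveback to the newly installed controller.
    giveback(partner, incoming, taken)
    return incoming, reassigned
```

In this sketch the new system identifier is passed in by the caller, which mirrors the manual nature of the procedure: nothing in the model checks that the identifier is correct.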
Unfortunately, several problems can arise from such a non-disruptive but largely manual process. For example, headswap detection by an HA paired system often depends upon a controller detecting a discrepancy between the ownership of an aggregate of storage devices (e.g., an array of storage devices) and the ownership of the individual devices in the aggregate. Where no such discrepancy between aggregate and individual device ownership exists despite the occurrence of a headswap, the headswap may go undetected, which can make headswap detection unreliable in some cases. In addition, a manual headswap procedure may rely upon accurate entry of the new controller system identifier by the user. If any error occurs in this manual system identifier entry process, then the headswap fails and the replacement controller may need to be entirely rebooted. Further, problems can arise when a giveback from a controller and storage operations are performed at the same time. Moreover, multidisk panic can occur when disks are reassigned while they are live, because a controller may attempt to recover from unknown or inconsistent states that arise during live reassignment.
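The detection weakness described above can be made concrete with a short sketch. It assumes, purely for illustration, that ownership is recorded as a system identifier at both the aggregate level and the individual device level; the function name `headswap_detected` and its parameters are hypothetical.

```python
def headswap_detected(aggregate_owner_id: str, disk_owner_ids: list) -> bool:
    """Report a headswap only if some disk's recorded owner differs
    from the owner recorded for the aggregate as a whole."""
    return any(owner != aggregate_owner_id for owner in disk_owner_ids)


# A discrepancy exists, so the headswap is detected.
mixed = headswap_detected("old-id", ["old-id", "new-id"])

# The controller was replaced, but both levels of ownership records still
# agree with each other, so this check alone reports nothing.
consistent = headswap_detected("old-id", ["old-id", "old-id"])
```

Under these assumptions `mixed` is `True` and `consistent` is `False`, which illustrates why detection that relies solely on such a discrepancy can miss a headswap.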
Although many network storage systems, devices and methods for headswap have generally worked well in the past, there is always a desire for improvement. In particular, what is desired are network storage systems and methods that are able to provide headswap procedures for system controllers in an automated, non-disruptive and reliable manner that overcomes the foregoing problems.