The present invention is directed generally to multi-controller computational systems and specifically to multi-controller computational data storage systems having multiple component ids.
In computational systems storing large amounts of data, multiple data storage devices, or arrays of such devices, are commonly employed. In such systems, a storage subsystem controller (hereinafter referred to as controller) controls the transfer of data between a computer and the storage devices so that the computer sees all of the storage devices as one device connected to the controller. The storage subsystem controller and the storage devices are typically called a storage subsystem, and the computer is called the host because it initiates the requests for data from the storage devices.
Commonly, the operation of the storage subsystem controller is defined by the Small Computer Systems Interface or SCSI protocol. A SCSI controller assigns a unique identifier or id to each device in the storage subsystem including itself. The id serves at least two purposes: it uniquely identifies each SCSI device on the SCSI bus, and it guides the arbitration process (i.e., the process by which different devices determine which device can have control of the SCSI bus when more than one device requests access at the same time). Thus, the id determines the device's priority on the SCSI bus. Id 7, as defined by the SCSI standard, has the highest priority (and is usually assigned to the controller) and id 0 the lowest priority on an 8-bit bus. On a 16-bit SCSI bus, id 15 has a priority lower than id 0, and id 8 has the lowest priority. As will be appreciated, proper operation of the SCSI protocol requires that each device on the SCSI bus have a unique SCSI id. If there are any duplicate ids on the bus, the devices with duplicate ids are unable to participate in processing SCSI commands.
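The priority ordering described above can be illustrated with a minimal sketch. The function names `priority_order` and `arbitration_winner` are hypothetical and are introduced here only for illustration; they are not part of any SCSI library.

```python
def priority_order(bus_width_bits=8):
    """Return SCSI ids ordered from highest to lowest arbitration priority.

    8-bit bus:  7, 6, ..., 0
    16-bit bus: 7, 6, ..., 0, 15, 14, ..., 8
    """
    if bus_width_bits == 8:
        return list(range(7, -1, -1))
    if bus_width_bits == 16:
        return list(range(7, -1, -1)) + list(range(15, 7, -1))
    raise ValueError("bus width must be 8 or 16 bits")


def arbitration_winner(contending_ids, bus_width_bits=8):
    """Return the contending id that would win arbitration (illustrative only)."""
    order = priority_order(bus_width_bits)
    # Lower rank means higher priority.
    rank = {scsi_id: position for position, scsi_id in enumerate(order)}
    return min(contending_ids, key=rank.__getitem__)
```

For example, on a 16-bit bus, `arbitration_winner({0, 15}, 16)` selects id 0, because id 15 ranks below id 0.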
A multi-controller data storage system 100 is depicted in FIG. 1. Host computer 102 is in communication with two controllers, namely controller A 104 and controller B 108. Controllers A and B 104, 108 are in turn in communication with a plurality of storage devices shown as disks 112a-n. Two controllers 104, 108 are used to provide redundancy and therefore increased reliability of the storage subsystem 100. If the primary controller fails, the redundant controller manages the storage and transfer of data to and from the storage subsystem.
The primary and redundant controllers 104, 108 operate in a dual controller configuration. In one dual controller configuration, both controllers 104, 108 operate in an active/passive mode in which both controllers 104, 108 are online but one controller functions as a primary controller to receive read and write requests from the host computer 102 while the other controller functions as a redundant controller (e.g., hot spare). In another configuration, both controllers 104, 108 operate in an active/active mode in which each controller is online, shares SCSI disk channels with the other controller, and functions both as a primary controller and a redundant controller. In the active/passive or active/active modes, when a primary or redundant controller fails, the new controller swapped into the dual controller configuration is known as the foreign controller, and the surviving controller as the native controller.
Problems can arise, particularly for active/active controllers, when dual controllers are configured so that they have interconnected SCSI buses. Normally, each controller is assigned a hardware id (either A or B such as by an A/B switch). The hardware id determines the SCSI ids for the corresponding controller. As long as the hardware ids are different, the SCSI ids will be different. During controller installation or replacement, however, installers often fail to switch one of the controllers to a different hardware id, particularly when the controllers are physically separate and/or at different spatial locations. If a controller is generating SCSI traffic when another controller having the same hardware id (and therefore the same SCSI id) is plugged into the shared buses and powered up, disruption of I/O processing can occur, with potentially costly and severe consequences.
These and other problems are addressed by the methodology of the present invention. Generally, one of the controllers, commonly the foreign controller, monitors the communications among the various data storage subsystem components to identify one or more of the ids in use. After one or more of the ids are identified, the monitoring controller avoids the detected ids, thereby avoiding arbitration conflicts. In this manner, hardware switches or dedicated interconnections (other than buses) between system components are unnecessary, thereby simplifying system installation, repair, or servicing. The methodology is particularly applicable to SCSI storage subsystems.
In one embodiment, a method for avoiding duplicate identifiers in an array system includes the steps of:
(a) providing first and second controllers, an array of drives and a bus subsystem interconnecting each of the first and second controllers and the array of drives;
(b) obtaining a first identifier (or id) of the first (native) controller by the second (foreign) controller using bus subsystem control signals that are transmitted between the first controller and the array of drives over the bus subsystem; and
(c) avoiding an identifier (or id) of the second controller that is the same as the first identifier.
In another embodiment, a system for avoiding duplicate identifiers in an array system includes:
(a) an array of drives for storing information;
(b) a bus subsystem;
(c) a first controller electrically connected to the array of drives using the bus subsystem, the first controller being associated with a first identifier and used in generating control signals for transmission along the bus subsystem; and
(d) a second controller electrically connected to the array of drives using the bus subsystem, the second controller, upon being booted, monitoring the control signals transmitted between the first controller and at least a first drive of the array of drives over the bus subsystem, the second controller determining a first identifier of the first controller using the control signals.
Id conflicts are avoided by one or more techniques. For example, the second controller can determine whether the first identifier is the same as the second identifier (e.g., the default id for the foreign controller). The second controller can also determine whether an identifier of another system component, such as a disk drive, is the same as the second identifier. In either case, the second controller changes the second identifier as necessary.
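The conflict check can be sketched as follows. This is only an illustration under stated assumptions: the ids already observed on the bus are assumed to be held in a bitmask in which bit n set means id n is in use, and the fallback policy (taking the highest-priority free id) is an assumption, since the text above says only that the second identifier is changed as necessary. The name `choose_controller_id` is hypothetical.

```python
def choose_controller_id(default_id, ids_in_use, bus_width_bits=16):
    """Pick an id for the booting (foreign) controller.

    ids_in_use: bitmask of ids observed on the bus (bit n set => id n in use).
    Returns default_id when it is free; otherwise the highest-priority free id
    (an assumed policy); or None when every id on the bus is taken.
    """
    if not ids_in_use & (1 << default_id):
        return default_id  # no conflict with a controller or a drive

    # SCSI priority order: 7..0, then 15..8 on a 16-bit bus.
    candidates = list(range(7, -1, -1))
    if bus_width_bits == 16:
        candidates += list(range(15, 7, -1))

    for candidate in candidates:
        if not ids_in_use & (1 << candidate):
            return candidate
    return None
```

If the native controller is observed at id 7 and drives at ids 0 and 1, a foreign controller whose default is also 7 would move to id 6 under this policy.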
The identifiers can be identified by any suitable technique. In one implementation, register support, such as a control register and a data register, is used for low level access to the bus subsystem to passively monitor control signals. In SCSI systems, the identifiers can be sampled and stored when the bus subsystem is in the selection or reselection phases. The algorithm can cycle through a predetermined number of iterations, or selection and reselection phases, to provide a high degree of reliability that all pertinent ids have been acquired. Each time through a loop, the new ids are OR'd with previously stored ids.
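The accumulation loop can be sketched as below. The sketch assumes that the control and data registers have already been read into a sequence of `(phase, data_bus)` samples, and that each data-bus value sampled during selection or reselection has one bit set per participating id. The names `accumulate_ids` and `mask_to_ids`, the phase labels, and the default of 64 iterations are hypothetical.

```python
SELECTION = "selection"
RESELECTION = "reselection"


def accumulate_ids(samples, max_phases=64):
    """OR together the id bits seen during selection/reselection phases.

    samples: iterable of (phase, data_bus) pairs, standing in for values read
             from the control and data registers (an assumption of this sketch).
    max_phases: number of selection/reselection phases to observe before
                stopping.
    """
    ids_in_use = 0
    phases_seen = 0
    for phase, data_bus in samples:
        if phase not in (SELECTION, RESELECTION):
            continue  # ids are only sampled during these two phases
        ids_in_use |= data_bus  # OR new ids with previously stored ids
        phases_seen += 1
        if phases_seen >= max_phases:
            break
    return ids_in_use


def mask_to_ids(mask):
    """Expand a bitmask into a sorted list of id numbers."""
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]
```

Because the ids are OR'd into a single mask, observing the same pair of devices repeatedly costs nothing, and each additional phase can only add ids to the set.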
To ensure that the second controller has traffic to monitor, the first controller (e.g., any active, fully booted and running controller) can generate predetermined or arbitrary commands on the bus subsystem that will cause the bus subsystem to be in the selection or reselection phase. Controllers that are already booted, such as the first controller, are either processing input/output on a channel(s) or the channel(s) are idle. If any channel is idle, the already booted first controller periodically issues an arbitrary or predetermined command to one of the ids on the bus subsystem. The booting second controller can thereby detect activity on the bus subsystem within a predetermined time interval.
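The idle-channel behavior of the already booted controller can be sketched as follows. This is a minimal illustration, not a definitive implementation: the bookkeeping of per-channel activity timestamps, the `issue_command` callback, and the name `probe_idle_channels` are all assumptions introduced for the example.

```python
def probe_idle_channels(last_activity, now, idle_interval, issue_command):
    """Issue a command on every channel that has been idle too long.

    last_activity: dict mapping channel number -> time of last bus activity.
    now:           current time, in the same units as last_activity.
    idle_interval: maximum idle time allowed before a command is issued.
    issue_command: callback taking a channel number; in a real controller it
                   would send an arbitrary or predetermined command to one of
                   the ids on that channel (hypothetical here).
    Returns the sorted list of channels that were probed.
    """
    probed = []
    for channel in sorted(last_activity):
        if now - last_activity[channel] >= idle_interval:
            issue_command(channel)         # forces a selection phase on the bus
            last_activity[channel] = now   # the probe itself counts as activity
            probed.append(channel)
    return probed
```

Calling this function at least once per `idle_interval` keeps the gap between selection phases on any channel bounded, which is what lets the booting controller expect activity within a predetermined time.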