This invention relates generally to data storage systems. More particularly, the invention relates to a system, structure, and method for ensuring that a controller in a data storage system that is managed by multiple controllers has an accurate representation of the system configuration of the data storage system.
Disk drives in all computer systems are susceptible to failures caused, for example, by temperature variations, head crashes, motor failure, controller failure, and changing supply voltage conditions. Modern computer systems typically require, or at least benefit from, a fault-tolerant data storage system, for protecting data in the data storage system against any instances of disk drive failure. One approach to meeting this need is to provide a redundant array of independent disks (RAID).
RAID is a known data storage technology, operated by a disk array controller (controller), that uses several magnetic or optical disk storage devices, known as a disk array, working in tandem to increase disk capacity, improve data transfer rates, and provide higher data storage system reliability in the event of one or more disk storage device failures. However, not only is it desirable for a data storage system to function reliably when one or more disk storage device failures occur, it is also desirable for the data storage system to function reliably with any type of failed component, including a failed controller. For example, if a controller fails in a single controller system, the entire RAID becomes inoperable. Additionally, although failure of a single controller in a RAID managed by multiple independent controllers (such a RAID system is not shown) will not typically render the entire RAID system inoperable, such a failure will prevent completion of the tasks that were being performed by the failed controller, and/or those tasks scheduled to be performed by the failed controller.
To circumvent the system level reliability problem that all conventional single and multiple independent controller data storage systems exhibit, and to provide fault tolerance to a data storage system at a controller level, data storage systems managed by two controllers in dual active configuration were implemented.
Referring to FIG. 1, there is shown data storage system 124 being managed by two controllers, controllers 126 and 128, in dual active configuration, according to the state of the art. Controllers 126-128 manage the RAID, and each controller 126-128, upon detecting that the other controller 126-128 has failed, will take over the tasks that were being performed by the failed controller 126-128, and perform those tasks that were scheduled to be performed by the failed controller 126-128. In this manner, data storage system 124 provides fault tolerance at a controller level. The RAID in this example is the disk drive array in peripheral 140, which includes, for example, disk drives 134-138.
Controllers 126 and 128 are coupled across first peripheral bus 132, for example, an optical fiber, copper coax cable, or twisted pair (wire) bus, to a plurality of storage devices, for example, disk drives 134-138, in peripheral 140. Controllers 126 and 128 are also coupled across second peripheral bus 142, for example, an optical fiber, copper coax cable, or twisted pair (wire) bus, to one or more host computers, for example, host computer 144.
A first processor (not shown) in controller 126 is coupled to memory 146 that is either internal or external to controller 126. Controller 126 maintains in memory 146, a system configuration data structure 150-X and a conventional system configuration update procedure (not shown) that is executable by the first processor. Similarly, a second processor (not shown) in controller 128 is coupled to memory 148 that is either internal or external to controller 128. Controller 128 maintains in memory 148, a system configuration data structure 152-X and a conventional system configuration update procedure (not shown) that is executable by the second processor.
Each respective controller 126-128 has only one respective system configuration data structure 150-X. For example, controller 126's system configuration data structure 150-X is illustrated respectively as 150-A and 150-B, only to reflect certain content changes that occur over time in controller 126's system configuration data structure 150-X due to the operation of the conventional system configuration update procedure that is discussed in greater detail below. Similarly, controller 128's system configuration data structure 150-X is illustrated respectively as 150-C and 150-D, only to reflect certain content changes that occur over time in controller 128's system configuration data structure 150-X due to the operation of the conventional system configuration update procedure.
Each respective system configuration data structure 150-X represents aspects of the system configuration of data storage system 124 ("system 124"). Such aspects include, for example, information with respect to the status, structure and relationship of one or more respective components of system 124 with respect to other respective components of system 124.
Such structural information includes, for example, an indication of whether a particular component is a disk storage device 134-138, or a controller 126-128. Such relationship information includes, for example, an indication that a controller 126-128 can communicate with a component over a particular I/O bus, such as, for example, I/O bus 132. Such status information includes, for example, an indication of whether a disk storage device 134-138 is active, and therefore able to process I/O requests from the controller 126-128, or whether a disk storage device 134-138 has failed, and is thus unable to process I/O requests from the controller 126-128. (Such I/O requests include, for example, Small Computer System Interface (SCSI) read and write data requests, which are known in the art of computer programming).
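By way of illustration only, the structural, relationship, and status information described above can be sketched as a small record type. The class names, field names, and the describe helper below are hypothetical and are not taken from FIG. 1; the sketch merely assumes that each component is keyed by its reference numeral and that relationship information is held as a set of bus numerals.

```python
from dataclasses import dataclass, field
from enum import Enum


class ComponentKind(Enum):
    # Structural information: what kind of component this is.
    DISK_DRIVE = "disk drive"
    CONTROLLER = "controller"


class Status(Enum):
    # Status information: whether the component can service requests.
    ACTIVE = "active"
    FAILED = "failed"


@dataclass
class Component:
    ident: int                    # reference numeral, e.g. 134 (assumed key)
    kind: ComponentKind
    status: Status
    buses: set = field(default_factory=set)  # relationship info: reachable I/O buses


@dataclass
class SystemConfiguration:
    """Hypothetical stand-in for a system configuration data structure 150-X."""
    components: dict = field(default_factory=dict)

    def add(self, component):
        self.components[component.ident] = component

    def can_process_io(self, ident):
        # Only an active disk drive is able to process I/O requests.
        c = self.components[ident]
        return c.kind is ComponentKind.DISK_DRIVE and c.status is Status.ACTIVE

    def describe(self, ident):
        # Renders an entry in the "DISK DRIVE NO. (STATUS)" style.
        c = self.components[ident]
        return f"{c.kind.value.upper()} {c.ident} ({c.status.value.upper()})"


# Drive 134 active and drives 136 and 138 failed, all reachable over bus 132.
config_150_a = SystemConfiguration()
config_150_a.add(Component(134, ComponentKind.DISK_DRIVE, Status.ACTIVE, {132}))
config_150_a.add(Component(136, ComponentKind.DISK_DRIVE, Status.FAILED, {132}))
config_150_a.add(Component(138, ComponentKind.DISK_DRIVE, Status.FAILED, {132}))
```

In this sketch, a controller would consult can_process_io before issuing a read or write request to a given drive.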
Note that system configuration 150-A accurately represents the respective operational status of each of disk drives 134-138. Each disk drive 134-138 is illustrated as "DISK DRIVE NO. (STATUS)", for example, "DISK DRIVE 134 (ACTIVE)", and the like. In particular, system configuration data structure 150-A accurately represents that disk drive 134 has an active status, and accurately represents that disk drives 136-138 each have a respective failed status.
The system configuration of data storage system 124 (component content (structure), statuses and relationships) can change for any one of a number of reasons. For example, a system configuration can change as a result of: (a) the failure or malfunction of a disk drive 134-138; (b) the removal or replacement of a disk drive 134-138 in the event that the disk drive 134-138 failed, or was upgraded; and (c) the moving of a particular disk drive 134-138 to a different location in data storage system 124, such that a different I/O bus 132 is used to communicate with the particular disk drive 134-138.
Upon identifying a change in the system configuration of the data storage system 124, a particular controller 126-128 updates its respective system configuration 150-X to reflect the change. (Methods of identifying changes in the system configuration of a data storage system are known in the art of computer programming). Because it is common for a particular controller 126-128 to detect a system configuration change of data storage system 124 without another different controller 126-128 detecting the same change, the particular controller 126-128, upon detecting any such change, notifies each of the other different controllers 126-128 of the change in the system configuration. Upon receipt of such a notification, each of the receiving controllers 126-128 will update its respective system configuration data structure 150-X to reflect the change.
To accomplish such a notification, controller 126 is coupled across cable 130, for example, a fiber optic, copper coax cable, or twisted pair (wire), to controller 128. Cable 130 is used by each respective controller 126 and 128 to perform a number of tasks, including, for example: (a) upon detecting a change in the system configuration of data storage system 124, to send system configuration updates to the other controller 126-128; and (b) to determine if the other controller 126-128 has failed.
It can be appreciated that for the proper functioning of data storage system 124, it is desirable for the system configuration data structure 150-X of each controller 126-128 to accurately represent the structure, component relationships, and operational statuses of any components (system configuration) of the data storage system 124. Unfortunately, there is a significant problem with such conventional system configuration data structure 150-X update techniques, because controllers 126-128, upon performing such techniques in an unsynchronized manner, can each end up with a respective system configuration data structure 150-X that does not accurately represent the system configuration of data storage system 124.
Consider the following example, where disk drives 136 and 138 fail (or are taken offline), and disk drive 134 is active, or online. In this example, controller 126 detects that disk drive 138 has failed, not yet detecting that disk drive 136 has also failed. Controller 126, in response to detecting the failure of disk drive 138, updates its respective system configuration data structure 150-X, as illustrated in 150-B, to reflect the failure of disk drive 138. At approximately the same time, controller 128 detects that disk drive 136 has failed, not yet detecting that disk drive 138 has also failed. In response to detecting the failure of disk drive 136, controller 128 updates its respective system configuration data structure 150-X, as illustrated in 150-C, to reflect the failure of disk drive 136.
In this example, according to the state of the art, controller 126 sends to controller 128 a system configuration update notification (not shown) that includes an indication that disk drive 138 has failed (true), that disk drive 134 is active (true), and that disk drive 136 is active (false). Upon receipt, by controller 128, of the system configuration update notification, controller 128 modifies system configuration 150-X, as illustrated in 150-D, to reflect that disk drive 138 has failed (true), that disk drive 134 is active (true), and that disk drive 136 is active (false).
Such a result occurs because controller 128 erroneously assumes, according to the state of the art, that the system configuration update notification received from controller 126, received after controller 128 has updated system configuration 150-X, supersedes its old information (illustrated in 150-C). Thus, system configuration 150-X will lose the failed status for drive 136, illustrated in 150-C, and will instead reflect an erroneous status of the system configuration of data storage system 124, illustrated in 150-D. Additionally, controller 126 never learns that disk drive 136 has failed, as illustrated in system configuration 150-B. Therefore, neither controller's system configuration data structure 150-X accurately reflects the system configuration of data storage system 124.
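The sequence of events in the foregoing example can be reproduced in a short simulation. The sketch below is illustrative only: the Controller class and its method names are hypothetical, and it assumes that a conventional update notification carries the sender's status for every drive and that the receiver overwrites its own data structure with that notification.

```python
ACTIVE, FAILED = "ACTIVE", "FAILED"


class Controller:
    """Hypothetical model of a controller holding one configuration structure."""

    def __init__(self, name, drives):
        self.name = name
        # Every drive is initially believed to be active.
        self.config = {drive: ACTIVE for drive in drives}

    def detect_failure(self, drive):
        # Local update upon detecting a failed drive.
        self.config[drive] = FAILED

    def make_notification(self):
        # Assumption: the notification is a full snapshot of the sender's view.
        return dict(self.config)

    def receive_notification(self, snapshot):
        # Conventional behavior as described: the received notification is
        # treated as superseding whatever the receiver already knew.
        self.config = dict(snapshot)


# Actual state of system 124 in the example.
actual = {134: ACTIVE, 136: FAILED, 138: FAILED}

controller_126 = Controller("126", actual)
controller_128 = Controller("128", actual)

controller_126.detect_failure(138)   # state illustrated in 150-B
controller_128.detect_failure(136)   # state illustrated in 150-C

# Controller 126 notifies controller 128, which overwrites its own view.
controller_128.receive_notification(controller_126.make_notification())  # 150-D

lost_on_128 = controller_128.config[136] == ACTIVE   # failure of 136 was lost
never_on_126 = controller_126.config[136] == ACTIVE  # 126 never learned of it
```

After the exchange, both modeled controllers record drive 136 as active, so neither view matches the actual state held in the variable named actual.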
Therefore, what is needed is a system, structure and method for ensuring system configuration data structure 150-X coherency across multiple disk array controllers 126-128 in a data storage system 124, such that any modifications to such system configuration data structures do not lead to inaccurate representations of the system configuration of the data storage system 124.