Today's computer networks include vast amounts of storage, require high data throughput, and demand high data availability. Many networks support hundreds or even thousands of connected users. Many networks store extremely valuable data, such as bank account information, personal medical information, databases whose unavailability equates to huge sums of lost revenue due to inability to sell a product or provide a service, and scientific data gathered over long periods of time and at great expense.
A typical computer network includes one or more computers connected to one or more storage devices, such as disk drives or tape drives, by one or more storage controllers. One technique for providing higher data availability in computer networks is to include redundant components in the network. Providing redundant components means providing two or more instances of a component such that if one instance fails, one of the other redundant instances continues to perform the function of the failed component. In many cases, the failed component can be quickly replaced to restore the system to its original data availability level. For example, some network storage controllers include redundant hot-pluggable field replaceable units (FRUs), commonly referred to as blades. If one of the blades fails, it may be replaced with a good blade while the system is still running to restore the storage controller to its original data availability level.
Another technique employed in storage controllers is modularity. A modular storage controller comprises multiple modules or FRUs such that one or more of the modules may be replaced without replacing all the modules of the storage controller. An advantage of modularity may be increased performance in a cost effective manner. For example, the RIO RAID controller sold by Dot Hill Systems Corp. of Carlsbad, Calif., formerly Chaparral Network Storage, Inc., is a redundant modular storage controller.
FIG. 1 illustrates a RIO RAID controller 100 in a common configuration. The RIO controller 100 includes a backplane 108 including four local buses 112, denoted 112A, 112B, 112C, and 112D. In one version of the product, the local buses 112 are PCI-X buses. The RIO RAID controller 100 also includes four modules, or blades, which are hot-pluggable into the backplane 108: two Data Manager (DM) blades 114, denoted DM-A 114A and DM-B 114B, and two Data Gate (DG) blades 116, denoted DG-A 116A and DG-B 116B. Each of the blades 114 and 116 is a field-replaceable unit (FRU). Each DG blade 116 includes two I/O controllers 126, denoted 126A and 126B. Each I/O controller 126 includes two I/O ports 128, such as FibreChannel (FC) ports, for connecting to host computers and disk drives. Each of the four I/O controllers 126 also has a local bus interface for interfacing with a different one of the local buses 112. In one version of the RIO RAID controller 100, the I/O controllers 126 are JNIC-1560 Milano dual channel FibreChannel to PCI-X controllers. Each DM blade 114 includes a CPU 124, a memory 122, and a memory controller/bridge circuit 118 for interfacing the CPU 124 and memory 122 with two of the local buses 112. In the RIO RAID controller 100 of FIG. 1, DM-A 114A is connected to local buses 112A and 112B, and DM-B 114B is connected to local buses 112C and 112D. I/O controller 126A of DG-A 116A is connected to local bus 112A, I/O controller 126B of DG-A 116A is connected to local bus 112C, I/O controller 126A of DG-B 116B is connected to local bus 112B, and I/O controller 126B of DG-B 116B is connected to local bus 112D.
The I/O controllers 126 function as target devices of the CPUs 124. In particular, the I/O controllers 126A of DG-A 116A and DG-B 116B are controlled by DM-A 114A, and the I/O controllers 126B of DG-A 116A and DG-B 116B are controlled by DM-B 114B. Each of the I/O controllers 126 generates an interrupt request (IRQ) 134 that is routed through the backplane 108 to its respective controlling CPU 124. The I/O controllers 126 receive I/O requests from the host computers on their respective I/O ports 128 and in response generate an interrupt request 134 to notify the CPU 124 of the I/O request. Additionally, each of the I/O controllers 126 may generate an interrupt request 134 to notify its respective CPU 124 that it has received a packet of data from a disk drive or transmitted a packet of data to a disk drive or host computer. The memory 122 caches data from the disk drives for more efficient provision to the host computers. The CPU 124 performs RAID functions, such as performing logical block translation, striping, mirroring, controlling parity generation, processing I/O requests, data caching, buffer management, and the like.
An advantage of a modular approach such as that of the RIO RAID controller 100 is that it provides an architecture for cost-effective upgrades to the storage controller. For example, in some versions of the RIO RAID controller products, the customer may incrementally add or delete DG blades 116 from the configuration based on connectivity and data availability requirements, such as the number of host computers and disk drives to be connected. Additionally, the architecture potentially gives the customer the ability to migrate in technology. For example, a subsequent DG blade could be provided that uses an interface technology other than FibreChannel, such as SCSI, Infiniband, SATA, iSCSI, etc. Advantageously, the DM blades 114 would not have to be replaced (although a firmware upgrade of the DM blades 114 might be required) in order to enjoy the benefit of the migration in I/O interface technology. Also, the architecture facilitates higher density in 1 U high 19″ rack-mountable enclosures.
FIG. 2 illustrates a scenario in which DM-A 114A has failed. DM-B 114B detects that DM-A 114A has failed via loss of a heartbeat signal 134A from DM-A 114A. When DM-B 114B detects that DM-A 114A has failed, DM-B 114B performs an active-active failover operation to take over processing I/O requests from the host computers previously serviced by DM-A 114A. This is possible for two reasons. First, in a typical configuration DM-B 114B is able to communicate with all of the disk drives (including the disk drives comprising the logical units, or disk arrays, previously controlled by now-failed DM-A 114A). Second, in a typical configuration the host computers are capable of issuing requests to the RIO RAID controller 100 via an alternate path, namely through one of the I/O ports 128 connected to surviving DM-B 114B, as discussed below.
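The failure detection described above can be illustrated with a minimal sketch. The text states only that failure is detected via loss of the heartbeat signal; the timeout-based rule, the class name `HeartbeatMonitor`, and the parameter `timeout_s` are assumptions introduced here for illustration and are not taken from the RIO RAID controller.

```python
class HeartbeatMonitor:
    """Hypothetical monitor a surviving DM blade might run for its peer.

    Assumption: the peer is declared failed when no heartbeat has been
    observed for longer than timeout_s seconds.
    """

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_seen = None  # time of the most recent heartbeat, if any

    def beat(self, now):
        """Record that a heartbeat from the peer was observed at time `now`."""
        self.last_seen = now

    def peer_failed(self, now):
        """Return True if the heartbeat has been lost for longer than the timeout."""
        if self.last_seen is None:
            # No heartbeat seen yet; this sketch does not declare a failure.
            return False
        return (now - self.last_seen) > self.timeout_s


# Usage: DM-B records heartbeats from DM-A and polls for loss.
monitor = HeartbeatMonitor(timeout_s=1.0)
monitor.beat(now=0.0)
if monitor.peer_failed(now=2.5):
    print("peer heartbeat lost; begin failover")
```

Times are passed in explicitly so that the rule can be exercised without a real clock; an actual controller would more likely react to a hardware signal than poll in software.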
Unfortunately, as may be observed from FIG. 2, the I/O ports 128 previously owned by failed DM-A 114A, namely the I/O ports 128 of the I/O controllers 126A of each of DG-A 116A and DG-B 116B, are now inaccessible by DM-B 114B since DM-B 114B has no local bus 112 path to the I/O controllers 126A. Consequently, the I/O ports 128 of the I/O controllers 126A not connected to the surviving DM-B 114B are unused, and are referred to as “orphaned” I/O ports.
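The orphaning can be checked mechanically from the bus connections given for FIG. 1. The following sketch models only that wiring; the dictionary layout and function names are hypothetical and chosen for illustration, while the bus assignments and the two-ports-per-controller figure come from the description above.

```python
# Local buses 112 reachable from each DM blade 114 (per FIG. 1).
DM_BUSES = {
    "DM-A": {"112A", "112B"},
    "DM-B": {"112C", "112D"},
}

# Local bus to which each I/O controller 126 of each DG blade 116 connects.
IO_CONTROLLER_BUS = {
    ("DG-A", "126A"): "112A",
    ("DG-A", "126B"): "112C",
    ("DG-B", "126A"): "112B",
    ("DG-B", "126B"): "112D",
}

PORTS_PER_CONTROLLER = 2  # each I/O controller 126 includes two I/O ports 128


def reachable_controllers(dm):
    """I/O controllers that share a local bus with the given DM blade."""
    buses = DM_BUSES[dm]
    return sorted(c for c, bus in IO_CONTROLLER_BUS.items() if bus in buses)


def orphaned_controllers(failed_dm):
    """I/O controllers that no surviving DM blade can reach."""
    survivors = [dm for dm in DM_BUSES if dm != failed_dm]
    still_reachable = {c for dm in survivors for c in reachable_controllers(dm)}
    return sorted(c for c in IO_CONTROLLER_BUS if c not in still_reachable)


def orphaned_port_count(failed_dm):
    """Number of I/O ports 128 left unused after the given DM blade fails."""
    return PORTS_PER_CONTROLLER * len(orphaned_controllers(failed_dm))


print(orphaned_controllers("DM-A"))
print(orphaned_port_count("DM-A"))
```

With DM-A 114A failed, the sketch reports the I/O controllers 126A of DG-A 116A and DG-B 116B as orphaned, which is four of the eight I/O ports 128 in this configuration.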
Orphaned I/O ports have several disadvantages. In a typical configuration, prior to the failure, DM-A 114A is responsible for servicing I/O requests from some of the host computers to transfer data with some of the disk drives, and DM-B 114B is responsible for servicing I/O requests from the rest of the host computers to transfer data with the rest of the disk drives. In the worst case scenario, the host computers and/or disk drives previously serviced by DM-A 114A are not also connected to the non-orphaned I/O ports 128 (I/O ports 128 of the I/O controllers 126B connected to DM-B 114B), or the host computers previously serviced by DM-A 114A are not configured to use multi-pathing (discussed below), resulting in a loss of data availability.
In the best case scenario, the host computers and disk drives previously serviced by DM-A 114A are connected to the non-orphaned I/O ports 128, thereby enabling DM-B 114B to function in a redundant manner with DM-A 114A to tolerate the failure of DM-A 114A. In this scenario, in response to detecting the failure of DM-A 114A, DM-B 114B resets DM-A 114A via a reset line 132B, and services I/O requests from the host computers previously serviced by DM-A 114A via the non-orphaned I/O ports 128. DM-B 114B may service I/O requests from the host computers previously serviced by DM-A 114A by causing the non-orphaned I/O ports 128 to impersonate the orphaned I/O ports 128. DM-B 114B may cause the non-orphaned I/O ports 128 to impersonate the orphaned I/O ports 128 in two ways: DM-B 114B may cause the non-orphaned I/O ports 128 to change their personality to the orphaned I/O ports' 128 personality, or DM-B 114B may cause the non-orphaned I/O ports 128 to add to their current personality the orphaned I/O ports' 128 personality.
Each of the I/O ports 128 has a unique ID for communicating with the host computers and disk drives, such as a unique world-wide name on a FibreChannel point-to-point link, arbitrated loop, or switched fabric network. The first impersonation technique (causing the non-orphaned I/O ports 128 to change their personality to the orphaned I/O ports' 128 personality) operates as follows. When DM-B 114B detects that DM-A 114A has failed, DM-B 114B reprograms one or more of the non-orphaned I/O ports 128 to communicate using the unique IDs previously used by the orphaned I/O ports. Consequently, the reprogrammed non-orphaned I/O ports 128 appear as the orphaned I/O ports, thereby continuing to provide data availability to the host computers and/or disk drives.
The second impersonation technique (causing the non-orphaned I/O ports 128 to add to their current personality the orphaned I/O ports' 128 personality) is referred to as “multi-ID” operation. When DM-B 114B detects that DM-A 114A has failed, DM-B 114B reprograms the non-orphaned I/O ports 128 to communicate using not only their previous unique IDs, but also the unique IDs of the orphaned I/O ports. Consequently, the non-orphaned I/O ports 128 also appear as the orphaned I/O ports, thereby continuing to provide data availability.
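The difference between the two impersonation techniques can be shown with a small sketch. The `IOPort` class, the function names, and the ID strings are placeholders invented for illustration; they do not represent real world-wide names or any actual controller programming interface.

```python
from dataclasses import dataclass, field


@dataclass
class IOPort:
    """Hypothetical model of an I/O port 128 and the unique IDs it answers to."""
    name: str
    ids: set = field(default_factory=set)


def change_personality(survivor, orphan):
    """First technique: the non-orphaned port gives up its own IDs and
    communicates using the IDs previously used by the orphaned port."""
    survivor.ids = set(orphan.ids)


def add_personality(survivor, orphan):
    """Second technique ("multi-ID"): the non-orphaned port keeps its own IDs
    and additionally communicates using the orphaned port's IDs."""
    survivor.ids |= orphan.ids


# Placeholder IDs, not real world-wide names.
orphan = IOPort("DG-A/126A/port0", {"id-orphan"})

first = IOPort("DG-A/126B/port0", {"id-survivor"})
change_personality(first, orphan)
print(sorted(first.ids))   # only the orphaned port's ID remains

second = IOPort("DG-A/126B/port1", {"id-survivor"})
add_personality(second, orphan)
print(sorted(second.ids))  # the port's own ID plus the orphaned port's ID
```

Under the first technique the port's previous identity is no longer presented, whereas under multi-ID operation both identities are presented on the same port.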
However, there are problems associated with each of these techniques. First, neither of the techniques is transparent to the host computers. That is, each technique may require the host computers to have the capability to begin transmitting I/O requests along a different path to the non-orphaned I/O ports 128, a technique referred to as “multi-pathing.” Furthermore, multi-ID operation is not supported in the FibreChannel point-to-point configuration, and for some users it is desirable to connect the host computers in a FibreChannel point-to-point configuration, rather than in an arbitrated loop or switched fabric configuration. Additionally, some FibreChannel switches do not support arbitrated loop mode, but only support point-to-point mode, with which multi-ID operation may not be used.
A still further problem with orphaned I/O ports is that data throughput is lost even assuming the surviving DM blade 114 is able to failover via non-orphaned I/O ports 128. During normal operation, the DM blades 114 and DG blades 116 operate in an active-active manner such that data may be transferred simultaneously between all the I/O ports 128 along all the local buses 112 and the memory 122, resulting in very high data throughput. However, a reduction in throughput may be a consequence of some of the I/O ports 128 being orphaned.
Therefore, what is needed is an apparatus and method for the surviving DM blade 114 to adopt the orphaned I/O ports 128.