As is known in the art, large host computers and servers (collectively referred to herein as “host computer/servers”) require large capacity data storage systems. These large computer/servers generally include data processors, which perform many operations on data introduced to the host computer/server through peripherals including the data storage system. The results of these operations are output to peripherals, including the storage system.
One type of data storage system is a magnetic disk storage system. Here a bank of disk drives and the host computer/server are coupled together through an interface. The interface includes “front-end” or host computer/server controllers (or directors) and “back-end” or disk controllers (or directors). The interface operates the controllers (or directors) in such a way that they are transparent to the host computer/server. That is, data is stored in, and retrieved from, the bank of disk drives in such a way that the host computer/server merely thinks it is operating with its own local disk drive. One such system is described in U.S. Pat. No. 5,206,939, entitled “System and Method for Disk Mapping and Data Retrieval”, inventors Moshe Yanai, Natan Vishlitzky, Bruno Alterescu and Daniel Castel, issued Apr. 27, 1993, and assigned to the same assignee as the present invention.
As is also known in the art, the interface is typically housed in a cabinet such as described in U.S. Pat. No. 6,914,784 issued Jul. 5, 2005 entitled Data Storage System Cabinet, inventors Chilton et al., assigned to the same assignee as the present invention. As described therein, the cabinet has a plurality of rack mountable chassis. One portion of such chassis has directors and electrically interconnected memory and another portion of such chassis has a plurality of disk drives. The chassis are electrically interconnected to provide the data storage system interface. A first set of the chassis includes a memory and a plurality of directors and a second set of the chassis includes the disk drives. The disk drive chassis, which include an M×N array of disk drives, where M represents the columns and N represents the rows of the array, have typically been arranged in one of a pair of configurations. A common topology in which the disk drives are interconnected is through a serial interconnect, such as, for example, a fibre-channel loop, switched-loop, SAS or serial-ATA point-to-point connection. The serial interconnect provides the necessary connection between the host disk controller and the individual disk drives within the disk drive chassis and includes an interconnect control card. The interconnect control card includes associated circuitry that provides the interconnect and includes, but is not limited to: a daisy-chained series of port-bypass circuits (PBCs) in the case of a fibre-channel arbitrated loop; a crossbar-type switch interconnect for making a direct connection between the storage controller and the disk drive in the case of a fibre-channel switched-loop; or a series of one or more expander-type switches in the case of a Serial Attached SCSI (SAS) or a serial-ATA point-to-point network.
One fibre channel loop is described in U.S. Pat. No. 6,571,355, inventor Thomas Linnell, entitled Fibre Channel Data Storage System Fail Over System, issued May 27, 2003, assigned to the same assignee as the present invention, incorporated herein by reference.
In one configuration, a RAID group is organized within a single column M, where members of the RAID group are located at various points along the N dimension. These are typically placed within a single chassis, although they may span different chassis. In high-availability designs, the disk drives are connected to two separate pathways to controllers via a single, shared, or common, backplane or midplane, as shown in FIGS. 1A, 1B and 1C, where the disk drives plug into one side of the midplane and the LCC, including power and cooling units, plugs into the opposite side of the midplane. This midplane therefore represents a single point of failure for the RAID group. To alleviate this problem, designers have resorted to striping members of the RAID set (herein sometimes referred to as a RAID group) across multiple M channels, as shown in FIGS. 2A, 2B and 2C, each with a separate disk drive enclosure and midplane, with each bit of each word of data, plus Error Detection and Correction (EDC) bits for each word, being stored on separate disk drives within a different one of the chassis, as indicated. This eliminates the single point of failure; however, it requires that all of the hardware be replicated M times 2 in order to create a high-availability system. To make each chassis less costly on a per drive basis, the number of disk drives is maximized, which leads to high impact failures if a midplane fails, since a member of many different RAID groups is affected (N), thereby exposing more of the system to additional independent faults that could cause an outage.
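The trade-off between the two configurations can be illustrated with a minimal sketch. It is a simplified model and not taken from the referenced patents: it assumes each of the M columns sits behind exactly one chassis midplane, and the function names (`column_layout`, `striped_layout`, `members_lost`) and the example sizes are hypothetical, chosen only for illustration.

```python
# Illustrative model only: each column index stands for one chassis/midplane.
# Drives are identified by (column, row) positions in the M x N array.

def column_layout(m_cols, n_rows):
    """First configuration: each RAID group lives entirely in one column."""
    return {g: [(g, row) for row in range(n_rows)] for g in range(m_cols)}

def striped_layout(m_cols, n_rows):
    """Second configuration: each RAID group has one member in every column."""
    return {g: [(col, g) for col in range(m_cols)] for g in range(n_rows)}

def members_lost(layout, failed_column):
    """Count, per RAID group, the members behind the failed midplane."""
    return {
        group: sum(1 for (col, _row) in drives if col == failed_column)
        for group, drives in layout.items()
    }

if __name__ == "__main__":
    M, N = 4, 15  # hypothetical array dimensions
    print("column layout :", members_lost(column_layout(M, N), failed_column=0))
    print("striped layout:", members_lost(striped_layout(M, N), failed_column=0))
```

Under these assumptions, a midplane failure in the first layout removes every member of one RAID group, while in the striped layout it removes a single member from each of the N groups. No group is lost outright, but all N groups are left exposed to a further independent fault.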