The present invention relates to methods and systems for enhancing the reliability of storage systems and, in particular, to a method and system that allows storage devices to be directly interconnected with multiple controller cards in order to eliminate the need for a midplane circuit card that represents a single point of failure within the storage system.
With the decrease in cost and increase in capacity of mass storage devices, increase in data-storage and data-access requirements of computer systems and software applications, and increase in the bandwidth of interconnection technologies, such as the fibre channel, large storage subsystems that are interconnected with one or more remote computers via a communications medium are becoming increasingly popular data storage solutions in the computing industry. Storage subsystems may provide far higher data storage capacities than storage devices directly included within computer systems, and provide shared access to large volumes of data to many different remote computer systems. In addition, mass storage subsystems can be centrally located in secure facilities with multiple, independent communications media interconnections, fail-over power generation facilities, and geographic isolation from various natural and man-made hazards in order to provide better security for the stored data.
As modern computer applications and computer systems have grown more dependent on the security of data stored in storage subsystems, much research and development effort has been devoted to improving and enhancing the internal reliability of mass storage subsystems. A powerful technique for enhancing reliability, commonly applied in the development of high-availability storage subsystems, is to identify single points of failure within the storage subsystems and to eliminate each single point of failure by substituting, for a single component, a number of redundant components, each of which can assume the full operational load under fail-over conditions when another of the redundant components fails.
FIG. 1 illustrates redundant interconnection of data storage devices to communications controllers within a storage subsystem. Note that FIG. 1 is a highly simplified representation of a storage system, and omits a great many components not required for illustration of the redundant interconnection of data storage devices to a communications controller. The storage subsystem 100 includes eight data storage devices 102-109, commonly magnetic disk drives. The data storage devices 102-109 are electronically connected to a midplane circuit board 110. The midplane circuit board provides data, control signal, and power interconnection with input/output (“I/O”) controllers and power supplies. Two redundant I/O controller cards 112 and 113 are also connected to the midplane circuit board 110 via data, control signal, and power lines. I/O controller cards 112 and 113 are additionally interconnected with one or more communications media, such as fibre channels, via fibre channel adaptors 114-117. In certain implementations, the two fibre channel adaptors on an I/O controller card may serve as redundant connections to a single fibre channel. In other implementations, the two fibre channel adaptors may serve to daisy chain the data storage subsystem into a larger arbitrated loop. The I/O controller cards 112-113 implement communications protocols and I/O bus protocols to transfer data and commands from the communications media to the data storage devices 102-109 and to transfer data and command execution status information from the data storage devices 102-109 to the communications media. The storage subsystem illustrated in FIG. 1 is highly available because the data stored in the data storage devices 102-109 can be accessed by remote computers after a complete failure of either of the I/O controller cards 112 or 113 or failure of either of two communications media interconnecting the data storage devices 102-109 with remote computers.
By including redundant I/O controller cards 112-113, the highly available storage subsystem illustrated in FIG. 1 has eliminated a single point of failure present in previous storage subsystems that included only a single I/O controller card for interconnecting the data storage devices of the storage subsystem to a communications medium.
However, consideration of the storage subsystem illustrated in FIG. 1 reveals a remaining single point of failure, namely the midplane circuit card 110. The midplane circuit card is a relatively passive device, generally lacking active electronic components and lacking mechanical components other than multi-pin adaptors that mate with complementary adaptors of the I/O controller cards and data storage devices. Although reasonably reliable, midplane circuit cards can nevertheless fail for a variety of reasons, including electrical or mechanical damage that may occur during insertion of data storage devices into, and removal of data storage devices from, the storage subsystem. For this reason, designers and manufacturers of highly available storage subsystems have recognized the need for a method and system for eliminating the single point of failure represented by a midplane circuit card within a highly available storage subsystem.
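By way of illustration only, the single-point-of-failure analysis described above can be sketched as a reachability check over a simplified model of the data paths of FIG. 1. The component names, the assumption that each I/O controller card attaches to its own communications medium, and the omission of power paths are simplifications introduced for this sketch and are not part of the described subsystem.

```python
from collections import deque


def build_fig1_topology():
    """Build an undirected graph of the simplified FIG. 1 data paths.

    All names are hypothetical labels derived from the reference numerals.
    Assumption: each I/O controller card attaches to its own medium.
    """
    drives = [f"drive_{n}" for n in range(102, 110)]
    media = ["medium_a", "medium_b"]
    links = [(d, "midplane_110") for d in drives]
    links += [("midplane_110", "io_card_112"), ("midplane_110", "io_card_113")]
    links += [("io_card_112", "medium_a"), ("io_card_113", "medium_b")]

    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph, drives, media


def reachable(graph, start, failed):
    """Return every node reachable from start when one component has failed."""
    if start == failed:
        return set()
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt != failed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def single_points_of_failure(graph, drives, media):
    """List internal components whose lone failure isolates every drive."""
    media_set = set(media)
    spofs = []
    for component in sorted(graph):
        if component in drives or component in media_set:
            continue  # only interconnect components are candidates here
        if all(not (reachable(graph, d, component) & media_set) for d in drives):
            spofs.append(component)
    return spofs


if __name__ == "__main__":
    graph, drives, media = build_fig1_topology()
    print(single_points_of_failure(graph, drives, media))
```

Under these assumptions, failure of either I/O controller card leaves every data storage device reachable through the other card, whereas failure of the midplane leaves none reachable, so the midplane is the only component the check reports.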
The present invention provides a method and system for enhancing the reliability and availability of a storage subsystem by eliminating the midplane circuit card that electrically interconnects data storage devices within a currently available storage subsystem to I/O controller cards and power supplies. In one embodiment of the present invention, data storage devices are equipped with two adaptors complementary to adaptors directly affixed to I/O controller cards. In this embodiment, data storage devices having dual adaptors can be directly coupled to two I/O controller cards, without the need for an intermediary midplane circuit card. In an alternative embodiment of the present invention, an older style data storage device having a single midplane-circuit-card adaptor is fitted with an I/O adaptor card having a complementary adaptor that mates with the single midplane-circuit-card adaptor of the older style data storage device and having two I/O controller-card adaptors that mate with complementary adaptors directly affixed to two I/O controller cards. In both embodiments, each I/O controller card includes two power adaptors that connect the I/O controller card with one of two power supplies. In both embodiments, a data storage device can receive all necessary data, control signals, and power via either I/O controller-card adaptor and, when both I/O controller cards are operational, receives data, control signals, and power from both I/O controller cards. Thus, in both embodiments of the present invention, the single point of failure represented by a midplane circuit card currently used to interconnect data storage devices with I/O controller cards and power supplies is eliminated, increasing the reliability and availability of the storage subsystem as a whole. Additional design efficiencies, manufacturing efficiencies, and cost benefits may also accrue from elimination of the midplane circuit card.
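The fail-over behavior summarized above can be illustrated, in a non-limiting way, by enumerating component failures for a single data storage device coupled directly to two I/O controller cards. The sketch assumes that each I/O controller card is connected to both power supplies through its two power adaptors, that an operational card can draw power from any operational supply, and that the device is served whenever at least one card and at least one supply are operational. The component names are hypothetical.

```python
from itertools import combinations

# Hypothetical component labels for one directly connected storage device.
CARDS = ("io_card_a", "io_card_b")
SUPPLIES = ("power_supply_1", "power_supply_2")


def drive_is_served(failed):
    """Return True if the device still receives data, control, and power.

    Assumption: every card reaches both supplies, so the device is served
    when at least one card and at least one supply remain operational.
    """
    card_ok = any(card not in failed for card in CARDS)
    power_ok = any(supply not in failed for supply in SUPPLIES)
    return card_ok and power_ok


def fatal_failure_sets(size):
    """List every set of `size` simultaneous failures that stops service."""
    components = CARDS + SUPPLIES
    return [
        set(combo)
        for combo in combinations(components, size)
        if not drive_is_served(set(combo))
    ]


if __name__ == "__main__":
    print("fatal single failures:", fatal_failure_sets(1))
    print("fatal double failures:", fatal_failure_sets(2))
```

In this model no single failure interrupts service. Service is lost only when both I/O controller cards or both power supplies fail together.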