A current trend in the design of electrical systems, and especially data storage systems, is toward modular system configurations wherein individual electrical units of the system are readily accessible and in some cases customer removable. The use of modular designs provides a number of advantages. Manufacture and assembly are made simpler in that each unit can be manufactured and tested separately before being assembled into the complete system. Furthermore, if a removable unit becomes defective, it can be readily removed for repair and replaced with a working unit. A typical multicomponent system of this type is a computing system in which data storage devices, processing hardware, power supplies and cooling fans are contained within a single support structure.
Although a modular configuration makes individual devices easier to remove, the removal and replacement of a device usually requires the system to be shut down, thus reducing the amount of time for which the system is available. Systems are beginning to come onto the market which allow for concurrent maintenance of various devices within the system. In this way a defective device can be removed for maintenance or replacement whilst the remaining elements of the system continue to operate.
Taking the example of a disk file data storage system comprising removable disk files and associated power and cooling units, systems are currently available which allow for the replacement of one or more of the disk files while maintaining operation of the remaining disk files. Furthermore, EP-A-617 570 describes a data storage system including replaceable cooling and power assemblies, the system housing being configured to permit removal of these assemblies without the need to remove the disk files. Although ease of access to the various subassemblies within a modular electronic system is a prerequisite to achieving the desired aim of concurrent maintainability, it is also necessary to build redundancy into the system so that removal of a defective device providing a life support function to the system, e.g., a power supply or cooling assembly, does not result in a shortage of power or overheating of the remaining devices. Systems are known which include a redundant array of cooling fans, where N fans are required to cool the system and therefore N+1 are fitted. One such system is described in EP-A-617 570. In the event of failure of one of the fans, the system can continue to operate while the defective fan is removed for repair or replacement.
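The N+1 arrangement described above can be illustrated with a minimal sketch. The function names and the example value N = 3 are assumptions chosen for illustration only; they are not taken from EP-A-617 570 or from any particular product.

```python
def fans_fitted(required: int, spares: int = 1) -> int:
    """Number of fans installed when `required` fans are needed (N + spares)."""
    return required + spares


def tolerable_failures(required: int, fitted: int) -> int:
    """How many fans may fail before cooling becomes inadequate."""
    return max(fitted - required, 0)


def can_keep_running(required: int, working: int) -> bool:
    """True while at least `required` fans are still working."""
    return working >= required


# Hypothetical example: N = 3 fans are needed, so N + 1 = 4 are fitted.
N = 3
fitted = fans_fitted(N)

# One fan fails and is pulled for replacement: 3 remain, which is still enough.
one_removed = can_keep_running(N, fitted - 1)

# A second failure before the first fan is replaced leaves only 2 working fans.
two_removed = can_keep_running(N, fitted - 2)
```

Under these assumptions the system tolerates exactly one fan failure, which is why the defective fan should be replaced before a second failure occurs.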
Fault tolerance is another desirable goal in today's high-availability computer systems and networks. Fault tolerance is especially important in disk storage subsystems to ensure continuous availability of customer data, even in the event of failure of one of the components of the subsystem. Disk failure is catered for by the well-known RAID (Redundant Array of Independent Disks) architecture. It is, however, a continuing technical challenge to design modular electronic systems, and in particular data storage subsystems, which achieve fault tolerance of components and electronics other than the disk drives.