1. Field of the Invention
The present invention relates to reliable electronic systems. More particularly, but without limitation, the present invention relates to highly reliable computer disk drive memory systems, wherein reliability is obtained through the use of redundant components.
2. Description of Related Art
Various types of computer memory storage units are used in data processing systems. A typical system may include one or more disk drives (e.g., magnetic, optical or semiconductor) connected to the system's central processing unit (“CPU”) through respective control devices for storing and retrieving data as required by the CPU. A problem exists, however, if one of the subsystems within the storage unit fails such that information contained in the storage unit is no longer available to the system. Such a failure may shut down the entire data processing system.
The prior art has suggested several ways of solving the problem of providing reliable data storage. In systems where data records are relatively small, it is possible to use error correcting codes (“ECC”) which are appended to each data record within a storage unit. With such codes, it is possible to correct a small amount of data. However, such codes are generally not suitable for correcting or recreating long records which are in error, and provide no remedy at all for the complete failure of an entire disk drive, for example. Therefore, a need exists for providing data reliability external to individual disk drives.
Redundant disk array systems provide one solution to this problem. Various types of redundant disk array systems exist. In a paper entitled “A Case for Redundant Arrays of Inexpensive Disks (RAID)”, Proc. ACM SIGMOD, June 1988, Patterson et al. cataloged a number of different types of disk arrays and defined five distinct array architectures under the acronym “RAID,” for Redundant Array of Inexpensive Disks.
A RAID 1 architecture involves the use of duplicate sets of “mirrored” disk drives, i.e., keeping duplicate copies of all data on pairs of disk drives. While such a solution partially solves the reliability problem, it almost doubles the cost of data storage. Also, once one of the mirrored drives fails, the RAID 1 architecture can no longer withstand the failure of a second mirrored disk while still maintaining data availability. Consequently, upon the failure of a disk drive, the RAID 1 user is at risk of losing data.
Such systems as those described above have been designed to be easily serviceable, so as to help minimize the total amount of time required for detecting the failed disk drive, removing and replacing the failed drive, and copying data from the remaining functional disk to the replacement disk to again provide a redundant storage system. Nevertheless, in some circumstances where a customer detecting a failed disk drive must secure the assistance of a service engineer, the time elapsed from detection of the failure to complete data redundancy can be as long as twenty-four hours or more. During all this time, the user is exposed to the possibility of data loss if the sole remaining mirrored disk drive fails.
In an attempt to reduce this “window of vulnerability,” some manufacturers have equipped their storage array disk drive products with a spare disk drive. The spare disk drive is not used during normal operation. However, such systems are designed to automatically detect the failure of a disk drive and to automatically replace the failed disk drive with the spare disk drive. As a practical matter, replacement usually occurs by automatically turning off the failed drive and logically replacing the failed drive with the spare drive. For example, the spare drive may be caused to assume the logical bus address of the failed drive. Data from the functioning disk is then copied to the spare disk. Since this automatic failure detection and replacement process can typically be accomplished within a fairly short period of time (on the order of minutes), the window of vulnerability for automated systems is greatly reduced. Such techniques are known in the disk drive industry as “hot sparing.”
Immediately following the hot sparing process, the disk array system, although fault tolerant, can no longer sustain two disk failures while maintaining data availability. Therefore, the degree of fault tolerance of the system is compromised until such time as the customer or a service engineer physically removes the failed disk drive and replaces the failed disk drive with an operational disk drive.
As previously mentioned, in addition to RAID 1, there are also RAID levels 2-5. Although there are significant differences between the various RAID levels, each involves the technique of calculating and storing encoded redundancy values, such as Hamming codes and parity values, for a group of disks on a bit-per-disk basis. With the use of such redundancy values, a disk array can compensate for the failure of any single disk simply by reading the remaining functioning disks in the redundancy group and calculating what bit values would have to be stored on the failed disk to yield the correct redundancy values. Thus, N+1 RAID (where N=total number of disks containing data in a single redundancy group) can lose data only if there is a second disk failure in the group before the failed disk drive is replaced and the data from the failed drive recreated on the replacement disk.
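The reconstruction described above can be illustrated, for the parity case only, with a minimal sketch. It assumes simple XOR parity over equal-length blocks; the function names `parity_block` and `rebuild_missing` and the sample byte values are hypothetical and chosen for illustration, and the sketch does not model Hamming codes or any particular RAID level's data layout.

```python
from functools import reduce


def parity_block(blocks):
    """Return the byte-wise XOR of equal-length blocks (illustrative parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recompute the block of the one failed disk from the survivors plus parity.

    Because XOR is its own inverse, XOR-ing the surviving data blocks with the
    stored parity yields the bits that must have been on the failed disk.
    """
    return parity_block(list(surviving_blocks) + [parity])


# Hypothetical redundancy group: N = 3 data disks plus 1 parity disk.
data = [b"\x0f\xa0", b"\x33\x55", b"\xf0\x0f"]
parity = parity_block(data)

# Simulate the loss of the second data disk and rebuild its contents.
recovered = rebuild_missing([data[0], data[2]], parity)
```

In this sketch a second failure before the rebuild completes leaves two unknown blocks and only one parity equation, which is why an N+1 group can lose data in that case.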
Redundant disk storage increases overall data availability for data stored on the memory system. However, failure of other parts of the memory system can also compromise data availability. For example, failure of the power, cooling or controller subsystems forming part of the computer memory storage unit may cause stored data to become unavailable.
Redundant power systems are known wherein a single disk array is provided with two power supply subsystems, each being capable of powering the entire array. During normal operation, each power supply supplies one-half of the overall power requirements of the array. However, upon the failure of either power supply, the remaining power supply provides all power to the array until the failed power supply is replaced.
Similarly, redundant cooling systems are also known. For example, two fans may normally cool the entire disk array system. Upon the failure of either fan, the rotational speed of the remaining fan increases so that the remaining fan maintains the temperature of the system within tolerable limits until such time as the defective fan is replaced.
Array Technology Corporation of Boulder, Colorado has offered dual controller RAID storage systems to the commercial market. Upon the failure of one of the dual RAID controllers, the host CPU can continue to access data stored on any disk of the array through the other controller. Thus, the Array Technology Corporation disk array system can tolerate the failure of a single controller without compromising data availability.
During recent years, the cost of the physical components for disk drive systems has been decreasing. However, the cost of labor, and in particular the cost for service, is increasing and can be expected to continue to increase. In fact, over the commercially useful life of a disk drive system (typically about 5-10 years), service costs can be expected to meet or exceed the initial purchase price of the system.
Many highly available redundant disk storage systems are designed such that the components which are subject to failure can be easily removed and replaced, either by the customer or a field engineer. Unfortunately, however, building a disk storage system wherein components are serviceable significantly increases the design and manufacturing costs and hence the cost to the customer. For example, serviceable components must be built with more expensive blind-mateable plugs and sockets for electrically interconnecting parts where such connectors are not easily accessible; for highest availability, the overall system must be designed to allow removal and installation of such components without shutting down the system; power interlocks must be installed; and so on. It is well known to computer engineers that building such serviceable systems increases the cost of design and manufacture.
In view of the above, it is clear that there exists the need for a computer memory storage unit which has at least the reliability of current highly reliable memory systems, but which, from a commercial standpoint, never needs repair. If the need for repair during the commercially useful life of the product can be eliminated, then the product, even if more expensive to purchase initially, could still provide the customer with a lower total cost of ownership. The present invention fills this need.
In the commercial disk array storage system arena, relatively high reliability is routinely achieved in all memory subsystems, including data storage, power, cooling, controllers and interfaces. Although highly reliable, components nevertheless do occasionally fail and, as previously mentioned, service costs are increasing and are expected to continue to increase. However, by using: (a) redundant and/or spare subsystems; (b) automatic fault detection; (c) mechanisms for automatically swapping a redundant or spare subsystem for a failed subsystem; and (d) where appropriate, circuits which automatically compensate for the failure of any one of the multiple redundant subsystems, computer disk storage memory units can be manufactured which may be expected (statistically) to experience a failure which renders data unavailable only on the order of about once in one million years. In other words, if one million such disk-based memory units were sold, only one would be expected to fail in such a way as to make data unavailable in each year of the commercially useful lifespan of the product. More importantly, the manufacturer can therefore reasonably guarantee to its customers, with a very high degree of certainty, that, during some commercially reasonable period of time, the units simply will not fail and that data stored thereon will be continuously available.
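The arithmetic behind the fleet-level statement above can be sketched as follows. The sketch assumes a constant, independent failure rate per unit, which is a simplifying assumption not stated in the text; the function name `expected_failures_per_year` is hypothetical.

```python
def expected_failures_per_year(units_in_service, mean_years_between_failures):
    """Expected number of data-unavailability failures per year across a fleet.

    Assumes each unit fails independently at a constant rate of
    1 / mean_years_between_failures per year (an illustrative assumption).
    """
    return units_in_service / mean_years_between_failures


# One million units, each expected to fail about once in one million years.
fleet_failures = expected_failures_per_year(1_000_000, 1_000_000)
```

Under these assumptions the fleet-wide expectation is one failure per year, matching the figure given in the text.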
Typical disk drive memory systems can be broken down into several functional subsystems, including a disk subsystem, power subsystem, cooling subsystem and controller subsystem. For present purposes a “functional subsystem” is defined to mean any group of components subject to failure during normal operation and during the commercially anticipated useful lifetime of the overall unit. For example, a disk drive is a functional subsystem subject to failure, whereas the cabinet enclosing the disk drive is not.
From a device standpoint, the present invention achieves the goal of continuously available data by making all functional subsystems of the disk storage system redundant. This includes the provision of backup and spare subsystems for each primary functional subsystem, fault detectors capable of detecting and/or compensating for a failure in either the primary or backup subsystems and, where necessary, circuits for automatically swapping into service one of the two remaining subsystems for a failed one of the redundant subsystems.
From a method standpoint, the present invention includes the steps of providing a storage system (such as a magnetic disk array) including primary functional subsystems, each of which has a backup and spare subsystem. Upon the failure of any primary subsystem, the failure may be detected and, substantially immediately, the backup system takes over the function of the failed primary subsystem. The spare subsystem is then integrated into the overall system to take over the functions of the redundant backup subsystem. Alternatively, or in addition, the invention includes the steps of increasing the output of the remaining functional subsystems to compensate for the failure of any of the primary, backup or spare redundant subsystems.
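The takeover sequence described above can be sketched as a small state model. This is a minimal illustration only, assuming that fault detection has already occurred and is reported by calling `fail`; the class name `TripleRedundantSubsystem` and its methods are hypothetical and do not represent the circuitry of the invention.

```python
class TripleRedundantSubsystem:
    """Illustrative model of a primary/backup/spare functional subsystem."""

    def __init__(self, name):
        self.name = name
        # Ordered by role: the first healthy unit is in service,
        # the second (if any) stands by as the redundant backup.
        self.healthy = ["primary", "backup", "spare"]

    def in_service(self):
        """Unit currently performing the function, or None if all have failed."""
        return self.healthy[0] if self.healthy else None

    def standby(self):
        """Unit currently acting as the redundant backup, or None."""
        return self.healthy[1] if len(self.healthy) > 1 else None

    def fail(self, unit):
        """Record a detected failure; the remaining units shift up one role."""
        self.healthy.remove(unit)
        return self.in_service()


controller = TripleRedundantSubsystem("controller")
after_first_failure = controller.fail("primary")   # backup takes over
standby_after_first = controller.standby()         # spare becomes the backup
after_second_failure = controller.fail("backup")   # spare takes over
```

In this model the subsystem remains redundant after the first failure and remains operational after the second, which is the property the triplicated arrangement is intended to provide.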
Disk array controller subsystems direct the storage and retrieval of data to the disks of the disk array and also perform an interface function between the host CPU and the array of disks providing memory to such host CPU. According to the present invention, the controllers are provided in triplicate, i.e., primary, backup and spare controllers. During normal operation, the primary controller directs the writing and reading of data to and from the disk array in the conventional manner. The backup and spare controllers are connected for communication with the host CPU and, upon detection of a failure of the primary controller by the host CPU, the host CPU directs the backup controller to take over the control and interface function. Similarly, upon detection by the host CPU of a failure of the backup controller, the spare controller is directed to assume the array control and interface functions.
Unlike the disk and controller subsystems, which are digital in nature, some subsystems, such as cooling and power, are analog in nature. According to the present invention, the analog subsystems are also provided in triplicate (i.e., primary, backup and spare). During operation, these may be used in such a manner that one of the three subsystems provides the entire required function and, upon failure of the primary subsystem, the backup subsystem is swapped in for the primary subsystem and the spare subsystem then takes the place of the backup subsystem.
Alternatively, because of the analog nature of these subsystems, all three power or cooling subsystems may simultaneously share the total load. For example, when all three fans are operational, each provides one-third of the total cooling. Upon the failure of the primary fan, the remaining two fans increase rotational speed so that each provides one-half of the total cooling. Upon a second fan failure, the remaining fan provides all the cooling needs of the system.
Similarly, power supply subsystems can be configured so that each supplies a portion of the total power needs of the computer memory storage unit. Upon the failure of any one of the three power supplies, each of the two remaining power supplies provides one-half of the total power requirements of the overall memory unit. Upon a second failure, the sole remaining power supply provides all of the total power requirements.
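The load-sharing arithmetic for the fan and power supply subsystems can be sketched as follows. The sketch assumes that the load is divided equally among the units still operating; the function name `share_per_unit` is hypothetical.

```python
from fractions import Fraction


def share_per_unit(total_units=3, failed_units=0):
    """Fraction of the total load carried by each unit still operating.

    Assumes an equal split among the surviving units (illustrative only).
    """
    remaining = total_units - failed_units
    if remaining <= 0:
        raise ValueError("no operating units remain to carry the load")
    return Fraction(1, remaining)


# Share carried by each surviving unit after 0, 1 and 2 failures.
shares = [share_per_unit(failed_units=n) for n in range(3)]
```

With three units the shares are one-third, one-half and the whole load in turn, so each unit must be rated to carry the full load alone for the arrangement to tolerate two failures.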
The present invention may decrease the total cost of ownership of a computer memory storage unit over the commercial lifetime of the unit. The triple redundancy of all functional subsystems of a disk array system may initially appear to increase cost. However, the present inventor has calculated that, surprisingly, the total cost of ownership of a disk array system can be reduced by the appropriate use of triply redundant components and fault detection and/or compensation circuitry, as described in greater detail hereinafter. The cost savings is achieved and a useful advance in the art realized because the additional costs incurred in integrating triply redundant components into the memory system may be more than offset by the reduction in service calls to customer sites and avoidance of the previously described costs associated with engineering a field-serviceable product.
The above and other advantages of the present invention will become more fully apparent when the following detailed descriptions of the invention are read in conjunction with the accompanying drawings.