1. Field of the Invention
This invention relates to fault tolerant power supplies and more particularly to fault tolerant power supplies for supplying electrical power to a redundant array of data storage units.
2. Description of Related Art
A typical data processing system generally includes one or more storage units connected to a Central Processor Unit (CPU) either directly or through a control unit. The storage units store data and programs which the CPU uses in performing particular data processing tasks.
Various types of storage units are used in current data processing systems. A typical system may include one or more large capacity tape units and/or disk drives (magnetic, optical, or semiconductor) connected to the system through respective control units for storing data. In such systems, a problem exists if one of the large capacity storage units fails such that information contained in that unit is no longer available to the system. Often, such a failure will shut down the entire computer system. Such a failure may arise from a defect in the storage unit, or from a fault in the power supply for the storage unit. Therefore, among other things, it is critical for the power supply for such data storage units to be fault tolerant. It is also critical to be able to recover the data stored in a data storage unit if the storage unit fails for any reason, including loss of power.
The prior art has suggested several ways of solving the problem of providing reliable data storage. In systems where records are relatively small, it is possible to use error correcting codes which generate error correction code (ECC) syndrome bits that are appended to each data record within a storage unit. With such codes, it is possible to correct a small amount of data that may be read erroneously. However, such codes are generally not suitable for correcting or recreating long records which are in error, and provide no remedy at all if a complete storage unit fails. Therefore, a need exists for providing data reliability external to individual storage units.
A number of approaches to such "external" reliability have been described in the art. A research group at the University of California, Berkeley, in a paper entitled "A Case for Redundant Arrays of Inexpensive Disks (RAID)", Patterson, et al., Proc. ACM SIGMOD, June 1988, has catalogued five different approaches for providing such reliability when using disk drives as failure-independent storage units. Arrays of disk drives are characterized in one of five architectures, under the acronym "RAID" (for Redundant Arrays of Inexpensive Disks).
A RAID 1 architecture involves providing a duplicate set of "mirror" storage units and keeping a duplicate copy of all data on each pair of storage units.
A RAID 2 architecture stores each bit of each word of data, plus Error Detection and Correction (EDC) bits for each word, on separate disk drives. For example, U.S. Pat. No. 4,722,085 to Flora et al. discloses a disk drive memory using a plurality of relatively small, independently operating disk subsystems to function as a large, high capacity disk drive having an unusually high fault tolerance and a very high data transfer bandwidth. A data organizer adds 7 error detection and correction bits (determined using the well-known Hamming code) to each 32-bit data word to provide error detection and error correction capability. The resultant 39-bit word is written, one bit per disk drive, onto 39 disk drives.
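The 32 + 7 = 39 bit count quoted above can be checked with a short calculation. The following is only a sketch: it assumes the 7 check bits correspond to a single-error-correcting Hamming code extended with one overall parity bit for double-error detection, which is a common construction but is not stated in the text above. The function name `hamming_check_bits` is hypothetical.

```python
def hamming_check_bits(data_bits, double_error_detect=True):
    """Number of check bits for a Hamming code over `data_bits` data bits.

    A single-error-correcting code needs the smallest r such that
    2**r >= data_bits + r + 1. Assumption: one extra overall parity
    bit is added when double-error detection is wanted.
    """
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r + 1 if double_error_detect else r


check_bits = hamming_check_bits(32)   # check bits for a 32-bit word
total_drives = 32 + check_bits        # one drive per stored bit
print(check_bits, total_drives)
```

Under that assumption, a 32-bit word needs 6 correction bits plus 1 detection bit, which matches the 7 bits and 39 drives described.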
A RAID 3 architecture is based on the concept that each disk drive storage unit has internal means for detecting a fault or data error. Therefore, it is not necessary to store extra information to detect the location of an error; a simpler form of parity-based error correction can thus be used. In this approach, the contents of all storage units subject to failure are "Exclusive OR'd" (XOR'd) to generate parity information. The resulting parity information is stored in a single redundant storage unit. If a storage unit fails, the data on that unit can be reconstructed onto a replacement storage unit by XOR'ing the data from the remaining storage units with the parity information.
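The parity scheme described above can be illustrated with a minimal sketch. The helper names `xor_blocks` and `reconstruct` and the sample byte values are hypothetical; the sketch only shows that XOR'ing the surviving units with the parity unit recovers the lost unit.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length byte strings."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) groups the i-th byte of every block together.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity):
    """Rebuild the one missing block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Three hypothetical data units and the parity unit derived from them.
data_units = [b"\x0f\xa5\x00", b"\xf0\x5a\xff", b"\x33\x33\x33"]
parity_unit = xor_blocks(data_units)

# Simulate the failure of the second unit and rebuild its contents.
rebuilt = reconstruct([data_units[0], data_units[2]], parity_unit)
print(rebuilt == data_units[1])
```

The same routine serves both purposes because XOR is its own inverse: the parity unit is the XOR of all data units, so XOR'ing it with all but one of them leaves the missing one.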
A RAID 4 architecture uses the same parity error correction concept as the RAID 3 architecture, but improves on the performance of a RAID 3 system with respect to random reading of small files by "uncoupling" the operation of the individual disk drive actuators, and reading and writing a larger minimum amount of data (typically, a disk sector) to each disk (this is also known as block striping). A further aspect of the RAID 4 architecture is that a single storage unit is designated as the parity unit.
A RAID 5 architecture uses the same parity error correction concept of the RAID 4 architecture and independent actuators, but improves on the writing performance of a RAID 4 system by distributing the data and parity information across all of the available disk drives.
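The difference between a dedicated parity unit (RAID 4) and distributed parity (RAID 5) can be sketched as a mapping from stripe number to the drive holding that stripe's parity. The rotation rule used for RAID 5 below is an illustrative assumption; the text above does not specify a particular layout, and the function name `parity_drive` is hypothetical.

```python
def parity_drive(stripe, n_drives, level):
    """Index of the drive that holds parity for a given stripe.

    RAID 4: parity always lives on one dedicated drive (here, the last).
    RAID 5: parity rotates across all drives (assumed rotation rule).
    """
    if level == 4:
        return n_drives - 1
    if level == 5:
        return (n_drives - 1 - stripe) % n_drives
    raise ValueError("only RAID levels 4 and 5 are modeled")


for level in (4, 5):
    layout = [parity_drive(s, 4, level) for s in range(4)]
    print("RAID", level, "parity drives for stripes 0-3:", layout)
```

With a dedicated parity drive, every write must update that one drive; rotating the parity location spreads those updates across the array, which is the write-performance improvement attributed to RAID 5.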
All of the RAID architectures use multiple data storage units. As discussed above, in addition to ensuring that the failure of one of the data storage units will not cause a loss of stored data, supplying the power requirements for a plurality of storage units is critical to proper operation of the data processing system. In many data processing systems, the supply of reliable power to the data storage units is ensured by use of redundant power supplies, each of which is capable of supplying the electrical power requirements of all the data storage units.
FIG. 1 and FIG. 1A illustrate such a prior art means for supplying power to an array of storage units. Dual power supplies 1 are summed together by summing circuit 6. Summing circuit 6 includes blocking diodes 2 that allow electrical current to flow in only one direction and a voltage regulator circuit 4. Blocking diodes are necessary to ensure isolation of the two power supplies during normal operation as well as during a failure condition. The blocking diodes ensure that current from one power supply does not enter the other. Without them, such a reverse current would flow whenever the output voltages of the two power supplies were not exactly identical (a match that is hard to achieve in practice over a broad range of operating conditions). If there is a power supply failure, such as a short circuit in a part of the power supply distribution circuitry on the anode side of the blocking diodes, none of the storage devices would be affected, due to the isolation provided by the blocking diodes. The storage devices would be supplied current by the operational power supply, because the blocking diode connected to the malfunctioning power supply is reverse biased. Each power supply 1 is designed such that it has the capability of providing the total required current at the proper voltage for an entire array of storage units 3. Typically, the power supplies 1 will also be responsible for supplying power to array controllers 11 that control the transfer of data between a CPU and each storage unit 3 of the array.
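The isolating behavior of the blocking diodes can be sketched with a deliberately simplified model. It assumes ideal diodes with a fixed forward drop and assumes that only the supply with the highest output conducts; in practice, closely matched supplies may share current. The function name `diode_or` and all voltage values are hypothetical.

```python
def diode_or(supply_volts, forward_drop=0.5):
    """Simplified diode-OR of several supplies onto one bus.

    Returns (bus_voltage, index_of_conducting_supply). Assumption: the
    supply with the highest output forward-biases its diode and every
    other diode is reverse biased, so no current flows back into it.
    """
    conducting = max(range(len(supply_volts)), key=lambda i: supply_volts[i])
    bus = max(supply_volts[conducting] - forward_drop, 0.0)
    return bus, conducting


# Normal operation: two slightly mismatched supplies.
print(diode_or([5.5, 5.25]))

# Supply 0 shorted on the anode side of its diode: its output collapses
# to 0 V, its diode is reverse biased, and supply 1 carries the load.
print(diode_or([0.0, 5.25]))
```

In both cases the bus stays powered; the failed supply is simply disconnected by its reverse-biased diode, at the cost of one diode drop between supply and load.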
A disadvantage that arises from use of this method of providing power to the array of storage units is that the combined potential power output when both power supplies are functional is twice as great as required to meet the power requirements of the array, since each power supply must be capable of independently supplying all of the required power if the other supply fails. That is, if there are N storage units in an array, each requiring W watts, each power supply must be capable of supplying N × W watts. Hence, the total capacity of the power supplies must be 2 × N × W watts. This excess potential is expensive and inefficient.
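The capacity arithmetic above can be worked through for a sample array. The figures used (10 storage units at 25 watts each) and the function name `redundant_capacity` are hypothetical and serve only to illustrate the calculation.

```python
def redundant_capacity(n_units, watts_per_unit, n_supplies=2):
    """Return (required, installed, excess) power in watts.

    Each fully redundant supply must carry the whole array by itself,
    so installed capacity is n_supplies times the required power.
    """
    required = n_units * watts_per_unit      # N x W
    installed = n_supplies * required        # 2 x N x W for dual supplies
    return required, installed, installed - required


required, installed, excess = redundant_capacity(n_units=10, watts_per_unit=25)
print(required, installed, excess)
```

For this sample array, 250 watts are needed but 500 watts must be installed, so half of the installed capacity is idle whenever both supplies are working.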
Furthermore, it is generally essential for the voltage supplied to the data storage units to remain constant. Since the voltage drop of the blocking diodes 2 varies as a function of the current through each diode, and the current requirements of each data storage unit 3 vary as a function of time, the voltage supplied to each storage unit 3 will vary as a function of time unless regulated.
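The dependence of the diode voltage drop on current can be illustrated with the ideal diode (Shockley) equation, V = n·V_T·ln(I/I_S + 1). This is a sketch only: the saturation current, ideality factor, and supply voltage below are assumed values, series resistance is ignored, and the function names `diode_drop` and `unit_voltage` are hypothetical.

```python
import math


def diode_drop(current_a, saturation_current_a=1e-9, ideality=1.0,
               thermal_voltage=0.02585):
    """Forward voltage of an ideal diode at a given current (Shockley model).

    All parameter defaults are illustrative assumptions, not values
    taken from any particular blocking diode.
    """
    if current_a < 0:
        raise ValueError("forward current must be non-negative")
    return ideality * thermal_voltage * math.log(
        current_a / saturation_current_a + 1.0)


def unit_voltage(supply_v, current_a):
    """Voltage reaching a storage unit after the blocking diode."""
    return supply_v - diode_drop(current_a)


# As a unit's current demand changes over time, so does its supply voltage.
for amps in (0.5, 1.0, 2.0):
    print(amps, "A ->", round(unit_voltage(5.5, amps), 3), "V")
```

Because the drop grows with the logarithm of the current, a unit that draws more current sees a lower voltage, which is why unregulated diode-coupled supplies do not hold a constant voltage at the load.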
One voltage regulator 4 dedicated to each storage unit 3 is provided to ensure that the proper voltage is maintained at each storage unit 3 (see FIG. 1A). Because voltage regulation circuits generally require an input voltage higher than the stable output desired, each redundant power supply must provide a voltage level higher than the voltage level that would be required if local voltage regulation were not needed. Therefore, the redundant power supplies must be larger than would be necessary in the absence of such local voltage regulators within each storage unit. Additionally, each regulation circuit will add to the overall cost of the system.
Alternatively, a sensing circuit that dynamically determines the amount of voltage lost due to the diodes may be used to provide feedback to an active voltage regulator circuit that adjusts the power supply output voltage to compensate for these losses. Such circuits are known. The obvious disadvantage to this approach is the need for additional circuitry.
Therefore, it is desirable to create a fault tolerant power supply system that has a power capability that need not be greater than the electrical power required by the sum of all the storage units during normal operation, and which has a voltage output just sufficient to meet the voltage requirements of the data storage units.