This invention relates to computer systems, and is concerned especially with computer systems of the kind that involve the interconnection or clustering of a multiplicity of processors.
It is known to form a computing system of the above-specified kind by interconnecting the processing units of a multiplicity of personal computers (PCs) and operating them in parallel with one another; such systems are sometimes referred to as ‘Beowulf clusters’. The central processing units (CPUs) of PCs provide significant computing power at relatively-low cost, and advantage has been taken of this to form systems of the above-specified kind having very high computing power comparable with that of a specially-designated supercomputer, at a fraction of the supercomputer-cost. In such systems a multiplicity of PC-CPUs are interconnected and operated in parallel with one another as separate nodes of a local area network. These systems using clustered CPUs require the development of special software to enable parallel operation, and are generally slower than their supercomputer counterparts, but have significant advantage economically.
The CPUs of PCs are not designed to have the extended reliability to be expected of a supercomputer, so computing systems of the known form involving clustered CPUs are, in comparison, susceptible to faults. A fault occurring in an individual CPU will disrupt processing of the current application, and although the application can in general be re-started without replacement of the faulty unit, the disruption and the loss of computing time involved are undesirable.
It is one of the objects of the present invention to provide a computer system of the said above-specified kind which, whilst having the potential for cost advantage of the known clustered PC-CPU systems, is less susceptible to fault disruption.
According to one aspect of the present invention there is provided a computer system of the said above-specified kind wherein power supply to each processor is from a common power-supply means having fault-tolerating redundancy.
The computer system of the present invention may, especially for cost-advantage, utilise processors of a form such as is used in PCs. However, in accordance with the present invention, rather than each processor being powered from its own power-supply unit, as in the known form of computer system referred to above utilising PC-CPUs, the processors are powered from common power-supply means. The power-supply units of PC-CPUs, especially, are not designed for long fault-free operation, so the likelihood of a fault arising in any one of a multiplicity of clustered PC-CPUs can be significantly high. The individual power-supply units might be replaced by units with a higher standard of reliability, but it is generally more economical to provide a common power-supply means and invest this with an even higher standard of reliability and, moreover, to include fault-tolerating redundancy within it.
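The reliability argument above can be illustrated with a short calculation. The following is a minimal sketch only, assuming that faults occur independently and using purely hypothetical figures (a per-unit fault probability of 0.05 over some interval, 16 processors, and a common supply that needs 2 of its 3 modules); none of these values, nor the function names, come from the specification.

```python
from math import comb


def p_any_failure(p_unit: float, n_units: int) -> float:
    """Probability that at least one of n independent individual
    power-supply units develops a fault (each with probability p_unit)."""
    return 1.0 - (1.0 - p_unit) ** n_units


def p_redundant_supply_down(p_module: float, n_modules: int, n_required: int) -> float:
    """Probability that a common supply built from n_modules parallel
    modules is left with fewer than n_required healthy modules."""
    p_ok = 1.0 - p_module
    # Sum the binomial probabilities of having k healthy modules, k < n_required.
    return sum(
        comb(n_modules, k) * p_ok ** k * p_module ** (n_modules - k)
        for k in range(n_required)
    )


if __name__ == "__main__":
    # Hypothetical figures, chosen only to show the shape of the comparison.
    print(f"16 individual units, any fault: {p_any_failure(0.05, 16):.4f}")
    print(f"common 2-of-3 supply down:      {p_redundant_supply_down(0.05, 3, 2):.5f}")
```

Under these assumed figures, the chance that some individual unit faults grows quickly with the number of clustered processors, whereas the common supply is disrupted only when several of its modules fault together.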
The processors of the computer system according to the invention may be carried by individual printed-circuit boards, for example PC motherboards, that are mounted together side-by-side within a rack-mounting. The rack-mounting may be contained within a cabinet together with the power-supply means.
The power-supply means may involve one or more power-supply units, each of which comprises a plurality of power-supply modules that operate in parallel with one another in supplying power to the processors. The modules may each include diode or other circuitry that is responsive to the occurrence of a fault within the module (e.g. a reduction in its voltage output relative to that of the other modules) to isolate that module effectively from the processors. Where more than one power-supply unit is involved, the units may act in parallel with one another to power all the processors together.
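The isolating behaviour of parallel modules feeding a common rail through diodes can be modelled in a simplified way. The sketch below is an idealised illustration, not the circuitry of the invention: it assumes each module feeds the rail through a single series diode with a fixed forward drop, and the names `Module`, `bus_voltage`, `conducting_modules`, the 0.5 V drop and the 0.05 V sharing tolerance are all hypothetical.

```python
from dataclasses import dataclass
from typing import List

DIODE_DROP_V = 0.5        # assumed forward drop of each series diode
SHARE_TOLERANCE_V = 0.05  # assumed band within which modules share the load


@dataclass
class Module:
    name: str
    output_v: float  # voltage at the module output, before its diode


def bus_voltage(modules: List[Module], diode_drop: float = DIODE_DROP_V) -> float:
    """Idealised rail voltage: the highest module output less one diode drop."""
    if not modules:
        raise ValueError("at least one module is required")
    return max(m.output_v for m in modules) - diode_drop


def conducting_modules(modules: List[Module],
                       tolerance_v: float = SHARE_TOLERANCE_V) -> List[str]:
    """Names of modules still supplying the rail. A module whose output has
    fallen below the others has its diode reverse-biased, so it is treated
    as isolated from the processors."""
    if not modules:
        raise ValueError("at least one module is required")
    top = max(m.output_v for m in modules)
    return [m.name for m in modules if top - m.output_v <= tolerance_v]


if __name__ == "__main__":
    # Module C is given a reduced output to represent a fault.
    supply = [Module("A", 12.0), Module("B", 12.0), Module("C", 9.0)]
    print(conducting_modules(supply))  # modules still sharing the load
    print(bus_voltage(supply))         # rail voltage seen by the processors
```

In this model the faulty module simply drops out of the set of conducting modules while the rail voltage is maintained by the remaining ones, which is the effect the diode circuitry is described as achieving.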