The present invention is generally directed to providing reliable cooling systems for mainframe computer systems or for any electronic system requiring cooling. More particularly, the present invention is directed to a redundant refrigeration system employing a single cold plate which preserves flow isolation between the fluids in the redundant systems. In another aspect of the present invention, there is provided a combination of air and redundant refrigeration cooling for an electronic device such as a mainframe or server processing unit disposed within a cabinet possibly along with other less thermally critical components. In yet another aspect of the present invention, there is provided a modular refrigeration unit capable of operating continuously in a variety of ambient conditions and under a variety of thermal loads.
In recent years, the semiconductor industry has taken advantage of the fact that CMOS circuits dissipate less power than bipolar circuits. This has permitted more dense packaging and correspondingly faster CMOS circuits. However, almost no matter how fast one wishes to run a given electronic circuit chip, there is always the possibility of running it faster if the chip is cooled and thermal energy is removed from it during its operation. This is particularly true of computer processor circuit chips and even more particularly true of these chips when disposed within multi-chip modules which generate significant amounts of heat. Because there is a great demand to run these processor modules at higher speeds, the corresponding clock frequencies at which these devices must operate become higher. In this regard, it should be noted that it is known that power generation rises in proportion to the clock frequency. Accordingly, it is seen that the desire for faster computers generates not only demand for computer systems but also thermal demands in terms of energy which must be removed for faster, safer and more reliable circuit operation. In the long run, thermal energy is the single biggest impediment to semiconductor operation integrity.
In addition to the demand for higher and higher processor speeds, there is also a concomitant demand for reliable computer systems. This means that users are increasingly unwilling to accept down time as a fact of life. This is particularly true in the mainframe and server realms. Reliability in air-cooled systems is relatively easily provided by employing multiple air-moving devices (fans, blowers, etc.). Other arrangements which incorporate a degree of redundancy employ multiple air-moving devices whose speeds can be ramped up in terms of their air delivery capacity if it is detected that there is a failure or need within the system to do so. However, desired chip-operating power levels are nonetheless now approaching the point where air cooling is not the ideal solution for all parts of the system in all circumstances. While it is possible to operate fans and blowers at higher speeds, this is not always desirable for acoustic reasons. Accordingly, the use of direct cooling through the utilization of a refrigerant and a refrigeration system becomes more desirable, especially if faster chip speeds are the goal.
However, it is difficult to build redundancy into systems employing refrigerants. Such redundant systems naturally require the utilization of at least two separate refrigeration systems. This means that at least two motor-driven compressors are required. However, it is well recognized that the compressor, representing a major moving part apparatus, is also one which is prone to mechanical failure. The desire for zero down time and minimum maintenance requirements also makes the utilization of multiple compressors difficult. These compressors should be designed, controlled and set up so that various failure modalities neither bring the entire computer system down nor risk damage to the components within the system. Furthermore, one should also be concerned about refrigerant leaks. Accordingly, the refrigerant systems for redundant cooling must be designed so that the refrigerant loops are not in flow communication with one another, so that a leak in one loop would not bring down the whole system. However, there are great practical difficulties in doing this, since it requires two separate loops which are maintained in flow-wise isolation from one another and yet, at the same time, requires that these refrigerant loops be in very close thermal proximity with one another at the point within a cold plate which is attached (or otherwise thermally coupled) to the electronic circuit module or system to be cooled.
While certain electronic components or modules produce relatively large amounts of thermal energy, it is often the case that these modules are employed in conjunction with other electronic circuit components which also require some degree of cooling but do not operate at temperatures so high as to require direct cooling via a cold plate and/or refrigerant system. If modules of varying thermal energy output are employed in the same system, it is therefore desirable that the lower thermal output modules be cooled in a manner which is compatible with the cooling systems employed for the higher thermal output modules. To the extent that a degree of cooperation between these systems can be provided, the net result is a system which is even more reliable and dependable. Nonetheless, these dual cooling modalities must be accommodated within a single cabinet or frame.
Another very desirable feature of any system which is employed to cool electronic devices and systems, particularly computer systems, is that a separate chilled water source not be necessary. While there are some situations in which the inconvenience of chilled water plumbing is offset by the need or desire for extremely rapid computation and computer throughput, less stringent requirements for computational speed are preferably addressed through the utilization of machines which are air cooled. This cooling methodology is desirable in that it permits the utilization of stand-alone units. These self-contained units are, everything else being equal, a generally preferred solution for providing data processing servers.
There are yet other requirements that must be met when designing cooling units for computer systems, especially those which operate continuously and which may in fact be present in a variety of different thermal environments. Since computer systems run continuously, so must their cooling systems, unlike a normal household or similar refrigerator, which is operated under a so-called bang-bang control philosophy in which the unit is alternately either totally on or totally off. Furthermore, since large computer systems experience variations in user load and demand over the course of time, say hours, the amount of heat which must be removed also varies over time. Therefore, a cooling unit or cooling module for a computer system must be able not only to operate continuously but also to adjust its cooling capability in response to varying thermal loads. Furthermore, since it is intended that these modular cooling units be used in groups of two or more to assure redundancy, and since not all of these units are always intended to be operating at the same time, there will be times when the thermal load is very small. Therefore, problems associated with low speed motor/compressor operation must be addressed, along with problems associated with starting and/or stopping the cooling unit when, for example, normal scheduled switching between modular refrigeration units occurs.
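The contrast drawn above between bang-bang control and continuous, load-following operation can be illustrated with a minimal sketch. It is not the control scheme of the invention: the function names, the 1 degree hysteresis band, the 1000 W rated load and the 30% minimum speed floor are all hypothetical values chosen only for illustration.

```python
def bang_bang(temp_c, setpoint_c, running, hysteresis_c=1.0):
    """Household-refrigerator style control: the compressor is either
    fully on (True) or fully off (False). All values are hypothetical."""
    if temp_c > setpoint_c + hysteresis_c:
        return True            # too warm: switch fully on
    if temp_c < setpoint_c - hysteresis_c:
        return False           # cold enough: switch fully off
    return running             # inside the band: keep the current state


def variable_speed(heat_load_w, max_load_w=1000.0, min_speed=0.3, max_speed=1.0):
    """Continuous operation: the compressor never stops, and its speed
    (as a fraction of full speed) follows the thermal load. The floor
    min_speed stands in for the low-speed operating limit that a real
    motor/compressor would impose; it is an assumed figure."""
    fraction = heat_load_w / max_load_w
    return min(max_speed, max(min_speed, fraction))


if __name__ == "__main__":
    # Bang-bang output is binary; variable-speed output tracks the load.
    print(bang_bang(27.0, 25.0, running=False))
    for load_w in (50.0, 600.0, 2000.0):
        print(load_w, variable_speed(load_w))
```

In this sketch a very small thermal load is clamped to the minimum speed rather than stopping the unit, which is one simple way to picture why low speed operation becomes a design concern when a redundant unit carries little load.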