Modern data centers house thousands of servers, and each server typically includes two or more heat-generating microprocessors. Each microprocessor can easily produce more than 40 thermal watts per square centimeter, and future microprocessors are expected to produce even higher heat fluxes as semiconductor technology continues to progress. It follows that the total amount of heat generated by all servers in a data center is substantial. Unfortunately, removing heat from the data center using conventional systems is costly and inefficient. For example, removing heat by air conditioning requires significant capital expenditures on large air conditioning units as well as significant ongoing operating expenditures to power the air conditioning units. The units suffer from poor thermodynamic efficiency, which translates to high utility bills for data center operators. To reduce the cost of operating data centers, and thereby reduce the cost of cloud storage services that rely on data centers, there is a strong need to cool servers more efficiently.
According to the U.S. Department of Energy, nearly three percent of all electricity used in the United States is devoted to powering data centers and computer facilities. Approximately half of this electricity goes toward power conditioning and cooling. Increasing the efficiency of cooling systems for data centers and computer facilities would lead to dramatic savings in energy nationwide. More efficient cooling systems are also needed in transportation systems due to increasing adoption of hybrid and electric vehicles that rely on complex electrical components, including batteries, inverters, and electric motors, which produce significant amounts of heat that must be effectively dissipated. Cooling systems capable of more efficiently cooling these electrical components would translate to increased range and utility for these vehicles.
Presently, the majority of computers (e.g. servers and personal computers) in residential and commercial settings are cooled using forced air cooling systems in which room air is forced, by one or more fans, over finned heat sinks mounted on microprocessors, power supplies, or other electronic devices. The heat sinks add mass and cost to the computers and place mechanical stress on the electronic components to which they are mounted. If a computer is subject to vibration, such as vibration caused by a fan mounted in the computer, a heat sink mounted on top of a microprocessor can oscillate in response to the vibration and can fatigue the electrical connections that attach the microprocessor to the motherboard of the computer.
Another downside of air cooling systems is that cooling fans commonly operate at high speeds and can be quite noisy. When many computers are collocated, such as in a data center, the collective noise produced by the computer fans can require service personnel to wear hearing protection. As air passes over electronic devices in the computers, the air, which is at a lower temperature than the surfaces of the electronic devices, absorbs heat from the electronic devices, thereby cooling the devices. These air cooling systems are inherently limited in terms of performance and efficiency due to the low specific heat of air, which is much lower than the specific heat of water and other coolants. For example, dry air at 20° C. and 1 bar has a specific heat of about 1,007 J/(kg-K), whereas water at 20° C. has a specific heat of about 4,181 J/(kg-K). Due to air's low specific heat and low density, high flow rates are required to ensure adequate cooling of even relatively small heat loads.
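The combined effect of low specific heat and low density can be illustrated by comparing volumetric heat capacities (density multiplied by specific heat). The following is a minimal sketch only; the specific heats are the values quoted above, while the densities (about 1.2 kg/m^3 for air and about 998 kg/m^3 for water near 20° C.) and all names in the code are assumptions introduced for illustration.

```python
# Illustrative comparison of how much heat a cubic meter of air or water
# absorbs per kelvin of temperature rise.

CP_AIR = 1007.0     # J/(kg-K), value quoted in the text
CP_WATER = 4181.0   # J/(kg-K), value quoted in the text
RHO_AIR = 1.2       # kg/m^3, assumed typical density of room air
RHO_WATER = 998.0   # kg/m^3, assumed density of water near 20 C


def volumetric_heat_capacity(density, specific_heat):
    """Return heat absorbed per cubic meter per kelvin, in J/(m^3-K)."""
    return density * specific_heat


air_capacity = volumetric_heat_capacity(RHO_AIR, CP_AIR)
water_capacity = volumetric_heat_capacity(RHO_WATER, CP_WATER)
capacity_ratio = water_capacity / air_capacity

print(f"air:   {air_capacity:,.0f} J/(m^3-K)")
print(f"water: {water_capacity:,.0f} J/(m^3-K)")
print(f"water absorbs roughly {capacity_ratio:,.0f} times more heat per unit volume")
```

Under these assumed densities, a given volume of water absorbs on the order of a few thousand times more heat per degree than the same volume of air, which is why air cooling depends on such high volumetric flow rates.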
Electronic components within a typical server chassis can produce a thermal load of about 500 watts. The amount of airflow required to cool the components can be calculated with the following equation:
$$\dot{\mathrm{flow}}_{\mathrm{air}} = \frac{Q}{c_p \times r \times \Delta T}$$

where $\dot{\mathrm{flow}}_{\mathrm{air}}$ is air flow rate, $Q$ is heat transferred, $c_p$ is the specific heat of air, $r$ is density of the air, and $\Delta T$ is the change in temperature between the air entering the server chassis and air exiting the server chassis. Where the thermal load of the server is 500 W and the maximum allowable ΔT is about 30 degrees, the server chassis will require about 53 cubic feet per minute (cfm) of air flow. For an installation of 20 servers, which is common in computer rooms of small businesses and academic institutions, over 1,000 cfm of air flow is required to cool the servers. Achieving adequate cooling capacity in this scenario requires two air conditioning units sized for a typical U.S. home and an appropriately sized air handler and ducting to deliver cool air to the room. Modern data centers, which can have tens of thousands of servers, must be equipped with many computer room air conditioning (“CRAC”) units each designed to cool and circulate large amounts of air. The CRAC units are large and expensive and must be professionally installed and often require substantial modifications to the facility, including installation of structural supports, custom air ducting, and electrical wiring. After installation, CRAC units require frequent preventative maintenance in an attempt to avoid unplanned downtime. Simply delivering large amounts of cool air to the data center will not ensure adequate cooling of the servers. Special care must be taken to deliver cool air to the servers without the cool air first mixing with warm air exhausting from the servers. This can require installation of special airflow management products, such as raised floors, air curtains, and specially designed server enclosures, to assist with air containment.
These products can significantly increase the build-out cost of a data center per square foot and, inevitably, do not succeed at isolating cold air from warm air. Therefore, to ensure that sensitive components within the servers do not overheat, most data centers are forced to increase flow rates of cool air well above theoretical values as well as decrease the set point temperature of the room. The result is greater power consumption by the CRAC units and higher cooling costs for the data center.
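The airflow estimate given earlier for a 500 W server chassis can be checked with a short calculation. This is a sketch under stated assumptions rather than a definitive method: the air density of about 1.2 kg/m^3 is not given in the text, and the "30 degrees" temperature rise is read here as 30 degrees Fahrenheit (about 16.7 K), because that reading reproduces the quoted figure of about 53 cfm. The function and variable names are hypothetical.

```python
# Sketch of the airflow estimate: flow = Q / (c_p * r * delta_T).
# Assumptions not stated in the text: air density of 1.2 kg/m^3 and a
# 30 degree rise interpreted as 30 degrees Fahrenheit.

M3_PER_S_TO_CFM = 2118.88  # 1 m^3/s expressed in cubic feet per minute


def required_airflow_cfm(heat_load_w, delta_t_k, cp_air=1007.0, density_air=1.2):
    """Return the volumetric air flow (cfm) needed to absorb heat_load_w."""
    flow_m3_per_s = heat_load_w / (cp_air * density_air * delta_t_k)
    return flow_m3_per_s * M3_PER_S_TO_CFM


delta_t_k = 30.0 * 5.0 / 9.0  # a 30 degree F rise expressed in kelvin
per_server_cfm = required_airflow_cfm(500.0, delta_t_k)
twenty_servers_cfm = 20 * per_server_cfm

print(f"per server: {per_server_cfm:.1f} cfm")
print(f"20 servers: {twenty_servers_cfm:.0f} cfm")
```

With these inputs the per-server result rounds to about 53 cfm and the 20-server total exceeds 1,000 cfm, consistent with the figures stated in the text. If the temperature rise were instead taken as 30 K, the required flow would be lower by a factor of 1.8.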
Many electronic devices operate less efficiently as their temperature increases. As one example, a typical microprocessor operates less efficiently as its junction temperature increases. FIG. 64 shows a plot of power consumption in watts versus junction temperature. The bottom curve shows static power consumption of a microprocessor and the top curves show total power consumption for switching speeds of 1.6 GHz and 2.4 GHz, respectively. Total power consumption includes both static power consumption and dynamic power consumption, which varies with switching frequency. As shown in FIG. 64, as the temperature of the microprocessor increases, it consumes more power to provide the same performance. In air cooling systems, it is common for fully utilized microprocessors to operate at or near their maximum rated temperature, resulting in poor operating efficiency. In the example shown in FIG. 64, the microprocessor uses over 35% more power when operating at 95 degrees C. than when operating at 45 degrees C. To conserve energy, it is therefore desirable to provide a cooling system that will allow the microprocessor to operate consistently at lower temperatures. Providing a consistently lower operating temperature for the microprocessor can also extend its useful life and can avoid unnecessary throttling or downtime of the computer due to an unsafe junction temperature.
Operating speeds of next generation microprocessors will continue to increase, as will heat fluxes (defined as heat load per unit area) produced by those next generation microprocessors. Conventional air cooling systems will soon be incapable of efficiently and effectively cooling these next generation microprocessors. To effectively cool next generation microprocessors, it is desirable to provide a cooling system that is significantly more effective and efficient than existing air cooling systems and is capable of managing high heat fluxes that will be produced by next generation microprocessors.
Pumped liquid cooling systems can provide improved thermal performance over conventional air cooling systems. Pumped liquid cooling systems typically include the following items connected by tubing: a heat sink attached to the microprocessor, a liquid-to-air heat exchanger, and a pump. A liquid coolant is circulated through the system by the pump. As the liquid coolant passes through channels in the heat sink, heat from the microprocessor is transferred through the thermally conductive heat sink to the coolant, thereby increasing the temperature of the coolant and transferring heat away from the microprocessor. The heat sink is typically designed to maximize heat transfer by maximizing the surface area of the channels through which the liquid passes. For example, the heat sink can be a micro-channel heat sink that utilizes fine fin channels through which the liquid coolant flows. The heated liquid coolant exiting the heat sink is then circulated through a liquid-to-air heat exchanger to reduce the temperature of the liquid coolant before it is circulated back to the pump for another cycle.
Use of closed liquid cooling systems is beginning to migrate from high performance computers to personal computers. Unfortunately, existing liquid cooling systems have performance constraints that will prevent them from effectively cooling next generation microprocessors. This is because liquid cooling systems rely solely on transferring sensible heat by increasing the temperature of a liquid coolant as it passes through a heat sink. The amount of heat that can be transferred is a function of, among other factors, the thermal conductivity of the fluid and the flow rate of the fluid. Dielectric fluids do not have sufficient thermal conductivities to be used in liquid cooling systems. Instead, water or a water-glycol mixture is commonly used due to its significantly higher thermal conductivity. Unfortunately, if a leak develops in a liquid cooling system that uses water, the water will destroy the server and potentially an entire rack of servers. With the price of a single server being thousands of dollars, many data center operators are simply unwilling to accept the risk of loss that water-based liquid cooling systems present.
While more effective than air cooling, transferring heat by sensible heating requires significant flow rates of liquid coolant, and achieving high flow rates often necessitates high fluid pressures. Consequently, a liquid cooling system designed to cool a modern microprocessor can require a large pump, or a series of small pumps positioned throughout the liquid cooling system, to ensure an adequate liquid coolant pressure and flow rate. Operating large pumps, or a series of small pumps, uses a significant amount of energy and diminishes the efficiency of the liquid cooling system. Moreover, using a series of small pumps increases the probability of the liquid cooling system experiencing a mechanical failure, which translates to unwanted facility downtime.
Although liquid cooling systems have proven adequate at cooling modern microprocessors, they will be unable to adequately cool next generation microprocessors while maintaining practical physical dimensions and specifications. For instance, to cool a next generation microprocessor, liquid cooling systems will require very high flow rates (e.g. of water), which will require large, heavy duty cooling lines (e.g. greater than ¾″ outer diameter), such as rigid copper tubing or reinforced rubber cooling lines, that will be difficult to route in any practical manner into and out of a server housing. If installed in a server, these large plumbing lines will block access to electrical components within the server, thereby frustrating maintenance of the server. These large plumbing lines will also prevent drawers on a server rack from opening and closing as intended, thereby preventing the server from being easily accessed and further frustrating maintenance of the server. As mentioned above, water poses a catastrophic risk to servers, and increasing the pressure and flow rates of water into and out of servers only increases this risk. Consequently, increasing the capabilities of existing liquid cooling systems to meet the cooling requirements of next generation microprocessors is simply not a practical or viable option. Without further innovation in the area of cooling systems, the implementation of next-generation microprocessors will be hampered.
As noted above, liquid cooling systems commonly rely on flowing liquid water through channels in finned heat sinks. The heat sinks are often indirectly coupled to a heat source via a metal base plate that is mounted on the heat source using thermal paste, such as solder thermal interface material (STIM) or polymer thermal interface material (PTIM), and/or a direct bond adhesive. While this approach can be more effective than air cooling, the intervening materials between the water and the heat source induce significant thermal resistance, which reduces the overall efficiency of the cooling system. The intervening materials also add cost and time to manufacturing and installation processes, constitute additional points of failure, and create potential disposal issues. Finally, the intervening materials render the system unable to adapt to local hot spots on a heat source. Consequently, the liquid cooling system must be designed to accommodate the maximum anticipated heat load of one or more localized hot spots on the surface of the heat source (e.g. to adequately cool one hot core of a multicore processor), resulting in additional cost and complexity of the entire liquid cooling system.
Unlike water, dielectric coolants can be placed in direct contact with electronic devices and not harm them. Unfortunately, some dielectric coolants have a lower specific heat than water, so they are not well suited for use in single-phase pumped liquid cooling systems. For instance, some dielectric coolants, such as certain hydrofluoroethers, have a specific heat of about 1,300 J/(kg-K), whereas water has a specific heat of about 4,181 J/(kg-K). This means that cooling a microprocessor by sensibly warming a flow of dielectric coolant will require a mass flow rate more than three times higher (4,181/1,300 ≈ 3.2) than a flow rate of water used to cool a similar microprocessor by sensibly warming the flow of water. This higher flow rate requires more pump power, which translates to lower cooling system efficiency.
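The flow-rate penalty follows from the sensible heat balance, in which heat removed equals mass flow rate multiplied by specific heat and by coolant temperature rise. The following minimal sketch uses the two specific heats quoted above; the 150 W heat load and 10 K temperature rise are hypothetical values chosen only for illustration, as are the function and variable names.

```python
# Sketch of the sensible-heat balance Q = m_dot * c_p * delta_T, solved for
# the mass flow rate m_dot required of each coolant.

CP_WATER = 4181.0       # J/(kg-K), value quoted in the text
CP_DIELECTRIC = 1300.0  # J/(kg-K), value quoted in the text for certain hydrofluoroethers

HEAT_LOAD_W = 150.0     # hypothetical microprocessor heat load
DELTA_T_K = 10.0        # hypothetical coolant temperature rise


def mass_flow_kg_per_s(heat_load_w, specific_heat, delta_t_k):
    """Return the coolant mass flow rate needed to carry away heat_load_w."""
    return heat_load_w / (specific_heat * delta_t_k)


water_flow = mass_flow_kg_per_s(HEAT_LOAD_W, CP_WATER, DELTA_T_K)
dielectric_flow = mass_flow_kg_per_s(HEAT_LOAD_W, CP_DIELECTRIC, DELTA_T_K)
flow_ratio = dielectric_flow / water_flow

print(f"water:      {water_flow * 1000:.2f} g/s")
print(f"dielectric: {dielectric_flow * 1000:.2f} g/s")
print(f"ratio:      {flow_ratio:.2f}")
```

The ratio depends only on the two specific heats, so it comes out near 3.2 on a mass basis regardless of the assumed heat load or temperature rise. A comparison by volume would also depend on the coolant densities, which the text does not state.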
As an alternative to pumped liquid systems, dielectric coolants can be used in immersion cooling systems. Immersion cooling is an aggressive form of liquid cooling where an entire electronic device (e.g. a server) is submerged in a vat of dielectric coolant (e.g. HFE-7000 or mineral oil). Unfortunately, immersion cooling vats are large, costly, and heavy, especially when filled with dielectric coolant, which can have a density significantly higher than that of water. Existing vats hold upwards of 250 gallons of coolant and can weigh more than 8,000 pounds when filled with coolant. Typically, a room must be specially engineered to accommodate the immersion cooling vat, and containment systems need to be specially designed and installed in the room as a precaution against vat failure. When using 250 gallons of coolant, the cost of the coolant becomes a significant capital expenditure. Certain coolants, such as mineral oil, can act as solvents and, over time, can remove certain identifying information from motherboards and from other server components. For example, product labels (e.g. stickers containing serial numbers and bar codes) and other markings (e.g. screen printed values and model numbers on capacitors and other devices) are prone to dissolve and wash off due to a continuous flow of the coolant over all surfaces of the server. As the labels and dyes wash off the servers, the coolant in the vat can become contaminated and may need to be replaced, resulting in additional expense and downtime. Another downside of immersion cooling is that servers cannot be serviced immediately after being withdrawn from the vat. Typically, the server must be removed from the vat and permitted to drip dry for a period of time (e.g. 24 hours) before a professional can service the server.
During this drying period, the server is exposed to contaminants in the air, and the presence of mineral oil on the server may attract and trap contaminants on sensitive circuitry of the server, which is not desirable.
Another cooling approach, known as spray cooling or spray evaporative cooling, relies on atomized sprays. In this approach, atomized liquid coolant is sprayed directly on a surface through air or vapor. As a result, small droplets impinge on the heated surface forming a thin film of liquid directly on a heated surface. Heat is then transferred from the heated surface to the liquid either by sensible heating of the bulk liquid or by boiling off of a fraction of the liquid through latent heating. This is a very efficient method of removing high heat fluxes from small surfaces. Unfortunately, the margin for error in spray cooling systems is very narrow and the onset of dry out and critical heat flux is a constant concern that can have catastrophic consequences. Critical heat flux is a condition where evaporation of coolant from the surface to be cooled prevents atomized liquid from reaching and cooling the surface, often resulting in run-away device temperatures and rapid failure. Great care must be taken to ensure uniform coverage of the spray on the heated surface and adequate drainage of fluid from the heated surface. Although achievable in static laboratory settings, mainstream adoption of spray cooling has been hampered by several factors. First, spray cooling requires a significant working volume to enable atomized sprays to form, which results in non-compact cooling components, making it impractical for packaging in most consumer products. Second, atomizing the liquid requires a significant amount of pressure upstream of the atomizer to generate an appropriate pressure drop at the atomizer-air interface to enable atomized sprays to form. Maintaining this amount of pressure within the system consumes a significant amount of energy. Third, high flow rates of atomized sprays are required to prevent dry out or critical heat flux from occurring. 
In the end, it has proven difficult to design a practical and compact spray cooling system, despite a large amount of time and effort that has been expended to do so.
In view of the foregoing discussion, efficient, scalable, high-performing methods and apparatuses are needed for cooling devices, such as microprocessors and power electronics, that produce high heat fluxes.