Heat removal is a prominent factor in computer system and data center design. The number of IT components, such as servers, deployed within a data center has steadily increased as server performance has improved, thereby increasing the amount of heat generated during ordinary operation of the servers. The reliability of servers used within a data center decreases if the temperature of the environment in which they operate is permitted to rise over time. A significant portion of the data center's power is used for thermal management of electronics at the server level.
Recent trends in computing point toward higher power density. As the number of servers within a data center increases, a commensurately greater portion of the data center's power is consumed in removing heat from electronic components within the servers. Liquid heat removal offers a solution for higher-power computing racks because of the relatively higher heat capacity of liquids and the greater energy efficiency possible with liquid heat removal.
Direct-to-chip liquid cooling provides a cooling solution for power-intensive processors such as central processing unit (CPU) chips and graphics processing unit (GPU) cards. It offers better heat exchange efficiency and better chip cooling performance. It also makes it possible to increase server and rack power density while keeping a lower power usage effectiveness (PUE) than a conventional air cooling solution.
A liquid cooling solution includes a primary loop and a secondary loop. The primary loop provides chilled liquid to a coolant distribution unit (CDU) and returns the hot liquid. The secondary loop circulates isolated liquid to cool the processor chips. The CDU exchanges heat between the two loops via a heat exchanger and drives the secondary loop's liquid flow with its internal pump.
Currently, a CDU is designed either with a fixed pump speed that provides a constant flow rate or with a pump speed controlled by the liquid temperature difference on the secondary loop. A constant-speed CDU pump lacks the flexibility to provide variable liquid flow and hence wastes pumping energy. A CDU with temperature-difference feedback control saves some pumping energy, but because of the slow dynamic response of the liquid system and the potentially non-uniform distribution of liquid flow to each GPU cold plate, this feedback logic is insufficient for dealing with an extreme condition in which one or more GPU cards are running at an extreme workload while others in the same rack are running at a relatively low workload.
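The limitation of loop-level temperature-difference feedback can be illustrated with a steady-state energy balance, Q = m_dot * cp * dT, applied to each cold plate. The following is a minimal sketch, not a model of any particular CDU: the function names, the water-like coolant properties, the equal flow split, and the power figures (one GPU at 400 W, seven at 50 W) are all assumptions chosen for illustration. It covers only the averaging effect at the mixed return, not the slow dynamic response of the liquid system.

```python
# Hypothetical illustration: steady-state energy balance per cold plate.
WATER_CP = 4186.0  # J/(kg*K), assumed water-like coolant


def delta_t(power_w, mass_flow_kg_s, cp=WATER_CP):
    """Coolant temperature rise in K, from Q = m_dot * cp * dT."""
    return power_w / (mass_flow_kg_s * cp)


def loop_view(powers_w, branch_flow_kg_s):
    """Return (per-branch rises, mixed rise at the loop return).

    Assumes every cold plate receives the same flow, so the mixed
    return rise is the total heat over the total flow.
    """
    per_branch = [delta_t(p, branch_flow_kg_s) for p in powers_w]
    total_flow = branch_flow_kg_s * len(powers_w)
    mixed = delta_t(sum(powers_w), total_flow)
    return per_branch, mixed


# Assumed scenario: one GPU under extreme load, seven lightly loaded.
powers = [400.0] + [50.0] * 7
flow = 1.0 / 60.0  # kg/s per branch, roughly 1 L/min of water

per_branch, mixed = loop_view(powers, flow)
print(f"hottest branch rise: {max(per_branch):.2f} K")
print(f"mixed return rise:   {mixed:.2f} K")
```

Under these assumed numbers the hottest branch sees a rise of about 5.7 K, while the mixed return, which is what a loop-level temperature-difference controller observes, rises only about 1.3 K. A controller acting on the mixed value alone would therefore have little reason to raise the pump speed for the heavily loaded card.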