Continuous improvements in the technology used to manufacture integrated circuits have led to the development of silicon chips with an increasingly large number of discrete components which operate at faster and faster speeds. As the number of components and the operating speed increase, the power needed to operate such chips also increases. To prevent chip overheating and a consequent change in the operating characteristics of the discrete components, the heat generated by the chip's power consumption must be promptly dissipated into the environment. In addition, the chip must be interconnected electrically to other components to provide its designed function. Both the electrical interconnection and thermal dissipation functions are provided by mounting the given chip in an assembly which may include a heat sink. The assembly must be selected carefully because its thermal characteristics directly determine the system reliability.
Depending on the power dissipation level required by the chip, several cooling techniques may be used to aid the heat dissipation process. When the total power consumption is low, for example, below about 3 watts per chip, a simple flat plate heat sink with low velocity air convection or even natural convection is sufficient to cool the chip below its maximum allowed operating temperature, typically known as the maximum junction temperature. As the power dissipation increases, the complexity of the cooling techniques used also increases. Very high power dissipation, for example, above about 50 watts per chip, requires the use of advanced cooling technologies, such as heat sinks with integrated heat pipes and/or the use of solder to thermally couple the chip to the heat sink. Between these two extremes, the use of conventional finned heat sinks and an adequate level of forced air convection typically provides the desired cooling capability.
The operating thermal characteristic of a chip using a conventional cooling technology is relatively simple; however, it is difficult to determine a given chip's operating temperature. The chip operating temperature relative to the cooling fluid temperature is proportional to the chip power dissipation. The proportionality constant is a function of the assembly thermal characteristics and represents the assembly thermal resistance, typically measured in °C/Watt. Chip assemblies using conventional heat sinks have constant thermal resistance mainly because for most applications it is not practical to change the cooling fluid velocity. Here, the global fluid velocity must be kept constant because the same fluid flow stream is also used to cool other devices that are not part of the given assembly. Thus, with the use of conventional heat sinks, it is difficult to estimate the final operating temperature of a chip because the fluid velocity and temperature around the assembly are not well defined until the thermal characteristics of all devices located upstream and downstream are well defined and the flow distribution in the system is well established.
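The linear relationship described above can be illustrated with a minimal Python sketch. The function name and the numeric values are hypothetical and chosen only for illustration; the sketch assumes a single constant thermal resistance, as is the case for the conventional heat sinks discussed here.

```python
def chip_temperature(fluid_temp_c: float, power_w: float,
                     r_theta_c_per_w: float) -> float:
    """Steady-state chip temperature for a constant thermal resistance.

    The temperature rise above the cooling fluid is proportional to the
    power dissipation; r_theta_c_per_w is the proportionality constant
    (the assembly thermal resistance, in degrees C per watt).
    """
    return fluid_temp_c + r_theta_c_per_w * power_w


# Hypothetical values: 25 C cooling air, 2 C/W assembly resistance.
full_power_temp = chip_temperature(25.0, 10.0, 2.0)  # 10 W dissipated
low_power_temp = chip_temperature(25.0, 1.0, 2.0)    # 1 W dissipated
print(full_power_temp, low_power_temp)
```

In practice the fluid temperature passed to such a function is itself uncertain, which is the difficulty the preceding paragraph describes.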
When the system includes a single device sensitive to its operating temperature, active control is possible. An example of this particular situation is described in U.S. Pat. No. 5,491,610 (Mok et al.), assigned to International Business Machines Corporation, Armonk, USA, and the disclosure of which is incorporated herein by reference, where active means are provided to control the temperature gradient in a particular location of the assembly using temperature sensing and controller circuits.
An assembly thermal resistance typically includes the chip thermal resistance, the heat sink thermal resistance, and the heat-sink-to-fluid thermal resistance. In most air cooled applications, the last thermal resistance is much larger than the other two resistances because of the poor thermal conductivity of air relative to the thermal conductivity of the other components. In this case, the assembly thermal resistance becomes a direct function of the inverse of the product of the heat sink area and the convective heat transfer coefficient. In applications which use liquids to cool the electronic assemblies, the thermal resistance in the fluid phase is much lower than the fluid thermal resistance of assemblies using air cooling. Here, the operating temperatures of the devices become very sensitive to the fluid velocity and temperature distribution and to the local power distribution in the system. In both cases, chip performance is not well known until the whole system is assembled and tested, and it is again difficult to know in advance the final operating temperature of the chips and other electrical devices in a system, even in liquid cooled applications.
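The decomposition described above can be sketched as a sum of three resistances in series, with the heat-sink-to-fluid term taken as the inverse of the product of the heat transfer coefficient and the heat sink area. This is a simplified model, and the function names and all numeric values below are hypothetical illustrations rather than data from any particular assembly.

```python
def convective_resistance(h_w_per_m2_c: float, area_m2: float) -> float:
    """Heat-sink-to-fluid resistance in C/W, modeled as 1 / (h * A)."""
    return 1.0 / (h_w_per_m2_c * area_m2)


def assembly_resistance(r_chip: float, r_sink: float,
                        h_w_per_m2_c: float, area_m2: float) -> float:
    """Total assembly resistance as three resistances in series."""
    return r_chip + r_sink + convective_resistance(h_w_per_m2_c, area_m2)


# Hypothetical air cooled case: h = 50 W/(m^2 C), A = 0.01 m^2.
r_conv = convective_resistance(50.0, 0.01)
r_total = assembly_resistance(0.1, 0.2, 50.0, 0.01)
print(r_conv, r_total)  # the convective term dominates the total
```

With these assumed values the convective term accounts for most of the total, which matches the air cooled situation described in the text. Doubling either the area or the heat transfer coefficient halves that term.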
A similar but more complex thermal problem arises when conventional cooling devices are used to cool large chips with significant differences in power dissipation density over the chip area. For example, advanced devices which include both system logic and large cache memory on one chip belong to this group. Here, a large chip is needed to provide sufficient area for both electrical functions; however, the power dissipation density of the small part of the chip used to build the system logic is easily an order of magnitude larger than the power dissipation density of the rest of the chip, which provides the memory function. To equalize the power dissipation into the heat sink, this type of chip requires the use of a large heat spreader, which adds significant weight to the assembly and introduces significant challenges to the system reliability and to the thermal design engineering required to build the assembly.
The use of a conventional heat sink to cool a chip also introduces an additional thermal problem when the chip of interest includes power management capability.
Here, the chip power consumption changes on demand: it increases to its maximum design level when the chip is fully functional and drops to a low level when the device is in stand-by mode. Power management is used mainly to conserve energy in portable electronic appliances of all kinds, in environmentally friendly systems, or in systems where the maximum power available is constrained. The power management capability introduces additional thermal cycles to the assembly, which can become a significant addition to the total number of thermal cycles experienced by the assembly when determining its reliability level. Stand-by power levels are typically about an order of magnitude lower than the fully operational power consumption level. Furthermore, the operating temperature of a chip relative to the fluid temperature varies linearly with the chip power usage when using conventional cooling methods. Thus, when on stand-by, such assemblies will power down and change the device operating temperature to a level closer to the cooling fluid temperature. The consequent change in temperature between full power and stand-by power is then in the same range as the change in temperature between full power and the powered-off state. Therefore, the total number of thermal cycles of the assembly must include both the traditional power-on/power-off cycles and the number of stand-by cycles induced by power management within each power-on/power-off cycle.
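A short Python sketch can make the cycle-counting argument above concrete. It assumes the linear temperature model of a conventional heat sink and treats every stand-by transition as one additional thermal cycle; the function names and all numbers are hypothetical.

```python
def temperature_rise(power_w: float, r_theta_c_per_w: float) -> float:
    """Chip temperature rise above the fluid for a constant resistance."""
    return r_theta_c_per_w * power_w


def total_thermal_cycles(on_off_cycles: int, standby_per_on_off: int) -> int:
    """Each power-on/power-off cycle contributes itself plus its stand-by cycles."""
    return on_off_cycles * (1 + standby_per_on_off)


# Hypothetical chip: 20 W at full power, 2 W on stand-by, 2 C/W assembly.
full_rise = temperature_rise(20.0, 2.0)
standby_rise = temperature_rise(2.0, 2.0)
swing_to_standby = full_rise - standby_rise  # full power -> stand-by
swing_to_off = full_rise                     # full power -> powered off
print(swing_to_standby, swing_to_off)

# Hypothetical usage: 1000 on/off cycles, 10 stand-by events in each.
print(total_thermal_cycles(1000, 10))
```

Under these assumptions the stand-by swing is 90% of the power-off swing, so the stand-by events cannot be neglected and the cycle count used for reliability estimates grows by roughly an order of magnitude.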
Thus, the consequent increase in the total number of thermal cycles could significantly shorten the product life due to the reliability limitations of the assembly interconnections. The selection of thermal cooling technology also determines the thermal transient response of a given chip and can be an important design factor when the devices must reach operating temperature in a short time. Thermal transients are significantly longer than electrical transients, for example, easily by six orders of magnitude or more; but the electrical characteristics of discrete components are very sensitive to operating temperature. Therefore, all electronic devices need to wait a relatively long warm-up time after power-on before being capable of operating in a reliable mode. The assembly thermal transient time increases with the assembly thermal mass. Since at any given time during the assembly warm-up the assembly heat-up rate is proportional to the difference between the chip power dissipation and the heat sink cooling capability, the thermal transient time increases when the heat sink used has a large cooling capability. Conventional heat sinks are typically selected to be capable of dissipating the maximum operating power of the given chip; thus, the assembly thermal transient time is inherently maximized.
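The warm-up behavior described above can be illustrated with a first-order lumped model, in which the assembly is treated as a single heat capacity C cooled through a constant thermal resistance R. This model is an assumption introduced here for illustration and is not taken from the text; the function name and all numeric values are hypothetical. Under this model the heat-up rate is (P - rise/R)/C, and the time to reach a target temperature rise has a closed form.

```python
import math


def warmup_time(power_w: float, r_theta_c_per_w: float,
                heat_capacity_j_per_c: float, target_rise_c: float) -> float:
    """Seconds needed to reach target_rise_c above the fluid temperature.

    Lumped model (an assumption): C * d(rise)/dt = P - rise / R, which
    gives rise(t) = P * R * (1 - exp(-t / (R * C))).
    """
    steady_rise = power_w * r_theta_c_per_w
    if target_rise_c >= steady_rise:
        raise ValueError("target rise is never reached at this power level")
    tau = r_theta_c_per_w * heat_capacity_j_per_c  # thermal time constant
    return -tau * math.log(1.0 - target_rise_c / steady_rise)


# Hypothetical assembly: 10 W chip, 50 J/C thermal mass, 15 C target rise.
t_strong_sink = warmup_time(10.0, 2.0, 50.0, 15.0)   # lower resistance
t_weak_sink = warmup_time(10.0, 4.0, 50.0, 15.0)     # higher resistance
t_heavy = warmup_time(10.0, 2.0, 100.0, 15.0)        # doubled thermal mass
print(t_strong_sink, t_weak_sink, t_heavy)
```

In this sketch the heat sink with the larger cooling capability (lower resistance) takes longer to reach the same target temperature, and doubling the thermal mass doubles the warm-up time, consistent with the two trends stated in the paragraph.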
While conventional heat sinks provide an effective method to cool heat dissipating assemblies, they also introduce many difficulties known in the art which can contribute to a short mean time between failures (MTBF) of critical components, or to unacceptably long warm-up times in time-critical devices. These and other difficulties can be caused by the thermal characteristics of conventional heat sinks. The thermal resistance of the heat sink of the prior art is considered relatively constant because no attempt has been made to change the velocity of the air circulating through the heat sink. In most systems, the air velocity is constant and not well known until the system is assembled and fully functional. Both the air velocity and the temperature distribution are fully dependent on all the system devices upstream and downstream from the assembly of interest; hence their determination from design data presents a very complex thermal design problem. In spite of this inherent complexity, the thermal response of such a conventional heat sink is relatively simple and is typically linear. Thus, the operating temperature of the device of the prior art relative to the fluid temperature will increase in proportion to the power dissipated by the given device.
It would be desirable then to have a heat sink which overcomes all the thermal problems stated above which are created by the use of conventional heat sinks with relatively constant thermal resistance.