The use of thermocouples permanently mounted on chip substrates to monitor temperatures during the chip-manufacturing process is well known. Also, permanently affixed thermocouples have been employed to measure the temperature of chips during their actual operation in a completed computer system.
Where permanently mounted thermocouples are undesirable or unnecessary, a crude form of manual thermocouple installation has previously been employed on the back of multichip modules used in earlier computers. Such a thermocouple was temporarily installed for conducting tests to measure module operating temperatures under various system conditions. The thermocouple was attached directly to the module substrate using epoxy. Each installation required careful, tedious manual labor and the use of a microscope. Extreme care was required to avoid damaging the module, both during installation and during removal after the test was completed. Many problems were encountered due to factors such as human error in properly mounting and locating the thermocouple. Also, the epoxy often adhered incompletely and failed to hold the thermocouple securely in place. Each different substrate location that was thermally tested often required a completely new assembly and always required a customized manual installation. In other words, once a test was completed and the thermocouple removed, the thermocouple could not be reused in a subsequent test without another tedious customized installation. In some instances the removal procedure would even cause irreparable damage to the thermocouple.
As more and more integrated circuits are packed together on single and multilayered modules and boards, the collective generation of heat becomes a serious problem. It is now common to provide either an air-cooled or liquid-cooled heat sink on top of a multichip module, and to provide multiple pins extending from the bottom for plugging into multilayered circuit boards. A typical multichip module is the water-cooled thermal conduction module (TCM) described in the article entitled "Thermal Conduction Module: A High-Performance Multilayer Ceramic Package", IBM J. Res. Develop., Vol. 26, No. 1, January 1982, pp. 30-36. A typical multilayered circuit board comprising a 20-layer composite is described in an article entitled "A New Set of Printed-Circuit Technologies for the IBM 3081 Processor Unit", IBM J. Res. Develop., Vol. 26, No. 1, January 1982, pp. 44. In a typical present computer system, a single TCM can contain as many as 1800 pins, and nine of the TCMs can be mounted on a single multilayered circuit board. It is not unusual for a single module to dissipate several hundred watts of power, and power consumption is expected to increase significantly for large individual modules in the future.
In earlier times, when a component overheated and was damaged, it was economical simply to replace it with another component. However, multichip modules and multilayered circuit boards have become too expensive for frequent replacement. This creates the need to provide more reliable modules and circuit boards, and such reliability can only be assured by keeping their operating temperatures below acceptable maximums. Therefore it becomes very important to conduct thermal tests of the prototypes during actual operation (i.e., when the module is plugged into the board) to determine their expected operating temperatures before embarking on production. Additionally, it becomes very important to test the first production units for excessive temperature during actual operation before shipping the computer system to customers. In that regard, the breakdown of pin lubricating oil at high temperatures is one of the dangers to be avoided by assuring that the modules do not overheat during operation. Such oil breakdown can permanently damage the modules and render them inoperable. Finally, it may become desirable to test units which are in the field in order to locate excessive heating problems. Any time a change is made to any part of a computer system, such change may have an adverse effect on some other part of the system. The ability to make spot checks of temperature during actual operation is becoming a necessity rather than merely a desirable option.
A technique called dynamic burn-in has also been developed to predict the operational life of computer components. In this technique, the computer system is deliberately stressed beyond its normal operating limits for the purpose of accelerating early-life defects. Dynamic burn-in will necessarily cause the circuits to heat beyond normal specifications, and it is very desirable to have a thermocouple unit monitor the temperature of critical locations on a module or board to be sure the temperature does not exceed acceptable tolerances during the testing procedure.
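By way of illustration only, the monitoring step described above might be sketched as a simple limit check over thermocouple readings taken at several locations. The function name, the location labels, and the temperature values below are all hypothetical and are not taken from the systems discussed; the sketch merely assumes that each monitored location yields one reading in degrees Celsius and that a single acceptable maximum applies.

```python
from typing import Dict, List, Tuple


def find_overlimit(readings_c: Dict[str, float],
                   limit_c: float) -> List[Tuple[str, float]]:
    """Return (location, temperature) pairs that exceed limit_c, hottest first.

    readings_c maps a hypothetical location label to a thermocouple
    reading in degrees Celsius; limit_c is an assumed acceptable maximum.
    """
    over = [(loc, temp) for loc, temp in readings_c.items() if temp > limit_c]
    # Sort by temperature, descending, so the worst location is listed first.
    return sorted(over, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Illustrative values only; real limits depend on the module under test.
    sample = {"site_A": 80.0, "site_B": 95.5, "site_C": 101.0}
    for location, temp in find_overlimit(sample, limit_c=90.0):
        print(f"{location}: {temp:.1f} C exceeds limit")
```

In such a sketch, an empty result would indicate that every monitored location remained within the assumed tolerance during the test.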