1. Technical Field
The present invention relates generally to an improved data processing system and in particular to a method and apparatus for processing data. Still more particularly, the present invention provides a method, apparatus, and computer instructions for handling alerts for power and thermal events.
2. Description of Related Art
Servers continue to get faster and to include more processors at a rapid pace. With these changes, the problem of heat dissipation grows, and it grows further as the density of servers increases. For example, the number of servers that may be located in a particular area increases when the servers are mounted on racks, rather than being placed on a table or on the floor. Consequently, many companies are using heat dissipation as a criterion for purchasing computers, since it is becoming more difficult to cool server farms with a large number of processors. Companies and consumers currently must decide whether to use high performance systems with expensive cooling systems or low powered processors with lower performance.
High performance computers in high densities, such as those placed into rack systems, may overheat and have system failures. More specifically, these failures may include system crashes, resulting in hardware damage. Overheating increases the risk of premature failure of processors, chips, and disk drives.
Currently, monitoring systems are used to monitor server computers. Many computers have integrated temperature monitoring. Some computers include a temperature monitoring utility that allows the temperature of a processor to be checked or monitored.
Currently, problems with heat dissipation may be reduced using a cooling system, such as a liquid cooling system available for rack mounted servers. This type of cooling system achieves efficient heat exchange through the mounting of a refrigeration cycle in the rack containing the servers, and connecting a cooling pipe from the servers to a cooling liquid circulation pipe mounted in the rack. Although cooling systems may aid in heat dissipation, these types of systems are expensive and are subject to failures.
In addition to problems with thermal dissipation, the high density of servers in server farms results in the consumption of large amounts of power. In some cases, the power consumed may overload power circuits for a server farm. In such a case, the power supply may fail, causing the entire server farm to shut down. One solution for managing power overloads is to shut down one or more servers on a rack in a server farm to reduce power consumption. Currently, the process for handling a thermal or power problem involves alerting the operating system and powering off the server system to prevent the power or thermal overload. Such a procedure is initiated when the power consumption increases beyond some acceptable threshold. This process also is initiated for thermal conditions when other methods, such as increasing air flow or the use of liquid cooling systems, have not provided the required relief for the overload condition.
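The conventional procedure described above may be illustrated by the following minimal Python sketch. It is an illustration under stated assumptions, not an implementation taken from any particular system: the threshold values, the Server record, and the function names are all hypothetical, and the choice to power off the highest power consumer first is an assumption, since the description does not say which servers are selected.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical limits; the description names no specific values.
POWER_LIMIT_WATTS = 5000.0
TEMP_LIMIT_CELSIUS = 85.0


@dataclass
class Server:
    """Hypothetical record for one rack-mounted server."""
    name: str
    power_watts: float
    temp_celsius: float
    powered_on: bool = True
    alerts: List[str] = field(default_factory=list)


def rack_power(servers: List[Server]) -> float:
    """Total power drawn by the servers that are still powered on."""
    return sum(s.power_watts for s in servers if s.powered_on)


def alert_and_power_off(server: Server, reason: str) -> None:
    """Alert the operating system (modeled as a log entry), then power off."""
    server.alerts.append(reason)
    server.powered_on = False


def handle_events(servers: List[Server], cooling_exhausted: bool) -> List[str]:
    """Return the names of servers shut down by the conventional procedure."""
    shut_down = []
    # Thermal case: shutdown is used only after other methods, such as
    # increased air flow or liquid cooling, have failed to give relief.
    if cooling_exhausted:
        for s in servers:
            if s.powered_on and s.temp_celsius > TEMP_LIMIT_CELSIUS:
                alert_and_power_off(s, "thermal")
                shut_down.append(s.name)
    # Power case: shut servers down until the rack is under the threshold.
    # Assumption: the largest consumers are powered off first.
    for s in sorted(servers, key=lambda s: s.power_watts, reverse=True):
        if rack_power(servers) <= POWER_LIMIT_WATTS:
            break
        if s.powered_on:
            alert_and_power_off(s, "power")
            shut_down.append(s.name)
    return shut_down
```

In this sketch every overload is resolved by removing a server from service, which is the drawback of the conventional procedure: relief from the power or thermal condition is obtained only at the cost of the work the server was performing.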
Therefore, it would be advantageous to have an improved method, apparatus, and computer instructions for managing power and thermal events without shutting down a data processing system.