Recent technologies are rapidly increasing power density at all levels. At the chip level, for example, major processor manufacturers now offer multi-core chips. At the rack level, blade servers are dominating the market. At the data center level, the use of these technologies rapidly increases overall power density.
Because power consumption is defined as the instantaneous rate at which energy is consumed, a computing system (or device) of higher power density consumes more energy over the same period of operation. That energy is dissipated as heat, so the system must endure higher temperatures. High temperature is problematic for computing systems because it shortens the MTBF (mean time between failures) and thus makes systems less reliable. Maintaining an appropriate temperature requires stronger cooling systems, which in turn consume more power.
Thus, power and heat management are critical concerns for large computing systems. One goal of power and heat management is to control power consumption so that high temperatures are avoided.
Several technologies decrease the power consumption of a computing system and thus the heat it generates. One such technology is clock throttling, in which processors are switched to standby modes or deep sleep modes that consume less power. Another is Dynamic Voltage and Frequency Scaling (DVFS or DVS). By lowering voltage and frequency (processing speed) together, DVFS decreases power consumption more effectively than technologies that change frequency only. However, the current technologies that decrease power consumption cannot avoid decreasing frequency, which lowers performance.
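The advantage of scaling voltage together with frequency can be illustrated with the commonly used approximation for dynamic CMOS power, P = C * V^2 * f. The following is a minimal sketch under stated assumptions, not a model of any particular processor: it ignores static (leakage) power, and it assumes that voltage can be lowered in the same proportion as frequency. The function names are hypothetical and introduced only for illustration.

```python
def dynamic_power(capacitance, voltage, frequency):
    """Approximate dynamic CMOS power: P = C * V^2 * f (leakage ignored)."""
    return capacitance * voltage ** 2 * frequency


def relative_power_frequency_only(scale):
    """Power relative to full speed when only frequency is scaled."""
    return dynamic_power(1.0, 1.0, scale) / dynamic_power(1.0, 1.0, 1.0)


def relative_power_dvfs(scale):
    """Power relative to full speed when voltage and frequency are scaled
    together (assumption: voltage scales in proportion to frequency)."""
    return dynamic_power(1.0, scale, scale) / dynamic_power(1.0, 1.0, 1.0)


if __name__ == "__main__":
    # Halving the processing speed in both cases:
    print(relative_power_frequency_only(0.5))  # 0.5
    print(relative_power_dvfs(0.5))            # 0.125
```

Under these assumptions, halving the frequency alone halves the dynamic power, whereas halving voltage and frequency together reduces it to one eighth. In both cases the processing speed is halved, which is the performance cost noted above.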
The current solutions for avoiding high temperature in power and heat management fall into two categories: reactive methods and proactive methods. In reactive methods, heat sensors are installed in computing systems and emit signals when the temperature exceeds a threshold. Such signals may trigger a power reduction, which in turn reduces the system's performance. In proactive methods, a power limit, which in general is less than the peak (full capacity) power, is applied so that power consumption is held below the limit at all times.
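The two categories can be contrasted in a short sketch. The functions below are hypothetical illustrations rather than any vendor's interface, and the parameter names and the throttled power level are assumptions: the reactive function reduces power only once a sensor reading exceeds the threshold, while the proactive function clamps power to a fixed limit regardless of temperature.

```python
def reactive_power(demand_watts, temperature_c, threshold_c, throttled_watts):
    """Reactive method: power is reduced only after the sensor reports a
    temperature above the threshold (throttled_watts is an assumed level)."""
    if temperature_c > threshold_c:
        return min(demand_watts, throttled_watts)
    return demand_watts


def proactive_power(demand_watts, limit_watts):
    """Proactive method: power is always held at or below a fixed limit."""
    return min(demand_watts, limit_watts)


if __name__ == "__main__":
    # A heavy demand of 200 W, with an 80 degree C threshold and a 150 W limit.
    print(reactive_power(200, 70, 80, 120))  # 200: below threshold, no cut
    print(reactive_power(200, 85, 80, 120))  # 120: over threshold, throttled
    print(proactive_power(200, 150))         # 150: clamped by the limit
    print(proactive_power(100, 150))         # 100: light load is unaffected
```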
The drawback of reactive methods is that the power reduction, which causes a reduction in processing speed, often occurs when the processors are very busy (heavily loaded). Thus, a power cut inevitably degrades the performance of the workload precisely when workload demand is high. This is like closing lanes of a highway at rush hour. The drawback of the power limit method is that it can be too conservative, because the full capacity of the computing system is never available. When the workload is bursty, a characteristic exhibited by many web sites and applications, the power limit approach degrades performance during a burst even when the workload is light before and after the burst. Instead of limiting power consumption at all times, a natural alternative is to consume little power when the workload is light and more power during a burst, improving the performance of the whole system while still limiting the sum of the power consumption over any period of time.
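The contrast between a fixed power limit and a limit on the sum of power over a period can be sketched as follows. This is only an illustration of the idea stated in the preceding paragraph under simplifying assumptions (discrete time steps, a fixed window length, and hypothetical function and parameter names); it is not presented as the specific method of any embodiment.

```python
from collections import deque


def fixed_cap(demand, limit):
    """Fixed power limit: every step is clamped to the same limit."""
    return [min(d, limit) for d in demand]


def windowed_budget(demand, peak, window, budget):
    """Grant power per step so that the sum of grants over any `window`
    consecutive steps never exceeds `budget` (hypothetical sketch)."""
    recent = deque(maxlen=max(window - 1, 0))  # grants from the previous window-1 steps
    granted = []
    for d in demand:
        headroom = budget - sum(recent)  # what is left for this step
        g = max(0, min(d, peak, headroom))
        granted.append(g)
        recent.append(g)
    return granted


if __name__ == "__main__":
    # Light load, a two-step burst, then light load again (watts per step).
    demand = [50, 50, 200, 200, 50, 50]
    # A 150 W fixed limit clips the burst.
    print(fixed_cap(demand, 150))                 # [50, 50, 150, 150, 50, 50]
    # A budget of 450 W-steps per 3 steps (the same 150 W average) does not.
    print(windowed_budget(demand, 200, 3, 450))   # [50, 50, 200, 200, 50, 50]
```

In this example both policies allow the same average power over a three-step window, but the windowed budget serves the burst at full power because the light load before it left headroom unused. Under sustained heavy demand the windowed budget still reduces power once the headroom is exhausted.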