The present application relates generally to an improved data processing apparatus and method and more specifically to mechanisms for controlling depth and latency of exit of a virtual processor's idle state in a power management environment.
There is an emerging customer requirement for better power and thermal management in server systems. Customers increasingly expect systems to be power-efficient. Customers also want the ability to set policies that trade off power and performance in order to meet their particular objectives. For example, customers want to be able to over-provision their installations relative to the nominal maximum power and temperature values of the systems that they install, while taking advantage of the variability in workloads and utilization to ensure that the systems operate correctly and within the limits of the available power and cooling.
IBM®'s EnergyScale™ system controls the power and temperature of running systems in a performance-aware manner under the direction of a set of policies and objectives specified through the EnergyScale™ system's user interfaces. To do so, the EnergyScale™ system implements detailed, periodic measurement of processor core power and temperature, measurement of the power consumed by the entire system board as well as any plugged-in processor cards, and measurement of the power and temperature of the system's memory. The EnergyScale™ system uses the results of these measurements to adjust the system's operation and configuration to meet specified objectives for power, temperature, and performance by using closed-loop feedback control operating in real time.
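The closed-loop feedback control described above can be illustrated with a minimal sketch. The proportional controller, the linear power model, and every name and constant below (`control_step`, `toy_power_model`, the gain, and the frequency limits) are assumptions made for illustration only; they are not taken from the EnergyScale™ system.

```python
def control_step(freq_mhz, measured_watts, target_watts,
                 gain_mhz_per_watt=5.0, min_mhz=2000.0, max_mhz=4000.0):
    """One iteration of a proportional controller (hypothetical).

    Lowers the frequency when measured power exceeds the target and
    raises it when there is headroom, clamped to the allowed range.
    """
    error = target_watts - measured_watts
    new_freq = freq_mhz + gain_mhz_per_watt * error
    return max(min_mhz, min(max_mhz, new_freq))


def toy_power_model(freq_mhz):
    """Stand-in for a periodic power measurement (hypothetical).

    Assumes power grows linearly with frequency; real hardware does not
    behave this simply.
    """
    return 50.0 + 0.05 * freq_mhz


def run_loop(target_watts, steps=50, start_mhz=4000.0):
    """Repeat measure-then-adjust and return the final (frequency, power)."""
    freq = start_mhz
    for _ in range(steps):
        measured = toy_power_model(freq)
        freq = control_step(freq, measured, target_watts)
    return freq, toy_power_model(freq)


if __name__ == "__main__":
    freq, watts = run_loop(target_watts=200.0)
    print(f"settled near {freq:.1f} MHz at {watts:.2f} W")
```

In this sketch, a power target that cannot be met within the allowed frequency range simply leaves the frequency pinned at its lower limit; a real policy engine would need other means (such as idle states) to reduce power further.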
One of the tools used by the EnergyScale™ system to control power is adjusting the frequency and voltage of the processor chips and cores in the system, which controls the power dissipation as a function of the user-specified energy scale policy. Early EnergyScale™ system designs required that the voltage and frequency of all central processing units (CPUs) in the system be maintained at the same value. As the EnergyScale™ system design and implementation become more sophisticated, it becomes possible to have cores in a system running at different frequencies and voltages, which allows the implementation of more sophisticated power savings algorithms. A side effect of the more sophisticated implementation is that energy savings opportunities increase with the increasing granularity of the EnergyScale™ system design.
One of the enhancements to the EnergyScale™ system design is the ability to set an idle state from among different possible idle states of a processor core. A processor core in an idle state saves power by not executing instructions. The amount of power saved depends on the amount of the processor's resources that can be disabled when entering the idle state. The greater the amount of processor resources that are turned off, the greater the power savings, and correspondingly the greater the latency of exit when exiting the idle state and re-enabling those processor resources that were previously disabled. In other words, the greater exit latency follows from the greater amount of processor resources that have to be re-enabled when exiting the idle state.
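The trade-off between idle state depth and latency of exit can be made concrete with a small sketch. The resource names, the power and timing figures, and the assumption that disabled resources are re-enabled one after another (so that their re-enable times add up) are all hypothetical and chosen only to show the relationship.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    watts_saved: float   # hypothetical power saved while disabled
    reenable_us: float   # hypothetical time to re-enable, in microseconds


# Hypothetical processor resources that an idle state may disable.
CLOCKS = Resource("core clocks", 2.0, 1.0)
CACHES = Resource("private caches", 3.0, 50.0)
VOLTAGE = Resource("core voltage rail", 6.0, 1000.0)

# Each deeper idle state disables a superset of the shallower state's resources.
IDLE_STATES = {
    "shallow": (CLOCKS,),
    "medium": (CLOCKS, CACHES),
    "deep": (CLOCKS, CACHES, VOLTAGE),
}


def power_saved(state):
    """Total power saved by the resources disabled in the given idle state."""
    return sum(r.watts_saved for r in IDLE_STATES[state])


def exit_latency_us(state):
    """Exit latency, assuming resources are re-enabled sequentially."""
    return sum(r.reenable_us for r in IDLE_STATES[state])


if __name__ == "__main__":
    for name in IDLE_STATES:
        print(f"{name}: saves {power_saved(name)} W, "
              f"exit latency {exit_latency_us(name)} us")
```

Under these assumptions, both quantities grow together as more resources are disabled, which is the relationship the paragraph above describes.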
On a logically partitioned system, when an operating system (OS) determines that its thread in a dedicated processor partition is idle, the OS calls a virtualization layer (e.g., a hypervisor) so that the virtualization layer can place the central processing unit (CPU) corresponding to the virtual processor in a low power idle state.
However, when the OS calls the virtualization layer, the OS may have different expectations regarding the latency of exit from an idle state of the virtual processor. The phrase “latency of exit from an idle state” refers to the measure of how quickly the OS will regain control of its virtual processor after a qualifying event wakes up the virtual processor from its idle state. For example, if the OS is folding its virtual processor, the OS may not have an expectation that the virtual processor will respond to Input/Output (I/O) or timer interrupts. In that case, the OS expects to regain control of the virtual processor within, for example, at most one second after the virtual processor is awakened by the OS. In other cases of idle management, the OS may tolerate a latency of only a few microseconds when the virtual processor exits its idle state, and the OS expects the same latency when the virtual processor is presented with an I/O or timer interrupt.
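One way to picture how such expectations could guide the choice of idle state is sketched below: the deepest state is selected that still satisfies the latency the OS tolerates and, if required, still responds to I/O or timer interrupts. This is only an illustrative sketch; the state names, the latency figures, the `wakes_on_interrupt` attribute, and the `select_idle_state` function are hypothetical and do not describe any particular hypervisor interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IdleState:
    name: str
    exit_latency_us: float     # hypothetical latency of exit, in microseconds
    wakes_on_interrupt: bool   # whether I/O or timer interrupts end the state


# Hypothetical idle states, ordered from shallowest to deepest.
STATES = [
    IdleState("shallow", 5.0, True),
    IdleState("intermediate", 2_000.0, True),
    IdleState("deep", 500_000.0, False),
]


def select_idle_state(max_latency_us: float,
                      needs_interrupt_wake: bool) -> Optional[IdleState]:
    """Return the deepest state that meets the OS's stated expectations.

    A state qualifies if its exit latency is within the tolerated bound
    and, when the OS expects to be woken by interrupts, the state
    actually responds to them. Returns None if no state qualifies.
    """
    candidates = [
        s for s in STATES
        if s.exit_latency_us <= max_latency_us
        and (s.wakes_on_interrupt or not needs_interrupt_wake)
    ]
    if not candidates:
        return None
    # Deeper states have longer exit latency in this model.
    return max(candidates, key=lambda s: s.exit_latency_us)


if __name__ == "__main__":
    # Folded virtual processor: up to one second tolerated, no interrupt wake.
    print(select_idle_state(1_000_000.0, needs_interrupt_wake=False))
    # Ordinary idle: only a few microseconds tolerated, interrupts must wake.
    print(select_idle_state(10.0, needs_interrupt_wake=True))
```

The two calls at the end mirror the two cases in the text: a folded virtual processor with a tolerance of about one second, and ordinary idle with a tolerance of a few microseconds.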