Modern processors may be able to operate in different power saving states. In some implementations, these power saving states are defined by an Advanced Configuration and Power Interface (ACPI) specification, such as Section 8 of the ACPI specification version 6.1, the entire disclosure of which is incorporated by reference. Taking the simple example of a single-core processor, the processor may execute instructions while in an active state, which may be labeled C0.
If the processor is waiting on another system component, which will be generally slower than the processor (for example, main memory or secondary storage), the processor may enter a power saving state. For example, when a halt instruction, rather than a substantive instruction, is being executed by the processor, the processor may enter a first power saving state, labeled C1. The clock of the execution unit may be gated so that in the power saving state, the clock is not passed through to the execution units and the execution units are no longer consuming power. The processor can resume operation by re-allowing the clock to be passed through.
If the processor will be idle for a longer period of time, the processor may enter another, deeper power saving state. For example, this power saving state may be labeled C3. In this second power saving state, the level 1 cache, and in some instances, the level 2 cache, of the processor is flushed so that power to the caches can be removed. The level 1 and 2 caches are generally volatile, meaning that removing power will cause the contents to be erased. To refill the cache takes more time, so the processor requires additional time to return to the active state from the second power saving state. However, when the processor will be idle for a longer period of time, such as when requesting data from main memory, entering the second power saving state may be worthwhile.
Still further power saving states are available in various processor architectures. For example, a third power saving state may save off state data from the processor's volatile storage and then remove power to most of the functional units of the processor, such as caches, pipeline registers, architectural registers, clock distribution circuitry, branch predictors, arithmetic units, etc. The clock generation circuitry may remain powered to avoid the additional latency of restarting the clock and waiting for it to stabilize. In order to resume execution, the saved state is reloaded into the processor.
The third power saving state may be worthwhile when the processor is waiting for a much higher-latency task, such as a disk access or when the processor is idle/inactive such as while waiting for the next transaction to arrive. The tradeoffs between the latency and power savings of power saving states may be determined by the designers of the processors and may be systematized in a power management unit of the processor.
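As an illustration of this tradeoff, the following minimal sketch picks the deepest power saving state whose costs fit the expected idle period. The state names follow the labels used above, but the latency and residency numbers, the `IdleState` type, and the `pick_idle_state` function are hypothetical and do not describe any particular processor's power management unit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IdleState:
    name: str
    exit_latency_us: float   # time to return to C0 (illustrative value)
    min_residency_us: float  # idle time needed for the state to pay off (illustrative value)


# Hypothetical table, ordered from shallowest to deepest state.
IDLE_STATES = [
    IdleState("C1", exit_latency_us=2, min_residency_us=2),
    IdleState("C3", exit_latency_us=80, min_residency_us=200),
    IdleState("C6", exit_latency_us=200, min_residency_us=800),
]


def pick_idle_state(expected_idle_us: float, latency_limit_us: float) -> Optional[IdleState]:
    """Return the deepest state that fits both constraints, or None to stay in C0."""
    choice = None
    for state in IDLE_STATES:
        fits_idle = state.min_residency_us <= expected_idle_us
        fits_latency = state.exit_latency_us <= latency_limit_us
        if fits_idle and fits_latency:
            choice = state  # later entries are deeper, so keep the last match
    return choice
```

With these assumed numbers, a long expected idle period selects C6 unless the tolerated wake-up latency is too small, in which case a shallower state is chosen.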
In the active state (C0), there may be multiple performance levels. For example, a highest performance state (labeled P0) may correspond to the highest frequency at which the processor can operate. If the processor is lightly loaded, the processor may operate at a lower performance state, such as a state labeled P1. Different processor architectures, and even different processor models sharing an architecture, may have different sets of performance states. For example, the processor may define performance states from P0 to Pn, where n is an integer greater than one. In some processors, a range of frequencies is available and may not necessarily map to specific P numbers. Instead, the frequency of the processor may be adjustable anywhere within the range subject to a set increment, or granularity. For example, the frequency of the processor may be adjusted in increments of 100 MHz.
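The idea of a frequency range with a fixed granularity can be sketched as follows. The function name `snap_frequency_mhz` and the default range bounds are assumptions chosen for illustration; only the 100 MHz increment comes from the example above.

```python
def snap_frequency_mhz(requested_mhz: int,
                       min_mhz: int = 1200,
                       max_mhz: int = 2300,
                       step_mhz: int = 100) -> int:
    """Clamp a requested frequency to the range, then round down to the step grid."""
    clamped = max(min_mhz, min(max_mhz, requested_mhz))
    # Count whole steps above the minimum so the result always lies on the grid.
    steps = (clamped - min_mhz) // step_mhz
    return min_mhz + steps * step_mhz
```

Rounding down is one possible choice; a real design might instead round up to avoid under-provisioning a loaded core.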
In more modern processors, the processor includes more than one core. Further, each core may, depending on the processor model, be able to execute multiple threads. Except in the very rare instances in which all of the threads executing across all the cores of the processor consistently present a uniform load, certain cores of the processor may be more lightly loaded at certain times than others. Therefore, the processor power management may adjust the power saving state of each core separately.
Further, some recent processors allow the performance state of the processor to be adjusted per core. In other words, in addition to being able to enter individual cores into separate power saving states, the frequency of each core may be adjusted independently in the active state. In other implementations, groups of cores may be controlled together. For example, in an eight-core processor, each pair of cores may be set to a frequency that is independent of the other pairs of cores.
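Grouped control can be sketched as below. This is a hypothetical model, not a description of any processor: the helper names are invented, and the policy of running each group at the highest frequency requested by any of its cores is an assumption made for the example.

```python
from typing import List


def build_groups(num_cores: int, group_size: int) -> List[List[int]]:
    """Partition core indices into consecutive, equally sized groups."""
    if group_size <= 0 or num_cores % group_size != 0:
        raise ValueError("num_cores must be a positive multiple of group_size")
    return [list(range(start, start + group_size))
            for start in range(0, num_cores, group_size)]


def group_frequencies(groups: List[List[int]], core_demand_mhz: List[int]) -> List[int]:
    """Assumed policy: each group runs at the highest frequency any member requests."""
    return [max(core_demand_mhz[core] for core in group) for group in groups]
```

For the eight-core example, `build_groups(8, 2)` yields four pairs, and each pair receives a single frequency that is independent of the other pairs.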
Simply as a graphical illustration and not representative of any specific processor, FIG. 1 depicts a processor with four power states: three power saving states and an active (C0) power state 10. The C0 power state includes four performance states: performance state 0 (20-0), performance state 1 (20-1), performance state 2 (20-2), and performance state 3 (20-3). Simply for illustration, the performance states 20-0, 20-1, 20-2, and 20-3 correspond to frequencies of 2.3 GHz, 2.1 GHz, 1.8 GHz, and 1.2 GHz, respectively.
Further, in some multi-core processors, when fewer than all cores are in the C0 state 10, one or more higher performance states (sometimes referred to as turbo states) are available. The ability to run a subset of cores at a higher frequency is generally due to thermal management. The processor and any associated heat sink may only be able to dissipate the heat generated from a subset of cores operating at a higher frequency. In FIG. 1, performance state 24 may allow a single core to operate at 3.0 GHz when the other cores are in a power saving state.
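The dependence of the turbo state on the number of active cores can be sketched as follows. The 3.0 GHz turbo frequency and the single-core condition come from the FIG. 1 example; the function name, the four-core count, and the 2.3 GHz non-turbo ceiling are assumptions for illustration.

```python
def max_frequency_ghz(active_cores: int,
                      total_cores: int = 4,
                      base_ghz: float = 2.3,
                      turbo_ghz: float = 3.0,
                      turbo_core_limit: int = 1) -> float:
    """Return the highest frequency an active core may use in this toy model."""
    if not 1 <= active_cores <= total_cores:
        raise ValueError("active_cores must be between 1 and total_cores")
    # Turbo is only offered while few enough cores are in C0 to stay within the thermal budget.
    if active_cores <= turbo_core_limit:
        return turbo_ghz
    return base_ghz
```

Real processors typically expose several turbo levels that depend on the active core count; a single threshold is used here only to keep the sketch short.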
In FIG. 1, inactive power saving states include an idle state 32-1 (C1, also called a first power saving state), a second power saving state 32-2 (C3), and a third power saving state 32-3 (C6). Although performance cannot be negative, the depth of the graphical bars for the power saving states is a graphical reminder of the latency involved in returning from a power saving state.
In other words, the amount of time required to return to the active state 10 from the first power saving state 32-1 is much less than the latency to return from the second power saving state 32-2, which, in turn, is less than the time required to resume execution from the third power saving state 32-3. For example only, the second power saving state 32-2 may involve flushing the L1 and L2 caches, while the third power saving state 32-3 may include saving the state of the core and removing power from most functional units. To be clear, the latency depicted for the power saving states 32 is drawn on a different scale from, and is not comparable to, the frequency depicted for the performance states 20.
The labels C1, C3, and C6 are shown in FIG. 1 simply to illustrate the fact that certain processors implement power saving states that are a subset of the power saving states defined by the manufacturer or by the ACPI specification. In addition, a processor manufacturer may define additional power saving states that are not industry standard (though they may be defined by the processor manufacturer according to the format of the ACPI specification).
In FIG. 2, a more granular array 40 of performance states may be available. For example, the frequency of the processor may be adjusted arbitrarily by predefined increments. In some implementations, higher performance states 44 of the array 40 require that at least some of the cores be in one of the power saving states 32.
Operating systems include frequency governors designed with various tradeoffs between power and performance. These frequency governors may instruct the processor as to the frequency at which each processor core should operate. For example, the Ubuntu Linux distribution, from Canonical Ltd., includes several frequency governors grouped into two classes. See the following table:
CLASS           GOVERNOR       TYPE      SHORT DESCRIPTION
Cpufreq         Performance    static    All cores at maximum (turbo) frequency.
Cpufreq         Powersave      static    All cores at minimum frequency.
Cpufreq         Ondemand       dynamic   Governor changes core frequencies depending on core utilization.
Cpufreq         Conservative   dynamic   Similar to Ondemand governor, but changes frequencies more gradually.
Cpufreq         Userspace      -         Frequencies can be reconfigured by the administrator.
Intel P-State   Performance    static    Similar to Cpufreq Performance governor.
Intel P-State   Powersave      dynamic   Similar in strategy to Cpufreq Ondemand governor, but with a different implementation.
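The difference between the Ondemand and Conservative strategies described in the table can be illustrated with a simplified sketch. This is not the Linux kernel implementation: the frequency list, the thresholds, and both function names are hypothetical and were chosen only to show an abrupt policy next to a gradual one.

```python
# Illustrative frequency steps in MHz, lowest to highest.
FREQS_MHZ = [1200, 1800, 2100, 2300]


def ondemand_step(current_mhz: int, utilization: float, up_threshold: float = 0.8) -> int:
    """Jump straight to the maximum under high load; otherwise pick the
    lowest frequency estimated to carry the current load."""
    if utilization > up_threshold:
        return FREQS_MHZ[-1]
    needed_mhz = current_mhz * utilization / up_threshold
    for freq in FREQS_MHZ:
        if freq >= needed_mhz:
            return freq
    return FREQS_MHZ[-1]


def conservative_step(current_mhz: int, utilization: float,
                      up_threshold: float = 0.8, down_threshold: float = 0.2) -> int:
    """Move at most one frequency step per decision."""
    index = FREQS_MHZ.index(current_mhz)
    if utilization > up_threshold and index < len(FREQS_MHZ) - 1:
        return FREQS_MHZ[index + 1]
    if utilization < down_threshold and index > 0:
        return FREQS_MHZ[index - 1]
    return current_mhz
```

Starting from 1200 MHz at 95% utilization, the first function returns 2300 MHz in a single decision, while the second returns 1800 MHz and would need further decisions to reach the maximum.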
An operating system user with root privileges may be able to switch governors within a class while the operating system is running. However, changing the class may require modifying a kernel parameter and rebooting the system. These prior art frequency governors often either fail to save power at high loads or suffer significant performance loss. Failing to save power increases the energy footprint of the device, the negative effects of which are magnified at cloud scale. On the other hand, failing to maximize performance may require additional systems or more stringent hardware specifications in order to handle a given load. For a fixed hardware environment, failing to maximize performance may lead to worse application-level performance for customers, such as slower response times or lower transaction throughput.
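As a sketch of how a governor might be switched at run time, Linux exposes per-core cpufreq files such as `scaling_governor` and `scaling_available_governors` under `/sys/devices/system/cpu/cpuN/cpufreq/`. The helper names below are hypothetical, the directory is passed in as a parameter so the code can be tried against any directory with the same file layout, and writing to the real files generally requires root privileges.

```python
from pathlib import Path
from typing import List


def available_governors(cpufreq_dir: str) -> List[str]:
    """Read the whitespace-separated governor list for one core."""
    return (Path(cpufreq_dir) / "scaling_available_governors").read_text().split()


def current_governor(cpufreq_dir: str) -> str:
    """Read the governor currently selected for one core."""
    return (Path(cpufreq_dir) / "scaling_governor").read_text().strip()


def set_governor(cpufreq_dir: str, governor: str) -> None:
    """Select a governor, refusing names the core does not list as available."""
    if governor not in available_governors(cpufreq_dir):
        raise ValueError(f"governor {governor!r} is not available in {cpufreq_dir}")
    (Path(cpufreq_dir) / "scaling_governor").write_text(governor)
```

On a typical system the call might look like `set_governor("/sys/devices/system/cpu/cpu0/cpufreq", "powersave")`, assuming that governor is listed for the core.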
The background description provided here is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent it is described in this background section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present disclosure.