A primary way in which modern microprocessors reduce their power consumption is to reduce the frequency and/or the voltage at which the microprocessor is operating. Additionally, in some instances the microprocessor may be able to allow clock signals to be disabled to portions of its circuitry. Finally, in some instances the microprocessor may even remove power altogether from portions of its circuitry. Conversely, there are times when peak performance is required of the microprocessor, such that it needs to operate at its highest voltage and frequency. The microprocessor takes power management actions to control its voltage and frequency levels and the disabling of its clocks and power. Typically the microprocessor takes the power management actions in response to directions from the operating system. The well-known x86 MWAIT instruction is an example of an instruction that the operating system may execute to request entry to an implementation-dependent optimized state, which the operating system uses to perform advanced power management. The optimized state may be a sleeping, or idle, state. The well-known Advanced Configuration and Power Interface (ACPI) Specification facilitates operating system-directed power management by defining operational or power-management related states (such as “C-states” and “P-states”).
Performing the power management actions is complicated by the fact that many modern microprocessors are multi-core processors in which multiple processing cores share one or more power management-related resources. For example, the cores may share voltage sources and/or clock sources. Furthermore, computing systems that include a multi-core processor also typically include a chipset that includes bus bridges for bridging the processor bus to other buses of the system, such as to peripheral I/O buses, and includes a memory controller for interfacing the multi-core processor to a system memory. The chipset may be intimately involved in the various power management actions and may require coordination between itself and the multi-core processor.
More specifically, in some systems, with the permission of the multi-core processor, the chipset may disable a clock signal on the processor bus that the processor receives and uses to generate most of its own internal clock signals. In the case of a multi-core processor, all of the cores that use the bus clock must be ready for the chipset to disable the bus clock. That is, the chipset cannot be given permission to disable the bus clock until all the cores are prepared for the chipset to do so.
Still further, the chipset normally generates snoop cycles on the processor bus so that the processor can snoop its cache memories. For example, when a peripheral device generates a memory access on a peripheral bus, the chipset echoes the memory access on the processor bus so that the processor may snoop its cache memories to determine whether it holds data at the snoop address. USB devices, for instance, are notorious for periodically polling memory locations, which generates periodic snoop cycles on the processor bus. In some systems, the multi-core processor may enter a deep sleep state in which it flushes its cache memories and disables the clock signals to the caches in order to save power. In this state, it is wasteful for the multi-core processor to wake up in response to a snoop cycle on the processor bus, snoop its caches (which will never return a hit because they are empty), and then go back to sleep. Therefore, with the permission of the multi-core processor, the chipset may be authorized not to generate snoop cycles on the processor bus in order to achieve additional power savings. However, again, all of the cores must be ready for the chipset to turn off snooping. That is, the chipset cannot be given permission to turn off snooping until all the cores are prepared for the chipset to do so.
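The all-cores-ready condition described above for both the bus clock and snooping can be illustrated with a minimal sketch. The class name `CoreStatus`, its two flags, and the function `chipset_permissions` are hypothetical names chosen for illustration only; they do not come from any processor, chipset, or specification.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CoreStatus:
    """Hypothetical per-core readiness flags (illustrative names only)."""
    ready_for_bus_clock_off: bool  # core can tolerate losing the bus clock
    ready_for_snoop_off: bool      # core has flushed its caches


def chipset_permissions(cores: List[CoreStatus]) -> Tuple[bool, bool]:
    """Return (may_disable_bus_clock, may_stop_snooping) for the processor.

    Each permission is granted only when every core is ready; a single
    unready core withholds that permission for the whole processor.
    """
    if not cores:
        # Assumption: with no cores reporting, grant nothing.
        return False, False
    may_disable_bus_clock = all(c.ready_for_bus_clock_off for c in cores)
    may_stop_snooping = all(c.ready_for_snoop_off for c in cores)
    return may_disable_bus_clock, may_stop_snooping


# Core 1 has not yet flushed its caches, so snooping must continue.
cores = [CoreStatus(True, True), CoreStatus(True, False)]
print(chipset_permissions(cores))
```

In this sketch the two permissions are evaluated independently, so the processor could allow the bus clock to be disabled while still requiring snoop cycles. Whether a real system permits that combination is implementation-dependent.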
U.S. Pat. No. 7,451,333 issued to Naveh et al. (hereinafter Naveh) discloses a multi-core microprocessor that includes multiple processing cores. Each of the cores is capable of detecting a command that requests the core to transition to an idle state. The multi-core processor also includes Hardware Coordination Logic (HCL). The HCL receives idle state status from the cores and manages power consumption of the cores based on the commands and the idle state status of the cores. More specifically, the HCL determines whether all the cores have detected a command requesting a transition to a common state. If not, the HCL selects the shallowest state among the commanded idle states as the idle state for each core. However, if all the cores have detected a command requesting a transition to a common state, the HCL can initiate shared power saving features such as performance state reductions, a shutdown of a shared phase-locked-loop (PLL), or saving of an execution context of the processor. The HCL can also prevent external break events from reaching the cores and can transition all the cores to the common state. In particular, the HCL can conduct a handshake sequence with the chipset to transition the cores to the common state.
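The coordination rule attributed to Naveh above can be sketched as follows. This is an illustrative reading of that description, not the HCL's actual implementation. It assumes idle states are encoded as integers in which a larger number means a deeper state and 0 means running, and the function name `coordinate_idle_state` is hypothetical.

```python
from typing import List, Tuple


def coordinate_idle_state(requested: List[int]) -> Tuple[int, bool]:
    """Return (idle_state_to_apply, shared_features_allowed).

    Assumption: larger integers denote deeper idle states; 0 means running.
    """
    if not requested:
        raise ValueError("at least one core request is required")
    # The shallowest (numerically smallest) commanded state is one that
    # every core has asked for or exceeded.
    state = min(requested)
    # Shared power saving features are considered only when all cores
    # were commanded to the same idle state.
    all_common = len(set(requested)) == 1 and state > 0
    return state, all_common


print(coordinate_idle_state([3, 1, 4]))  # cores disagree: shallowest wins
print(coordinate_idle_state([4, 4]))     # cores agree on a common state
```

Taking the minimum is a conservative choice: no core is placed in a deeper state than it requested, which is consistent with selecting the shallowest commanded state.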
An article by Alon Naveh et al. entitled “Power and Thermal Management in the Intel Core Duo Processor,” which appeared in the May 15, 2006 issue of the Intel Technology Journal, describes a consistent C-state control architecture using an off-core hardware coordination logic (HCL), located in a shared region of the die or platform, that serves as a layer between the individual cores and shared resources on the die and platform. The HCL determines the required CPU C-state based on the cores' individual requests, controls the state of the shared resources, and emulates a legacy single-core processor to implement the C-state entry protocol with the chipset.
In the scheme disclosed by both Naveh references, the HCL is centralized non-core logic outside the cores themselves that performs power management actions on behalf of all the cores. This centralized non-core logic solution may be disadvantageous, especially if the HCL is required to reside on the same die as the cores, because it may be yield-prohibitive due to large die sizes, particularly in configurations in which it would be desirable to include many cores on the die.