1. Technical Field
This invention relates to power reduction in processor circuits, and in particular to systems and methods for controlling power consumption by execution units.
2. Background Art
Each new generation of semiconductor process technology allows the transistor counts and clocking frequencies of processor chips to increase. With more transistors operating at higher frequencies, processor chips consume significantly more power with each new generation of process technology. The increased power consumption and accompanying heat dissipation create significant design problems. For example, the battery life of mobile systems must be extended to compensate for the power requirements of new processors, and the thermal solutions required to maintain processor chips within their specified operating temperature ranges become more complex as more heat is generated.
Clock gating is a well-known technique for reducing the power consumed (and dissipated) by processors. The various clock gating techniques decouple a clock signal from different parts of a computer system when certain trigger conditions are detected. When the clock signal is removed, logic in the affected part of the computer system is no longer charged and discharged, thus reducing power consumption. The power dissipated by the clock network itself is also reduced since it drives a smaller portion of the processor system. The overall power savings can be significant.
Various trigger conditions have been employed to gate the clock to different components of processor systems. For example, the clock signal may be decoupled from logic associated with the monitor and peripheral devices when no keyboard or mouse activity (inputs) is detected for a selected interval, e.g. ten minutes. The clock signal is restored to the affected components when an input is detected. At a finer level of control, execution logic within the processor may be decoupled from the clock if it does not detect any incoming instructions to be processed. The execution logic is powered up when an appropriate instruction is detected in the processor pipeline.
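The idle-interval trigger described above can be illustrated with a minimal behavioral sketch. It is a software model only, not a circuit description; the class name `IdleClockGate`, the `idle_limit` parameter, and the notion of a discrete "tick" are assumptions introduced for illustration.

```python
class IdleClockGate:
    """Behavioral model of a clock gate driven by an idle-interval trigger.

    Hypothetical illustration: names and the tick-based timing are assumptions.
    """

    def __init__(self, idle_limit):
        self.idle_limit = idle_limit   # ticks without input before gating
        self.idle_ticks = 0            # consecutive ticks with no input
        self.clock_enabled = True      # clock starts coupled to the logic

    def tick(self, input_detected):
        """Advance one tick and return whether the clock is coupled."""
        if input_detected:
            # Any input releases the gating condition and restarts the count.
            self.idle_ticks = 0
            self.clock_enabled = True
        else:
            self.idle_ticks += 1
            if self.idle_ticks >= self.idle_limit:
                # Trigger condition met: decouple the clock.
                self.clock_enabled = False
        return self.clock_enabled


gate = IdleClockGate(idle_limit=3)
states = [gate.tick(x) for x in [False, False, False, True, False]]
```

In this model the clock is decoupled on the third consecutive idle tick and restored as soon as an input arrives.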
The methods employed to accomplish clock gating must not interfere with operation of the processor system. For example, neither data nor instructions can be lost when the system transitions between power on (clock signal coupled) and power off (clock signal decoupled) states. In some cases, this is accomplished by trading performance for power reduction. For example, instruction processing may be delayed following release of the clock gating condition to accommodate the power up latency. The delay ensures that the logic is fully powered before it resumes executing instructions. In some cases, additional logic may be used to avoid dropping data or instructions. However, this has its own associated performance and die area costs.
These considerations limit the use of clock gating with certain common stall conditions. For example, cache misses are relatively common for software workloads that have large working sets. Execution resources may be stalled for approximately 30% of their execution time, waiting for data to be returned from higher level memory structures, e.g. storage structures closer to main memory. Gating the clock signal to the execution resources during these stalls could save significant power. However, incurring an additional delay following release of each stall to accommodate the power up latency could lead to substantial performance degradation as well as increased design complexity. In addition, several different conditions can generate stalls of various latencies, making it difficult to identify stalls caused by long latency memory loads. This is further complicated when multiple stall conditions occur at the same time.
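The trade-off described above can be made concrete with a rough back-of-the-envelope sketch. Only the 30% stall fraction comes from the text; the average stall length, the power up latency in cycles, and the function name `gating_tradeoff` are assumed values and names chosen for illustration. The model also assumes every stall is gated and that each release costs the full power up latency.

```python
def gating_tradeoff(total_cycles, stall_fraction, avg_stall_cycles, wakeup_cycles):
    """Estimate the gated fraction of cycles and the slowdown from wake-up delays.

    Hypothetical model: assumes every stall is gated and each release
    adds `wakeup_cycles` of delay before execution resumes.
    """
    stall_cycles = total_cycles * stall_fraction
    num_stalls = stall_cycles / avg_stall_cycles   # number of gating events
    added_cycles = num_stalls * wakeup_cycles      # total power up delay
    gated_fraction = stall_cycles / total_cycles   # upper bound on gated time
    slowdown = added_cycles / total_cycles         # fractional performance loss
    return gated_fraction, slowdown


# Assumed figures: long stalls (100 cycles) versus short stalls (10 cycles),
# each with a 2-cycle power up latency.
long_stalls = gating_tradeoff(1_000_000, 0.30, 100, 2)
short_stalls = gating_tradeoff(1_000_000, 0.30, 10, 2)
```

Under these assumptions the same 30% of cycles is available for gating in both cases, but the slowdown grows from roughly 0.6% with 100-cycle stalls to roughly 6% with 10-cycle stalls, which is why it matters to distinguish long latency stalls from short ones.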
The present invention addresses these and other limitations of conventional power reduction techniques.