1. Technical Field
The present application is directed to microprocessors. More specifically, the present invention is directed to a system, apparatus and method of providing accurate time-based counters for scaling operating frequencies of microprocessors.
2. Description of Related Art
With the rapid increase in transistor density and speed, the amount of power that may be dissipated by a chip is increasingly becoming a critical criterion in chip designs. Particularly, each successive shrink in technology provides an increase in density, allowing for a reduction in chip footprint. This allows the chip to operate at higher frequencies. As the chip footprint shrinks, however, less power can be dissipated through the chip itself. Any method, therefore, that may be used to decrease the power consumed by a chip may translate into an increase in performance.
One method that may be used to decrease the power consumption of a chip is frequency scaling. Frequency scaling allows a chip to operate at full frequency during short spans of time when high performance is needed and to operate at lower frequencies at other times. Specifically, the power consumption of a chip may be represented by the following equation:
$$\text{power} = K \alpha C V^{2} F + Q V^{2}$$
where V is the chip's core voltage, F is the chip's operating frequency, α is the chip's activity factor, C is the chip's effective capacitance, and K and Q are constants that depend on the manufacturing process, among other factors. From this equation it can be seen that the power consumption of a chip increases linearly with its operating frequency. Thus, a decrease in frequency will correspondingly lead to a decrease in power consumption.
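By way of illustration only, the power equation above may be evaluated numerically as in the following minimal sketch. The function name `chip_power` and all parameter values are hypothetical and chosen solely so that the arithmetic is easy to follow; they do not describe any particular chip.

```python
def chip_power(k, alpha, c, v, f, q):
    """Evaluate power = K*alpha*C*V^2*F + Q*V^2 for the given parameters."""
    dynamic = k * alpha * c * v ** 2 * f  # term that scales with frequency F
    static = q * v ** 2                   # term that does not depend on F
    return dynamic + static


# Hypothetical, unitless values used only for illustration.
params = dict(k=1.0, alpha=0.5, c=2.0, v=1.0, q=0.25)

full_speed = chip_power(f=4.0, **params)  # 4.0 + 0.25 = 4.25
half_speed = chip_power(f=2.0, **params)  # 2.0 + 0.25 = 2.25

print(full_speed, half_speed)
```

In this sketch, halving F halves only the first term, while the second term is unchanged, so the total falls by less than half.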
To determine whether or not the operating frequency of a chip is to be scaled up or down, the average activity occurring in the chip over an elapsed time needs to be calculated. If the activity in the chip over the elapsed time is high, then the chip may operate at a higher frequency. If on the other hand, the activity in the chip is low, the chip may operate at a lower frequency.
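The scaling decision described above may be sketched as follows. This is an assumption-laden illustration rather than a described implementation: the function name `choose_frequency`, the representation of activity as samples between 0 and 1, and the threshold of 0.5 are all hypothetical.

```python
def choose_frequency(activity_samples, low_hz, high_hz, threshold=0.5):
    """Pick an operating frequency from the average activity over a window.

    activity_samples: activity measurements (assumed to lie in 0..1) taken
    over the elapsed time of interest. The threshold is a hypothetical value.
    """
    if not activity_samples:
        raise ValueError("at least one activity sample is required")
    average = sum(activity_samples) / len(activity_samples)
    return high_hz if average >= threshold else low_hz


busy = choose_frequency([0.9, 0.8, 0.7], low_hz=1_000_000_000, high_hz=2_000_000_000)
idle = choose_frequency([0.1, 0.2], low_hz=1_000_000_000, high_hz=2_000_000_000)
print(busy, idle)
```

The average is only meaningful if the elapsed time over which the samples were taken is itself measured accurately, which is why the time-based counter matters.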
In a computer system, a time-based counter is used to measure elapsed time. Obviously, the counter should increment at a constant speed. The speed is usually derived from a frequency clock. To be more specific, the speed at which the counter increments may either be 1/nth of a core frequency clock, where n is an integer (for an internal time-based counter) or determined by the rising edge of an external clock signal supplied by the system (for an external time-based counter), as described, for example, in the PowerPC Architecture Book III.
FIG. 1 is an exemplary diagram of a time-based counter in accordance with a known circuit configuration. As shown in FIG. 1, the circuit configuration includes an external timebase portion 110, an internal timebase portion 120, and a timebase value generation portion 130. The external timebase portion 110 includes a timebase (tbase) input pin 112 and edge detection logic 114. All circuit elements used in the circuit shown in FIG. 1 are clocked by an internal or core clock signal, i.e. are in the core frequency clock domain 180.
The external timebase portion 110 allows an external device to provide a clocking signal to the time-based counter via the timebase input pin 112. The edge detection logic 114 of the external timebase portion 110 generates a “tick,” i.e. an increment of a timebase value by 1, from the external timebase signal, each time a rising edge of the external timebase signal is detected. The internal timebase portion 120 is a modulo 8 (or any other arbitrary number of cycles) counter that generates a tick every 8 internal or core clock cycles. The internal timebase portion 120 includes an incrementer 122, latches 124, and comparator 126.
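The behavior of the two tick sources may be modeled in software as in the following sketch, in which each list element corresponds to one core clock cycle. The function names are hypothetical, and the model ignores latch timing and metastability; it is intended only to show the rising-edge and modulo-8 behavior described above.

```python
def external_ticks(samples):
    """Return 1 for each core cycle in which a rising edge (0 -> 1) is seen.

    samples: the external timebase signal as sampled once per core cycle.
    """
    ticks = []
    previous = 0  # last sampled level, held in a latch in hardware
    for level in samples:
        ticks.append(1 if previous == 0 and level == 1 else 0)
        previous = level
    return ticks


def internal_ticks(num_core_cycles, modulo=8):
    """Return 1 on every `modulo`-th core cycle, else 0."""
    ticks = []
    count = 0  # plays the role of the latched counter value
    for _ in range(num_core_cycles):
        count += 1              # incrementer
        if count == modulo:     # comparator
            ticks.append(1)
            count = 0
        else:
            ticks.append(0)
    return ticks


ext = external_ticks([0, 1, 1, 0, 0, 1, 0, 1])
internal = internal_ticks(24)
print(sum(ext), sum(internal))
```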
The outputs from the external and internal timebase portions 110 and 120 are provided to a multiplexer 140. In the PowerPC architecture, an architected register is provided that selects whether the system will use an internal or external timebase. Based on this selection, the corresponding portion 110 or 120 is selected via multiplexer 140 to generate a tick.
The timebase value generation portion 130 includes an incrementer 132 and latches 134. The incrementer 132 increments an output timebase value in response to an input from the multiplexer 140. The incrementer 132 of the timebase value generation portion 130 increments the timebase value by one for every tick received. The resulting timebase value is output for use by the microprocessor in measuring an elapsed time, such as for measuring an amount of work done in a frequency independent time period which may then be used to determine whether to scale up or down the operating frequency of the microprocessor.
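The selection and accumulation of ticks may be sketched as follows. The function name `timebase_value` and the per-cycle list representation of the tick streams are assumptions made for illustration only.

```python
def timebase_value(ext_ticks, int_ticks, use_external, start=0):
    """Accumulate the timebase value from the selected tick source.

    ext_ticks, int_ticks: per-core-cycle tick streams (1 = tick, 0 = no tick).
    use_external: models the architected selection between the two sources.
    """
    value = start
    for ext_tick, int_tick in zip(ext_ticks, int_ticks):
        tick = ext_tick if use_external else int_tick  # multiplexer
        value += tick                                  # incrementer
    return value


external_count = timebase_value([0, 1, 0, 1], [0, 0, 0, 1], use_external=True)
internal_count = timebase_value([0, 1, 0, 1], [0, 0, 0, 1], use_external=False)
print(external_count, internal_count)
```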
Using either the external timebase or the internal timebase to measure elapsed time during frequency scaling may lead to inaccurate results. For example, if the internal time-based counter is used, the counter will count slower at lower core clock frequencies and faster at higher core clock frequencies. Thus, the time-based counter circuit 100 will not increment at a constant speed.
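The dependence of the internal time-based counter on the core clock frequency may be seen in the following sketch. The function name and the frequencies used are hypothetical and deliberately small so that the arithmetic can be checked by hand.

```python
def internal_timebase_count(core_hz, seconds, modulo=8):
    """Number of ticks a modulo-`modulo` counter produces in `seconds`."""
    core_cycles = core_hz * seconds
    return core_cycles // modulo


# Same one-second interval measured at two hypothetical core frequencies.
ticks_fast = internal_timebase_count(core_hz=800, seconds=1)
ticks_slow = internal_timebase_count(core_hz=400, seconds=1)
print(ticks_fast, ticks_slow)
```

The same wall-clock interval yields a different count at each frequency, so the count cannot serve as a frequency-independent measure of elapsed time.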
Moreover, the core frequency may be scaled to a very low value for maximal power reduction. This frequency value may be smaller than the timebase increment frequency required in the system for accurate time measurement. In this case, the timebase value generation portion 130 will become inaccurate since the internal timebase portion 120, if the internal timebase is selected, cannot generate ticks precisely due to the core clock being too slow. Alternatively, if the external timebase is selected, the external timebase portion 110 will not be able to detect some of the rising edges of the external timebase signal received via the timebase input pin 112 because the core frequency that is used to detect an edge on the external timebase signal is slower than the external timebase signal itself. This again leads to inaccuracy in the timebase value generation portion 130.
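The loss of rising edges when the sampling clock is slower than the external timebase signal may be illustrated with the following toy model. Time is expressed in arbitrary integer units, the external signal is assumed to be a square wave that is high during the second half of each period, and the function name and period values are hypothetical.

```python
def count_detected_edges(ext_period, core_period, duration):
    """Count rising edges seen when sampling once every `core_period` units."""

    def ext_level(t):
        # Square wave: low for the first half of each period, then high.
        return 1 if (t % ext_period) >= ext_period // 2 else 0

    detected = 0
    previous = ext_level(0)
    for t in range(core_period, duration + 1, core_period):
        level = ext_level(t)
        if previous == 0 and level == 1:
            detected += 1
        previous = level
    return detected


# External period of 8 units gives 10 rising edges in 80 units.
fast_core = count_detected_edges(ext_period=8, core_period=1, duration=80)
slow_core = count_detected_edges(ext_period=8, core_period=12, duration=80)
print(fast_core, slow_core)
```

With the fast sampling clock all 10 edges are detected; with the slow sampling clock in this model only 3 are, so the timebase value falls behind.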
One obvious solution is to use a fixed frequency clock. FIG. 2 is a diagram of a time-based counter using a fixed frequency clock in accordance with a known circuit configuration. As shown in FIG. 2, the fixed frequency clock may be derived directly from a phase-locked loop (PLL) 212 in a clock generation portion 210. A PLL is an electronic circuit that controls an oscillator so that the oscillator maintains a constant phase angle (i.e., lock) on the frequency of a reference signal.
In the circuitry shown in FIG. 2, the clock generation portion 210 includes the PLL 212, a divider 214, and a multiplexer 216 for selecting between the fixed frequency clock signal output of the PLL 212 or a divided, or scaled down, output of the PLL 212 that is output by the divider 214. A frequency select signal (freq_sel) is used to select between these two outputs that are provided as a scalable core frequency clock 220.
The frequency select signal is also provided to a multiplexer 230 in the internal timebase portion to select between an 8 cycle input signal and a scaled down 8 cycle input signal, i.e. 8/n. In this way, if the scalable core frequency clock is scaled down, so will the number of cycles be scaled down by the same amount for generation of the internal timebase value. In other words, when the clock frequency is scaled, e.g., halved, via the freq_sel signal, the maximum value for the internal timebase counter is also halved to reflect the change.
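The effect of scaling the counter limit together with the clock may be checked arithmetically as in the following sketch. The function name `tick_rate` and the PLL frequency are hypothetical. The guard against divisors that do not divide the base limit evenly is an assumption of this sketch, reflecting that the limit must remain a whole number of at least one cycle.

```python
def tick_rate(pll_hz, divide_by, base_modulo=8):
    """Tick frequency when the clock and the counter limit are scaled together."""
    if divide_by < 1 or base_modulo % divide_by != 0:
        raise ValueError("divide_by must evenly divide the base counter limit")
    core_hz = pll_hz / divide_by          # scaled core frequency clock
    modulo = base_modulo // divide_by     # scaled counter limit, i.e. 8/n
    return core_hz / modulo


rates = [tick_rate(pll_hz=800, divide_by=n) for n in (1, 2, 4, 8)]
print(rates)

try:
    tick_rate(pll_hz=800, divide_by=16)
    scaled_past_limit = True
except ValueError:
    scaled_past_limit = False
```

For each supported divisor the tick rate is the same, because the core frequency and the counter limit shrink by the same factor.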
In the depicted example, the edge detect circuit 114 of the external timebase portion 110 runs at the fixed frequency generated by the PLL 212, which is a higher frequency than the external timebase clock signal received via the timebase input pin 112. Hence, the rising edge of the external timebase clock signal may be sampled correctly. A problem occurs, however, in that if the core clock frequency becomes lower than the tick frequency of the external timebase portion 110, then some of the ticks produced by the edge detection circuit 114 will be missed in the core frequency clock domain 180 and not added to the timebase value as they should be.
If the internal timebase method is used, as described above, one can change the modulo counter to count only to a portion, e.g., half, of the maximum value, e.g., 4 instead of 8 in the depicted example, when halving the core clock frequency. This would result in a tick being generated at a constant frequency. This option has two main limitations, however. First, one cannot scale the frequency by a larger factor than the maximum value of the modulo counter, e.g., 8. Second, due to the analog delay of the clock mesh that may vary between 0 and 3 core clock cycles, for example, depending on the core clock frequency and chip manufacturing parameters, it is not possible to change the core clock frequency exactly at the same time as the internal timebase multiplexer 230. Hence, if the microprocessor performs a number of frequency scaling operations during normal operation, as is desirable to reduce the power consumption of the microprocessor, one or more ticks will eventually be lost over time.
In other words, if the processor frequency is low such that the analog mesh delay is less than a cycle and the frequency scaling happens when the internal timebase is 0, no ticks would be lost. However, if the processor is running at a very high frequency, the analog delay on the clock mesh, represented in FIG. 2 as the analog mesh delay, amounts to one or more clock cycles. Thus, switching the freq_sel signal when the internal timebase is 0 will result in the clock on the mesh effectively slowing down only one or more cycles later and hence, the timebase counter will not count correctly. With aggressive power management in high frequency microprocessors, the clock mesh frequency may also be reduced below the external timebase update rate. As a result, the core cannot increment its timebase value fast enough and, as in the previous example, the circuit may miss “ticks.”