Dynamic voltage scaling (DVS) is a recent low-power technology for adjusting the processor speed according to the workload. When the system workload is low, the processor may operate at a lower voltage and clock frequency to save power. The goal of the DVS mechanism is to reduce total energy consumption by lowering the operating speed while still satisfying the performance demands.
FIG. 1 shows a schematic view of an exemplary DVS. In FIG. 1, the upper figure is a curve of the workload vs. time. According to the workload curve, the operating system (OS) scales the CPU speed. The lower figure is a schematic view of the CPU operating at different voltages and frequencies according to the workload.
The realization of a DVS system may be divided into two parts. The first part is the circuit technology to dynamically scale the voltage and operating frequency, including a glitch-less clock generator, a phase-locked loop (PLL), and a closed-loop voltage adjustment circuit. The second part is to match the performance setting of DVS by determining the CPU operating speed according to the workload. The performance setting algorithms of the second part may be divided into three types. The first type determines the performance setting according to the usage context. The second type sets performance according to the task deadlines of a real-time kernel. The third type monitors past utilization to set the processor speed.
The LongRun technology in the Transmeta Crusoe and the ARM intelligent energy management technology both use the third type, monitoring past utilization of the processor. This method records the past utilization of the processor as the basis for scaling the CPU operating frequency. Algorithms of this type are complicated and consume considerable computational resources, and are thus not suitable for wireless sensor network (WSN) devices.
Many algorithms for DVS performance setting have been proposed. However, the proposed algorithms may consume a long run time and require system resources that usually do not exist in a WSN-node. For a WSN-node with limited resources, the current DVS technology is not suitable because the WSN-node usually has only a simple micro-controller unit (MCU) and a small amount of memory, and does not even include a complete OS.
U.S. Pat. No. 7,131,015 discloses a performance setting method for DVS proposed by ARM. The performance setting method uses the OS to detect a series of related events during execution, called an episode, and predicts the performance factor (PF) required for executing the episode according to the historical record of the PF required by past executions of that episode.
The performance factor means the ratio of the current execution speed to the highest speed. For example, if the highest speed of the CPU is 100 MHz and the current clock rate of the CPU is 80 MHz, the PF is 0.8.
FIG. 2 shows an example illustrating the occurrence of an episode 200. As shown in FIG. 2, a user activates a ghostview window to read a postscript file. This event triggers a series of related events, including an OS system call to access the attached file, waking up a ghostview program to parse the attached file and render the edited document, and then activating the X-window server process to display the ghostview window. This series of events is an episode.
The performance setting method for DVS by ARM must modify the OS, and must intercept system calls to dynamically detect episodes as the targets of the performance setting. The method calculates the required PF for past episodes, and then uses the historical record to predict the PF required by future executions of the episode. The ARM method calculates the required PF after each execution of an episode. The equation for calculating the performance factor PF_j is as follows:
    PF_j = (Time_full-speed − Time_idle) / (PerceptionThreshold − Time_idle)

where the variable PerceptionThreshold may be viewed as the deadline for finishing the episode. In this example, the variable PerceptionThreshold is set as 0.5 ms. When an episode is executed again, the system predicts the required PF_prediction according to the PF_j of the last n executions and the required execution time Time_j at full speed. The prediction equation is:
    PF_prediction = ( Σ_{j=1..n} PF_j × Time_j ) / ( Σ_{j=1..n} Time_j )
In this example, the performance setting method for DVS by ARM targets episodes whose required execution time at full CPU speed is at least 20 ms, and detects these episodes during execution. The calculation of the PF requires a floating-point adder, a multiplier, and a divider. In other words, it requires a large amount of hardware and consumes considerable computational resources, and is not suitable for a WSN-node with only limited resources.
R. Xu, et al. published "Minimizing expected energy in real-time embedded system" in ACM International Conference on Embedded Software 2005, disclosing a theory for minimizing the expected energy of a real-time embedded system. In the offline stage, a profiling approach is used to collect parameters W_i and P_i(x) for each task T_i, where parameter W_i is the worst-case execution cycle count of T_i, i.e., the maximum number of execution cycles that task T_i may encounter, and parameter P_i(x) is the probability that task T_i executes x cycles. According to parameters W_i and P_i(x), a mathematical programming approach is used to solve for the control parameter β_i that controls the execution of task T_i at run time.
FIG. 3 shows an exemplary schematic view of the task model of the theory. The task model is designed for repeated periodic tasks. In each period, a series of tasks {T_1, T_2, ..., T_n} are executed sequentially, and must be completed before the deadline D.
FIG. 4 shows a schematic view of an exemplary operation of the task model. When executing a task T_i, the time allocated to T_i is β_i*D′, i.e., the fraction β_i of the remaining time D′ before the deadline. The required clock frequency f_i is set to guarantee that, even if the worst-case execution cycle count W_i is encountered, task T_i can still be completed within β_i*D′. The clock frequency f_i is set as follows:
    f_i = W_i / (β_i × D′)