A great deal of research and system architecture design effort is directed at increasing data throughput within computer systems. Technologies such as data pipelining, out-of-order execution, and the like enable advanced processor architectures to run at significantly higher clock rates and achieve world-class performance. Furthermore, this research, as well as architecture redesign, has enabled the mobile market for laptop computers, hand-held devices, personal digital assistants (PDAs), and the like.
Unfortunately, when no other power source is available, the run-time of such mobile platforms is limited by the life of the battery they use. Depending on the complexity of the mobile platform, power resources from an attached battery may be depleted within a relatively short amount of time. Furthermore, inclusion of technologies such as data pipelining, out-of-order execution, and the like within a mobile platform generally results in the consumption of inordinate amounts of power during execution. Hence, high-performance mobile platforms may not provide a user with a sufficient amount of mobile operation time.
Current Intel® Architecture (IA) Processor Families (IA-32 and IA-64) provide various performance monitors that record information such as cache misses, branch mispredictions, retired instructions, and the like, with very little overhead to the executing program. Compilers can also install operating system drivers to record various performance monitor information. In addition, the performance monitoring information gathered over a period of typical use can be fed into the next program compilation to speed up the code. In the past, performance monitors have helped both programmers and compilers refine generated program code without resorting to traditional probing code, which causes substantial overhead or alters program characteristics enough to render the measured statistics unusable.
Unfortunately, in the area of low-power programming, there are no performance monitors for pinpointing the portions of an application program that consume more power than the rest of the program. Conventional compilers cannot collect power consumption information for a processor without help from the processor. Hence, without adequate tools, researchers often rely on some low-power principles in order to promote their programming or computing strategies as requiring low power. Such practices often present inaccurate accounts of what really happens in the processor. Researchers often correlate low power with performance. Consequently, most performance-enhancing operations that achieve the same throughput in less time are erroneously labeled as low-power technologies.
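The distinction between speed and power can be illustrated with simple arithmetic: energy drawn from a battery is average power multiplied by run time, so a faster run saves energy only if its average power does not rise by more than the time saved. The sketch below is a minimal illustration of that relationship; the wattages and run times are hypothetical figures chosen for the example, not measurements from any processor.

```python
def energy_joules(avg_power_watts: float, seconds: float) -> float:
    """Energy drawn from the battery: average power multiplied by run time."""
    return avg_power_watts * seconds


# Hypothetical figures, chosen only for illustration (not measured data).
baseline_power_w, baseline_time_s = 2.0, 10.0    # slower, lower-power run
optimized_power_w, optimized_time_s = 5.0, 5.0   # faster, higher-power run

baseline_energy = energy_joules(baseline_power_w, baseline_time_s)
optimized_energy = energy_joules(optimized_power_w, optimized_time_s)

speedup = baseline_time_s / optimized_time_s
print(f"speedup: {speedup:.1f}x")
print(f"baseline energy: {baseline_energy:.1f} J")
print(f"optimized energy: {optimized_energy:.1f} J")
```

With these assumed numbers the optimized run finishes twice as fast yet draws more energy in total (25 J versus 20 J) and more than twice the average power, so labeling it "low power" on the basis of speed alone would be misleading. Only a measurement of actual power consumption can settle the question.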