The present invention relates to the accounting of resources in a microprocessor, particularly to a microprocessor supporting simultaneous multi-threading (SMT) which is used as the central processing unit (CPU) in a computer system.
SMT is the ability of a single physical microprocessor to concurrently dispatch instructions from more than one hardware thread. For example, two hardware threads can run on one physical processor at the same time. SMT is a good choice when overall throughput is more important than the throughput on an individual thread of execution. For example, web servers and database servers are good candidates for being executed on servers with CPUs supporting SMT.
The operating systems available on some server-class computing hardware such as IBM's System p and System i offer exact CPU accounting based on the ticks of a timebase register. This feature allows charging accurately for the CPU time used, a feature widely used by data centers and computing utilities. A special scenario is the case when performance throttling is used in order to decrease the maximum microprocessor frequency below the nominal microprocessor frequency. This allows selling or leasing computer systems in different price ranges without any changes to the actual computer system hardware itself.
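As an illustration of tick-based accounting, the following minimal sketch converts a difference between two timebase readings into CPU seconds. The function name and the 512 MHz timebase frequency are assumptions chosen for the example and are not taken from the systems named above.

```python
# Hypothetical timebase frequency in ticks per second (an assumption for
# illustration only; real hardware reports its own value).
TIMEBASE_HZ = 512_000_000


def cpu_seconds(start_ticks: int, end_ticks: int, timebase_hz: int = TIMEBASE_HZ) -> float:
    """Return the CPU time, in seconds, between two timebase readings."""
    if end_ticks < start_ticks:
        raise ValueError("end_ticks must not precede start_ticks")
    return (end_ticks - start_ticks) / timebase_hz


# A job that ran for 1,024,000,000 ticks is charged for two seconds.
charged = cpu_seconds(1_000, 1_024_001_000)
```

On a processor without SMT this difference can be charged directly to the running job, because only one thread of execution uses the processor during the interval.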
With the introduction of SMT to CPU architectures, simple use of the timekeeping hardware is no longer sufficient because the SMT mechanism allocates processing resources to competing hardware threads on a very fine-grained basis, for example, at each instruction dispatch clock cycle in the CPU.
The IBM POWER5 processor architecture introduced a special-purpose register (SPR) per hardware thread for tracking the CPU time allocated to each hardware thread. The exploitation of this SPR by operating systems is described in P. Mackerras et al. “Operating system exploitation of the POWER5 system”, IBM J. RES. & DEV., Vol. 49, No. 4/5, 2005, pp. 533-539. This SPR is called PURR, the Processor Utilization of Resources Register. There is one PURR for each hardware thread, and it contains data specific to that particular thread. The PURR is defined to be 64 bits long. It is writeable in privileged state with the so-called hypervisor bit on (HV=1), readable in privileged state and inaccessible in problem state. This definition allows a hypervisor to virtualize the PURR for the operating systems by saving and restoring it on context switch. For example, this is done by PHYP (IBM POWER Hypervisor), the IBM standard hypervisor for POWER5. Regular operating systems such as IBM AIX and Linux can only read the PURR.
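The save-and-restore virtualization described above can be sketched with a small software model. This is a simplified illustration under stated assumptions, not the PHYP implementation: the class and function names (HardwareThread, Guest, context_switch) are hypothetical, and the model advances the register by one count per cycle.

```python
MASK64 = (1 << 64) - 1  # the PURR is defined to be 64 bits long


class HardwareThread:
    """Hypothetical model of one hardware thread and its PURR."""

    def __init__(self) -> None:
        self.purr = 0

    def run(self, cycles: int) -> None:
        # Simplified: the register advances once per cycle charged to this thread.
        self.purr = (self.purr + cycles) & MASK64


class Guest:
    """Hypothetical operating system image managed by a hypervisor."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.saved_purr = 0


def context_switch(hw: HardwareThread, outgoing, incoming: Guest) -> None:
    """Model of a hypervisor (HV=1) swapping guests on a hardware thread."""
    if outgoing is not None:
        outgoing.saved_purr = hw.purr  # save the value seen by the outgoing guest
    hw.purr = incoming.saved_purr      # writing the PURR requires HV=1


hw = HardwareThread()
guest_a, guest_b = Guest("A"), Guest("B")

context_switch(hw, None, guest_a)
hw.run(100)                       # guest A accumulates 100 counts
context_switch(hw, guest_a, guest_b)
hw.run(40)                        # guest B accumulates 40 counts
context_switch(hw, guest_b, guest_a)
purr_seen_by_a = hw.purr          # guest A reads only its own usage
```

In this model each guest observes a PURR value that reflects only its own usage, which is the effect the write restriction to hypervisor state makes possible.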
The hardware increments for PURRs are based on how each thread is using the resources of the processor, including the dispatch clock cycles that are allocated to each thread. For a clock cycle in which no instructions are dispatched, the PURR of the thread that last dispatched an instruction is incremented. The register advances automatically so that the operating system can always get the current up-to-date value.
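The increment rule above can be illustrated with a short simulation. This is a minimal sketch under simplifying assumptions: each clock cycle is either idle or allocated to exactly one thread, each charged cycle adds one count, and idle cycles that occur before any dispatch are charged to a hypothetical initial owner (thread 0). The function name simulate_purr and the trace format are invented for the example.

```python
from typing import Iterable, List, Optional


def simulate_purr(
    cycles: Iterable[Optional[int]],
    num_threads: int = 2,
    initial_owner: int = 0,  # assumption: owner of idle cycles before any dispatch
) -> List[int]:
    """Return per-thread PURR counts for a trace of dispatch cycles.

    Each trace entry is the id of the thread that dispatched in that cycle,
    or None for a cycle in which no instructions were dispatched.
    """
    purr = [0] * num_threads
    last = initial_owner
    for dispatched in cycles:
        if dispatched is not None:
            last = dispatched  # this thread now owns any following idle cycles
        purr[last] += 1        # idle cycles are charged to the last dispatcher
    return purr


trace = [0, 1, None, None, 0, 0, None, 1]
counts = simulate_purr(trace)  # [4, 4]
```

Because every cycle is charged to exactly one thread, the per-thread counts in this model always sum to the total number of elapsed cycles.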
Many new-generation computing systems require active power and thermal management in order to function correctly, maintain their stability, reduce operating costs and extract maximum performance. Power consumption and the associated heat generation of modern microprocessors are key design issues in the development process of computer systems, as microprocessors are major consumers of power and sources of heat in computer systems. Many mechanisms exist in contemporary microprocessors to vary their power consumption in order to counteract thermal stress and overtemperature caused by increased power consumption and heat dissipation. Often these techniques alter the operating characteristics of the microprocessors in the computer system to control power and temperature.
Many such methods explicitly decrease or increase the microprocessor clock frequency and voltage, hence the name dynamic voltage and frequency scaling (DVFS) methods (also known as slewing). Other known methods are pipeline throttling and IPC (Instructions Per Cycle) throttling (also known as IPC clipping or limiting). Pipeline throttling divides the available clock cycles into windows with a fixed number of hold or dead clock cycles. A microprocessor core can throttle by limiting the