A multi-core processor architecture is implemented by a single processor that plugs directly into a single processor socket, and that single processor will have one or more “processor cores”. Those skilled in the art also refer to processor cores as “CPU cores”. The operating system perceives each processor core as a discrete logical processor. A multi-core processor can perform more work within a given clock cycle because the computational work is spread across the multiple processor cores.
Hardware threads are the one or more computational objects that share the resources of a core but architecturally look like a core from an application program's viewpoint. As noted above, a core is the one or more computational engines in a processor. Hardware multithreading (also known as HyperThreading) is a technology that allows a processor core to act like two or more separate “logical processors” or “computational objects” to the operating system and the application programs that use the processor core. In other words, when performing the multithreading process, a processor core executes, for example, two threads (streams) of instructions sent by the operating system, and the processor core appears to be two separate logical processors to the operating system. The processor core can perform more work during each clock cycle by executing multiple hardware threads. Each hardware thread typically has its own thread state, registers, stack pointer, and program counter.
With a multithreaded processor core, the ability to accurately measure the processor core utilization versus the logical processor utilization is hampered or deficient. This problem applies whether the multithreaded processor architecture provides a shared Interval Timers Counter (ITC) or dedicated per hardware thread ITCs. The ITCs provide a time interval for counting the processor cycles (CPU execution time) that are consumed by a hardware thread. For example, in a multithreaded processor core with two sibling hardware threads, the measured utilization for the multithreaded core may be at 100% utilization, with one hardware thread utilizing 100% of the processor cycles (i.e., this hardware thread does not give up its processing cycles to the other hardware thread by issuing yield operations such as hint@pause and PAL_HALT_LIGHT) and the second hardware thread being idle. Since the second hardware thread is not fully utilized, the total core usage and throughput are not maximized. Additional complexity is also introduced when accounting for the hardware thread scheduling yield operations. For example, as discussed in commonly-owned U.S. patent application Ser. No. 11/796,511 (U.S. Patent Publication 2008/0271027), by Scott J. Norton and Hyun Kim, entitled “FAIR SHARE SCHEDULING WITH HARDWARE MULTITHREADING”, which is hereby fully incorporated herein by reference, the secondary hardware thread will execute a yield operation if a task in a fair share group is not found for execution by the secondary hardware thread. The yield operations by the secondary hardware thread result in a decrease of logical processor utilization.
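The distinction between the two measurements in the example above can be illustrated with a minimal sketch. It is not a method taken from this document or from the referenced application. The record type `ThreadSample`, the function `utilization_report`, and the cycle counts are hypothetical, and the sketch assumes that sibling hardware threads time-share the cycles of one core, so that their busy cycles never overlap within a sampling interval.

```python
from dataclasses import dataclass


@dataclass
class ThreadSample:
    """Cycles one hardware thread consumed in a sampling interval (hypothetical record)."""
    name: str
    busy_cycles: int  # cycles spent executing, excluding cycles given up by yielding


def utilization_report(samples, interval_cycles):
    """Return (core_utilization, {thread name: utilization}) as fractions in [0, 1].

    Assumption: sibling threads time-share one core, so their busy cycles
    do not overlap and together cannot exceed interval_cycles.
    """
    if interval_cycles <= 0:
        raise ValueError("interval_cycles must be positive")
    per_thread = {s.name: s.busy_cycles / interval_cycles for s in samples}
    # The core is busy whenever any sibling thread is consuming its cycles.
    core = min(1.0, sum(s.busy_cycles for s in samples) / interval_cycles)
    return core, per_thread


# The scenario described above: one thread never yields, its sibling is idle.
core, threads = utilization_report(
    [ThreadSample("lp0", 1_000_000), ThreadSample("lp1", 0)],
    interval_cycles=1_000_000,
)
print(core, threads)  # 1.0 {'lp0': 1.0, 'lp1': 0.0}
```

Under these assumptions the core reports 100% utilization while one of its two logical processors reports 0%, which is the situation that a core-level measurement alone cannot reveal.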
For capacity planning purposes and for load distribution algorithms to work properly, it is important to accurately measure both the processor core utilization and the logical processor utilization (i.e., hardware thread utilization), so that an idle logical processor can be distinguished. It is possible for the processor core to be 100% utilized while one or more of its logical processors are idle, as discussed above. However, prior methods do not provide an accurate measurement of core utilization versus logical processor utilization in a multithreaded processor core.
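As a hedged illustration of why a load distribution algorithm needs per-logical-processor figures, the following sketch lists idle logical processors from a table of utilizations. The function `idle_logical_processors`, the threshold value, and the sample data are assumptions made for illustration only; they do not describe any particular scheduler.

```python
def idle_logical_processors(cores, idle_threshold=0.05):
    """List (core, logical processor) pairs at or below the idle threshold.

    `cores` maps a core name to {logical processor name: utilization},
    with utilization given as a fraction of the sampling interval.
    """
    return [
        (core_name, lp_name)
        for core_name, lps in cores.items()
        for lp_name, utilization in lps.items()
        if utilization <= idle_threshold
    ]


# Hypothetical measurements: both cores are 100% utilized at the core level.
cores = {
    "core0": {"lp0": 1.0, "lp1": 0.0},  # one thread never yields, sibling is idle
    "core1": {"lp2": 0.6, "lp3": 0.4},  # both siblings are doing work
}
print(idle_logical_processors(cores))  # [('core0', 'lp1')]
```

A core-level metric would rank the two cores in this sample as equally loaded, whereas the per-logical-processor view identifies `lp1` as available for work.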
Therefore, the current technology is limited in its capabilities and suffers from at least the above constraints and deficiencies.