1. Technical Field
This invention relates to providing a core-centric view of hardware threads and their associated caches. More specifically, it relates to measuring and assessing a processor core from the perspective of its individual hardware threads and stall categories.
2. Description of the Prior Art
A processor core is the processing part of a central processing unit, absent the cache. The core is made up of a control unit and an arithmetic logic unit. The control unit is the hardware within the processor that performs physical data transfers between memory and a peripheral device. The arithmetic logic unit is a high-speed circuit that performs calculations and comparisons. Numerical data is transferred from memory to the arithmetic logic unit for calculation, and the results can be sent back to memory.
Multithreaded processor cores execute multiple hardware threads concurrently on a single processor core. Each hardware thread is typically presented to the operating system as a hardware entity that can execute a software process or thread. The operating system is responsible for scheduling software threads for processing by the core(s) and their hardware threads. It is known in the art for operating systems to report the utilization of each hardware thread as though it were a separate central processing unit.
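By way of illustration only, the following minimal sketch shows how hardware threads that an operating system reports as separate central processing units might be regrouped by the core they share. It assumes the comma-and-range list format that Linux exposes in /sys/devices/system/cpu/cpuN/topology/thread_siblings_list; the function names and the sample topology are hypothetical and are not taken from the described system.

```python
from collections import defaultdict


def parse_cpu_list(text):
    """Parse a CPU list such as "0,4" or "0-1" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def group_by_core(siblings):
    """Group logical CPUs (hardware threads) by the core they share.

    `siblings` maps a logical CPU id to its sibling-list string.
    Each core is keyed by the tuple of logical CPUs that belong to it.
    """
    cores = defaultdict(list)
    for cpu, text in siblings.items():
        cores[tuple(parse_cpu_list(text))].append(cpu)
    return {key: sorted(members) for key, members in cores.items()}


# Hypothetical topology: two cores, each with two hardware threads.
sample = {0: "0,4", 4: "0,4", 1: "1,5", 5: "1,5"}
cores = group_by_core(sample)
```

In this sample the operating system would report four central processing units, while the grouping shows that they reside on only two cores, each shared by two hardware threads.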
Although running multiple hardware threads tends to give a core higher total throughput than running a single hardware thread, the threads of a multithreaded core are known to interfere with one another. This interference can degrade the performance of the core and diminish the benefits of running multiple threads on it. Accordingly, there is a need to reduce conflicts among the threads and to assign tasks to them effectively and efficiently in a manner that limits interference.