Modern programming languages and operating systems often support multithreading, that is, the use of threads. A thread is a separate path of execution within a single process or program running on an operating system. Various threading schemes are used, and they differ in, among other things, whether the threads are implemented in user space or in kernel space. The performance of a threaded program may also depend on the target hardware. On a multi-processor computer, the threads may be allocated to different central processing units for true simultaneous execution. On a single-processor computer, a scheduler prioritizes the various threads within a process and interleaves their instructions, which gives the appearance that the threads are running simultaneously.
Generally, threads of execution may share the memory and processing resources of the parent process. By programming with threads, developers are able to take advantage of parallel computing models without the complexities inherent, for example, in inter-process communication. Programming with threads still requires special precautions, such as synchronizing access to shared data and guarding against race conditions. Nonetheless, threads may be an optimal solution for many data processing tasks.
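The need for synchronization mentioned above can be illustrated with a minimal sketch, given here in Python purely for illustration. The function name run_counter and its parameters are hypothetical and are not drawn from any particular system. Several threads increment a counter held in memory shared with the parent process, and a lock serializes each read-modify-write so that no update is lost.

```python
import threading


def run_counter(num_threads=4, increments=10_000):
    """Increment one shared counter from several threads (hypothetical helper)."""
    counter = 0
    lock = threading.Lock()  # guards every access to the shared counter

    def worker():
        nonlocal counter
        for _ in range(increments):
            # Without the lock, the read-modify-write below could interleave
            # with another thread's update and lose increments (a race condition).
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every worker has finished
    return counter
```

With the lock in place, run_counter(4, 10_000) returns 40000 on every run. If the lock were removed, the result could fall short, depending on how the scheduler happens to interleave the threads.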
Another complexity inherent in the use of threads is determining how much processor time the threads use. When trying to improve a program's performance, it is often necessary to determine how the processor spends its time in executing code. By identifying the parts of a program (e.g., a method or a thread) that take the most time to execute, the programmer can focus optimization efforts on those parts first.
Performance monitoring tools are often used to report the processor time consumed by various parts of code. A tool of this type is referred to as a profiler. Profiling is a process in which specific information about the dynamic execution of a program is collected. Such information often includes the execution times of individual components of a program, such as functions, methods, or loops. Profilers typically work by determining the CPU time used by a process, along with other performance information, for particular code regions.
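As one concrete illustration of such a tool, the following sketch uses the cProfile and pstats modules of the Python standard library to collect per-function timing for a single call. The helper names busy_loop and profile_call are hypothetical and are introduced only for this example; other profilers expose different interfaces.

```python
import cProfile
import io
import pstats


def busy_loop(n):
    """Hypothetical workload: sum of squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func under the profiler; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)  # collects timing per function
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)  # five most expensive entries
    return result, stream.getvalue()
```

Calling profile_call(busy_loop, 100_000) returns the value computed by busy_loop together with a report listing, for each profiled function, the number of calls and the time attributed to it.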
Profilers are generally useful in characterizing performance based on function calls within a single-threaded or multi-threaded application. It is a common practice for profilers to collect data separately for each thread. However, profilers do not always provide accurate performance measurements on a thread-by-thread basis.
Some profiler implementations take the simple approach of determining the CPU time of a thread by reading the CPU clock before and after a code region of the thread is executed. The difference between the two readings is taken as the amount of time the process spent on that code region. This approach, however, fails to account for the fact that the thread may have been idle for some period of time between the beginning and the end of the code region. This idle time may result from a scheduler switching in code regions of another, concurrently running thread and only later resuming execution of the thread being measured. The thread might also be idle while it or some other task waits for input/output (I/O) to complete, for example I/O whose completion is signaled by a processor interrupt.
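The discrepancy described above can be observed with a short sketch, assuming a Python runtime and a platform on which time.thread_time() is available. The names measure_region, idle_region, and busy_region are hypothetical. The sketch takes a before-and-after clock reading around a region and, alongside it, a reading of the CPU time actually charged to the calling thread.

```python
import time


def measure_region(region):
    """Return (elapsed clock time, CPU time of the calling thread) for region()."""
    wall_start = time.perf_counter()  # clock reading before the region
    cpu_start = time.thread_time()    # CPU time charged to this thread so far
    region()
    cpu_elapsed = time.thread_time() - cpu_start
    wall_elapsed = time.perf_counter() - wall_start
    return wall_elapsed, cpu_elapsed


def idle_region():
    # Stands in for a thread that is switched out or waiting on I/O.
    time.sleep(0.2)


def busy_region():
    # Stands in for a region that keeps the processor occupied.
    total = 0
    for i in range(200_000):
        total += i * i
    return total
```

For idle_region, the before-and-after difference is about 0.2 seconds, while the CPU time charged to the thread is close to zero, so a profiler relying on the difference alone would overstate the cost of the region. For busy_region, the two figures are typically much closer to one another, although the exact values depend on the machine and its load.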