Many programming languages, operating systems, and other software development environments support “threads of execution,” or, as they are more commonly called, “threads.” Threads are similar to processes, which are tasks that take turns running on the central processing unit (cpu) of a computer. The two are similar in that each represents a single sequence of instructions executed in parallel with other sequences, either by time-slicing or multiprocessing. Threading itself, however, is a technology that, when present, enables a program to be split into two or more simultaneously running tasks, and thereby generally accelerates processing by the cpu.
Conventional time-slicing, also known as multitasking, occurs when multiple processes share common processing resources, such as the cpu. At any point in time, only one of these processes is running, i.e., the cpu is actively executing instructions for that process. The operating system may choose at any given moment to elect another process for running.
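The time-slicing behavior described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `round_robin`, the fixed `quantum`, and the round-robin election policy are illustrative choices, not something the text prescribes, and a real operating system scheduler is considerably more involved.

```python
from collections import deque


def round_robin(work, quantum):
    """Simulate one cpu time-slicing among several tasks.

    work:    dict mapping a (hypothetical) task name to the time units it needs.
    quantum: maximum time units a task may run before it is switched out.
    Returns the list of (task, units_run) slices in execution order.
    """
    ready = deque(work.items())
    timeline = []
    while ready:
        name, remaining = ready.popleft()   # the scheduler elects the next task
        ran = min(quantum, remaining)       # only this one task runs right now
        timeline.append((name, ran))
        if remaining > ran:                 # unfinished work rejoins the queue
            ready.append((name, remaining - ran))
    return timeline


timeline = round_robin({"A": 3, "B": 5, "C": 2}, quantum=2)
print(timeline)
```

Each entry in the returned timeline is one slice during which exactly one task holds the cpu, which is the property the paragraph above describes.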
Multiprocessing, by contrast, is the use of multiple concurrent processes in a system as opposed to a single process at any one instant. Just as multitasking allows multiple processes to share a single cpu, multiple cpus may be used to execute multiple threads within a single process. Multiprocessing for general tasks is often fairly difficult to achieve because various programs hold internal data, known as state. Essentially, the programs are typically written in such a fashion that they assume their data is incorruptible. However, if another copy of the program is running on another processor, the two copies can interfere with each other by both attempting to read and write their state at the same time. A variety of programming techniques are used to avoid this problem, including semaphores and other checks and blocks which allow only one copy of the program to change such values at a time. Another problem is that processors often use a speed-increasing technique known as caching, in which small pools of very fast memory are associated with each processor in order to allow it to work with temporary values very quickly. This can lead to a situation in which each processor is working in a separate cache, rather than in the shared memory; changes to a processor's local cache will not be communicated to other processors until the contents of the cache are written to shared memory. This cannot be helped via programming techniques because it is invisible to the programs themselves. In this case, the problem requires additional hardware to make sure that the caches on the various processors are up to date and synchronized with one another.
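As an illustration of the semaphore technique mentioned above, the following minimal sketch guards a piece of shared state so that only one thread may change it at a time. The class `Counter`, the helper `run_workers`, and the thread and iteration counts are hypothetical names and values chosen for the example; it uses threads within one process rather than separate program copies, which is an assumption made to keep the sketch short.

```python
import threading


class Counter:
    """Shared state protected by a binary semaphore."""

    def __init__(self):
        self.value = 0
        # A semaphore initialised to 1 admits one holder at a time.
        self._guard = threading.Semaphore(1)

    def increment(self):
        # Acquire before the read-modify-write, release afterwards.
        with self._guard:
            self.value += 1


def run_workers(num_threads=4, increments=10_000):
    """Start several threads that all update the same counter."""
    counter = Counter()

    def worker():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


total = run_workers()
print(total)
```

Because every update happens while the semaphore is held, no increment is lost, and the final value equals the number of threads multiplied by the increments per thread.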
With the introduction of virtual memory it became useful to distinguish between multitasking of processes and of threads. Tasks which share the same virtual memory space are called threads. Threads are described as lightweight because switching between threads does not involve changing the virtual memory context. Processes are distinguished by the fact that each has its own virtual memory space, so that it appears to have the entire memory to itself, and can contain multiple threads running in that memory. Operating system functions are typically mapped into each virtual address space, and interrupt handling typically runs in whichever memory context is in place when the interrupt occurs, so programs are still vulnerable to malfunctioning system code.
A common use of threads is to have one thread attend to the graphical user interface while others perform a long calculation in the background. As a result, the application responds more readily to user interaction. Another advantage of a multi-threaded program is that it can operate faster on computer systems that have multiple cpus, or across a cluster of machines.
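The pattern of one thread staying responsive while another does a long calculation might look like the sketch below. No real graphical toolkit is used: the polling loop merely stands in for a user-interface event loop, and the names `long_calculation` and `run_with_responsive_loop` are hypothetical. Note also that in the standard CPython interpreter the global interpreter lock limits the parallel speedup of pure-Python computation, so this example demonstrates responsiveness rather than acceleration.

```python
import threading
import time


def long_calculation(n, result):
    """Background work: sum of squares below n, stored in a shared dict."""
    result["sum"] = sum(i * i for i in range(n))


def run_with_responsive_loop(n=200_000):
    """Run the calculation in a worker thread while the caller keeps looping."""
    result = {}
    worker = threading.Thread(target=long_calculation, args=(n, result))
    worker.start()

    ticks = 0
    while worker.is_alive():
        ticks += 1            # stand-in for handling one user-interface event
        time.sleep(0.001)     # yield the cpu briefly between events

    worker.join()
    return result["sum"], ticks


total, ticks = run_with_responsive_loop()
print(total, ticks)
```

The main thread is never blocked for the full duration of the calculation; it only checks whether the worker has finished between simulated events.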
Operating systems generally implement threads in one of two ways: preemptive multithreading or cooperative multithreading. Preemptive multithreading is generally considered the superior implementation, as it allows the operating system to determine when a context switch should occur. Cooperative multithreading, on the other hand, relies on the threads themselves to relinquish control once they are at a stopping point. This can create problems if a thread is waiting for a resource to become available. The disadvantage of preemptive multithreading is that the system may make a context switch at an inappropriate time, causing priority inversion or other undesirable effects which may be avoided by cooperative multithreading.
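Cooperative multithreading can be sketched with Python generators, where each `yield` marks a stopping point at which a task voluntarily relinquishes control. This is an analogy built in user code, not how any particular operating system implements threads; the names `task` and `run_cooperative` are hypothetical.

```python
from collections import deque


def task(name, steps, log):
    """A cooperative task that records its progress and then yields."""
    for i in range(steps):
        log.append(f"{name}{i}")
        yield                   # stopping point: hand control back


def run_cooperative(tasks):
    """Resume each task in turn until every task has finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)           # runs until the task's next yield
            ready.append(current)   # still has work: back of the queue
        except StopIteration:
            pass                    # the task finished; drop it


log = []
run_cooperative([task("A", 2, log), task("B", 3, log)])
print(log)
```

A task that never reaches a `yield` would keep the scheduler from ever running the others, which mirrors the problem noted above for a thread that does not relinquish control.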
Hardware support for software threads is provided by simultaneous multithreading (SMT). SMT technology enables multi-threaded software applications to execute threads in parallel. This level of threading technology has never been seen before in a general-purpose microprocessor. Internet, e-business, and enterprise software applications continue to put higher demands on processors. In the past, to improve performance, threading was enabled in software by splitting instructions into multiple streams so that multiple processors could act upon them. Today, with SMT technology, processor-level threading can be utilized, which offers more efficient use of processor resources for greater parallelism and improved performance on today's multi-threaded software.
SMT technology provides thread-level parallelism (TLP) on each processor, resulting in increased utilization of processor execution resources. This increased utilization in turn yields higher processing throughput, minimized latency, and minimized power consumption. SMT technology also permits multiple threads of software applications to run simultaneously on one processor. This is achieved by duplicating the architectural state on each processor, while sharing one set of processor execution resources. SMT technology also delivers faster response times for multi-tasking workload environments. By allowing the processor to use on-die resources that would otherwise have been idle, SMT technology provides a performance boost on multi-threading and multi-tasking operations.
This technology is largely invisible to the platform. In fact, many applications are already multi-threaded and will automatically benefit from this technology. Multi-threaded applications take full advantage of the increased performance that SMT technology has to offer, allowing users to see immediate performance gains when multitasking. Today's multi-processing aware software is also compatible with SMT technology enabled platforms, but further performance gains can be realized by specifically tuning software for SMT technology. This technology complements traditional multi-processing by providing additional headroom for future software optimizations and business growth.
Despite the advantages in processor performance often obtained through SMT, problems remain. Recent state-of-the-art reports indicate that enabling SMT is not always beneficial. In fact, some applications perform worse with SMT enabled; performance is known to drop by as much as half. What is needed, therefore, are methods, systems, and media for determining beneficial enablement of SMT for various workloads, and for doing so autonomically in an effort to remove reliance on a system administrator.