Simultaneous multithreading (SMT) is a technique for improving the overall efficiency of central processing units (CPUs) with hardware multithreading. SMT permits multiple independent threads of execution to make better use of the resources provided by modern processor architectures. A thread of execution is the smallest sequence of programmed instructions that can be managed independently. Multiple threads can exist within the same process and share resources such as memory, while different processes usually do not share these resources.
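The point that threads within one process share memory can be illustrated with a minimal sketch. It uses ordinary software threads from Python's standard `threading` module, not SMT hardware threads; the names `shared_log`, `worker`, and `run_workers` are hypothetical and chosen only for this illustration.

```python
import threading

# One list object, visible to every thread in this process.
shared_log = []
# Serializes appends so the example stays deterministic in size.
lock = threading.Lock()


def worker(thread_id: int, count: int) -> None:
    """Append `count` records to the list shared by all threads."""
    for i in range(count):
        with lock:
            shared_log.append((thread_id, i))


def run_workers(num_threads: int = 4, count: int = 100) -> int:
    """Start the threads, wait for them, and return the shared list's length."""
    shared_log.clear()
    threads = [
        threading.Thread(target=worker, args=(tid, count))
        for tid in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(shared_log)


if __name__ == "__main__":
    print(run_workers())
```

Every thread writes into the same list without any copying or message passing, which is the resource sharing described above. Separate processes would each get their own copy of `shared_log` unless shared memory were set up explicitly.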
On a single processor, multithreading generally occurs by time-division multiplexing (as in multitasking): the processor switches between different threads. This context switching generally happens frequently enough that the user perceives the threads or tasks as running at the same time. On a multiprocessor (including multi-core systems), the threads or tasks actually run at the same time, with each processor or core running a particular thread or task.
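Time-division multiplexing on a single processor can be sketched as a round-robin scheduler. This is a toy model under stated assumptions: each task is a Python generator, one `next()` call stands in for one unit of work, and `task`, `round_robin`, and `time_slice` are hypothetical names rather than any operating system's API.

```python
from collections import deque
from typing import Generator, Iterable, List


def task(name: str, steps: int) -> Generator[str, None, None]:
    """A task that performs `steps` units of work, yielding a label for each."""
    for i in range(steps):
        yield f"{name}:{i}"


def round_robin(tasks: Iterable[Generator[str, None, None]],
                time_slice: int = 1) -> List[str]:
    """Run tasks on one simulated processor, switching every `time_slice` units."""
    ready = deque(tasks)
    trace: List[str] = []
    while ready:
        current = ready.popleft()
        finished = False
        for _ in range(time_slice):
            try:
                trace.append(next(current))
            except StopIteration:
                finished = True
                break
        if not finished:
            # Simulated context switch: the task goes to the back of the queue.
            ready.append(current)
    return trace


if __name__ == "__main__":
    print(round_robin([task("A", 3), task("B", 2)], time_slice=1))
```

Only one task makes progress at any instant, yet the interleaved trace shows both advancing, which is the effect a user perceives as simultaneous execution. On a multi-core system the two tasks could instead run on separate cores with no interleaving at all.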
Many applications may benefit from SMT threads. For instance, graphics shaders, web servers, transaction processing applications (TPC), scientific applications, high-performance computing (HPC) applications, and the like have many software threads and benefit from more SMT hardware threads because of high-latency operations, frequent data cache misses, or both. Further, in-order processors, and even weakly out-of-order processors, benefit from multiple SMT threads: while one thread is stalled, another can keep the execution resources busy, which yields higher throughput.
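The throughput benefit of extra hardware threads under frequent cache misses can be illustrated with a deliberately simplified model. The sketch below assumes a single-issue in-order core that runs one thread until it misses and then switches to the next ready thread; a real SMT core can issue from several threads in the same cycle, so this is an approximation of the latency-hiding effect only. The function name and all parameter values are hypothetical.

```python
def simulate_utilization(num_threads: int, compute_cycles: int,
                         miss_latency: int, total_cycles: int) -> float:
    """Fraction of cycles in which the core issues an instruction.

    Each thread repeats the same pattern: `compute_cycles` cycles of useful
    work, then a cache miss that stalls it for `miss_latency` cycles.
    """
    remaining = [compute_cycles] * num_threads  # work left before the next miss
    ready_at = [0] * num_threads                # first cycle the thread may issue
    current = 0                                 # thread that issued most recently
    busy = 0
    for cycle in range(total_cycles):
        # Prefer the current thread; otherwise take the next ready one.
        for offset in range(num_threads):
            t = (current + offset) % num_threads
            if ready_at[t] <= cycle:
                busy += 1
                remaining[t] -= 1
                if remaining[t] == 0:
                    # The thread misses in the cache and stalls.
                    ready_at[t] = cycle + 1 + miss_latency
                    remaining[t] = compute_cycles
                current = t
                break
    return busy / total_cycles


if __name__ == "__main__":
    for n in (1, 2, 5, 8):
        print(n, simulate_utilization(n, 10, 40, 5000))
```

With 10 cycles of work per 40-cycle miss, one thread keeps the core busy 20% of the time, two threads 40%, and five or more threads saturate it. Under these assumptions utilization follows min(1, N * c / (c + L)) for N threads, c compute cycles, and miss latency L.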
However, a normal x86 thread, for example, has a large amount of context associated with it, also referred to as x86 state (e.g., 16 general-purpose registers, 32 AVX3 registers, segment registers, MMX/x87 registers, control registers (CR0-CR4), debug registers, tens of model-specific registers (MSRs), or the like). The complexity of supporting all of this x86 state may prevent many processors from providing a large number of hardware SMT threads. For instance, each additional thread requires expansion of physical register files, rename tables, translation lookaside buffer (TLB) thread identifiers (IDs), scratch pads for saving register state, etc.
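A rough tally gives a sense of how much register state each additional hardware thread implies. The sketch below assumes that "AVX3" refers to 512-bit vector registers and counts only the visible 16-bit segment selectors; it leaves out control registers, debug registers, MSRs, and hidden segment descriptor state, so the result is a lower bound. The table and function names are hypothetical.

```python
# (register count, bytes per register) for part of the x86 state listed above.
ARCH_STATE = {
    "general_purpose": (16, 8),    # 16 GPRs x 64 bits
    "vector_512bit": (32, 64),     # 32 vector registers x 512 bits (assumed width)
    "x87_mmx": (8, 10),            # 8 x87/MMX registers x 80 bits
    "segment_selectors": (6, 2),   # CS, DS, ES, FS, GS, SS selectors x 16 bits
}


def state_bytes(table: dict = ARCH_STATE) -> int:
    """Bytes of register state per thread for the entries in `table`."""
    return sum(count * size for count, size in table.values())


def total_state_bytes(num_threads: int, table: dict = ARCH_STATE) -> int:
    """Register state that must be held for `num_threads` hardware threads."""
    return num_threads * state_bytes(table)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, total_state_bytes(n))
```

Even this partial count exceeds 2 KB per thread, most of it in the vector registers, and it scales linearly with the thread count. Real designs with register renaming organize physical register files differently, so the figure is an order-of-magnitude illustration rather than a hardware budget.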