Computing systems and microprocessors frequently support multiprocessing, for example, in the form of multiple processors, or multiple cores within a processor, or multiple software processes or threads (historically related to co-routines) running on a processor core, or in various combinations of the above.
In modern microprocessors, many techniques are used to increase performance. Pipelining is a technique for exploiting parallelism between different instructions that have similar stages of execution. These stages are typically referred to as, for example, instruction-fetch, decode, operand-read, execute, and write-back. By performing work for multiple pipeline stages in parallel for a sequence of instructions, the effective machine cycle time may be reduced and parallelism between the stages of instructions in the sequence may be exploited. In some modern microprocessors, these stages may also be divided into increasingly smaller time slices to further reduce the effective machine cycle time.
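By way of illustration only, the cycle-count benefit of pipelining may be sketched in Python. The sketch rests on simplifying assumptions that are not taken from the description above: every stage takes exactly one cycle, no stalls or hazards occur, and the function names are hypothetical.

```python
# Idealized five-stage pipeline: one cycle per stage, no stalls (assumption).
STAGES = ["instruction-fetch", "decode", "operand-read", "execute", "write-back"]


def unpipelined_cycles(num_instructions, num_stages=len(STAGES)):
    """Each instruction passes through every stage before the next one starts."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions, num_stages=len(STAGES)):
    """The first instruction fills the pipeline; each later one completes a cycle after."""
    if num_instructions == 0:
        return 0
    return num_stages + num_instructions - 1


def pipeline_schedule(num_instructions):
    """Return a list indexed by cycle; each entry maps stage name -> instruction index."""
    schedule = []
    for cycle in range(pipelined_cycles(num_instructions)):
        occupied = {}
        for stage_index, stage in enumerate(STAGES):
            # Instruction i enters stage s in cycle i + s.
            instruction = cycle - stage_index
            if 0 <= instruction < num_instructions:
                occupied[stage] = instruction
        schedule.append(occupied)
    return schedule
```

Under these assumptions, ten instructions take 50 cycles without pipelining and 14 cycles with it, and from the fifth cycle onward all five stages are occupied by different instructions at once.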
Branch prediction is another technique used to increase performance. When a branch instruction occurs in a sequence of instructions, the outcome of the branch decision may not be known until the branch instruction is executed in one of the later pipeline stages. Thus bubbles may be introduced into the pipeline until it is determined which branch target instructions need to be fetched. Rather than waiting until the outcome of the branch decision is known, the branch may be predicted as taken or not taken, and instructions of the predicted target may be fetched from memory, thus reducing bubbles in the pipeline.
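The effect of prediction on pipeline bubbles may be illustrated with a rough Python sketch. It assumes a fixed static prediction (always taken or always not taken), a hypothetical parameter resolve_delay giving the number of fetch cycles that pass before a branch outcome is known, and a misprediction cost equal to that same delay. None of these specifics come from the description above.

```python
def bubbles_without_prediction(outcomes, resolve_delay):
    """Fetch waits for every branch to resolve, so each branch costs resolve_delay bubbles."""
    return len(outcomes) * resolve_delay


def bubbles_with_prediction(outcomes, resolve_delay, predict_taken=True):
    """Only mispredicted branches cost bubbles (assumed equal to resolve_delay each)."""
    mispredictions = sum(1 for taken in outcomes if taken != predict_taken)
    return mispredictions * resolve_delay


# Hypothetical loop branch: taken nine times, then falls through once.
loop_outcomes = [True] * 9 + [False]
```

With resolve_delay set to 3, waiting on each of the ten branches in loop_outcomes costs 30 bubbles in this model, while predicting taken costs 3, incurred by the single mispredicted exit.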
The technique of executing multiple software processes or threads on a microprocessor may also be used to reduce the occurrence of bubbles in a pipeline. For example, when an instruction cache miss occurs for one particular execution thread, instructions from another execution thread may be fetched to fill the pipeline bubbles that would otherwise have resulted from waiting for the missing cache line to be retrieved from external memory.
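As a minimal sketch of this effect, the following Python model fetches at most one instruction per cycle and switches to another thread when the current thread misses the instruction cache. The fixed-priority thread selection, the uniform miss_latency, and the function name are assumptions made for illustration, not features described above.

```python
def simulate_fetch(threads, miss_latency):
    """Simulate a single fetch stage shared by several threads.

    threads: one list per thread; each element is True if fetching that
             instruction misses the instruction cache, False if it hits.
    Returns (total_cycles, bubble_cycles).
    """
    count = len(threads)
    next_index = [0] * count                   # next instruction to fetch, per thread
    ready_at = [0] * count                     # cycle at which a waiting thread may fetch again
    serviced = [set() for _ in range(count)]   # misses whose line fill has been started
    cycle = 0
    bubbles = 0
    while any(next_index[t] < len(threads[t]) for t in range(count)):
        fetched = False
        for t in range(count):                 # fixed priority: lowest thread index first
            i = next_index[t]
            if i >= len(threads[t]) or ready_at[t] > cycle:
                continue                       # thread finished or still waiting on a miss
            if threads[t][i] and i not in serviced[t]:
                # Cache miss: start the line fill and try another thread this cycle.
                serviced[t].add(i)
                ready_at[t] = cycle + miss_latency
                continue
            next_index[t] += 1
            fetched = True
            break
        if not fetched:
            bubbles += 1                       # no thread could supply an instruction
        cycle += 1
    return cycle, bubbles


thread_a = [False, False, True, False]         # third fetch misses the cache
thread_b = [False, False, False]               # all fetches hit
```

With miss_latency set to 3, thread_a alone takes 7 cycles including 3 bubbles. When thread_b shares the fetch stage, its three instructions occupy the cycles that would otherwise be bubbles, so both threads finish in the same 7 cycles with none wasted.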
On the other hand, conditions such as the exhaustion of some particular type of internal microprocessor resources may cause one or more of the pipeline stages to stall. While one execution thread is stalled in the pipeline, progress of other threads in the pipeline may also be blocked, thus reducing the effectiveness of executing multiple threads on a pipelined microprocessor.
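The blocking effect may be illustrated with a small Python sketch of a single in-order pipeline stage shared by several threads. The model is an assumption made for illustration: instructions leave the stage strictly in queue order, and each entry carries a hypothetical stall_cycles value standing in for the time it waits on an exhausted resource.

```python
def cross_thread_delay(queue):
    """Return, per thread, the most cycles any of its instructions waited
    behind stalls caused by instructions of *other* threads.

    queue: list of (thread_id, stall_cycles) in the order the instructions
           pass through the shared in-order stage.
    """
    delay = {}
    total_stall_ahead = 0      # stall cycles of all instructions ahead in the queue
    own_stall_ahead = {}       # portion of that total caused by the same thread
    for thread_id, stall_cycles in queue:
        foreign = total_stall_ahead - own_stall_ahead.get(thread_id, 0)
        delay[thread_id] = max(delay.get(thread_id, 0), foreign)
        total_stall_ahead += stall_cycles
        own_stall_ahead[thread_id] = own_stall_ahead.get(thread_id, 0) + stall_cycles
    return delay


# Thread "A" stalls for four cycles; thread "B" never stalls on its own.
shared_queue = [("A", 0), ("B", 0), ("A", 4), ("B", 0), ("B", 0)]
```

For shared_queue, thread B is delayed by four cycles in this model even though none of its own instructions stall, because its later instructions sit behind the stalled instruction of thread A.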