Modern microprocessors support execution of multiple software threads within the processor at the same time. As an example, some processors may allow two software threads to use the same processor pipeline by interleaving instructions or micro-operations (μops) in the pipeline stages. Some processors may have their pipeline architecture divided into several sub-pipelines, each associated with a given task, such as instruction decode, allocation, and so forth.
In some architectures, one or more such sub-pipelines may be stalling pipelines: if a given instruction or μop needs a particular resource or resources, it may stall in a given pipestage of the sub-pipeline until the needed resource becomes available. When the pipeline stalls, other instructions or μops behind the stalled one are also prevented from making forward progress. Accordingly, in some architectures, an entire sub-pipeline may be replicated from a beginning buffer to an ending buffer, along with all pipestages therebetween. However, such replication consumes significant hardware.
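The blocking effect of a stall in a shared sub-pipeline can be illustrated with a small simulation. The following is only a minimal sketch under simplifying assumptions, not a model of any particular processor: the sub-pipeline is reduced to a single in-order queue, one μop may leave per cycle, and the names `Uop`, `interleave`, `run_stalling_pipeline`, and the resource `resource_x` are hypothetical and chosen purely for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Uop:
    thread: int                  # software thread that issued this uop
    name: str                    # label used in the results
    needs: Optional[str] = None  # hypothetical resource name, or None


def interleave(thread0, thread1):
    """Alternate uops from two threads (equal-length lists assumed)."""
    merged = []
    for a, b in zip(thread0, thread1):
        merged.extend([a, b])
    return merged


def run_stalling_pipeline(uops, resource_ready_cycle, max_cycles=100):
    """Return {uop name: cycle in which it left the sub-pipeline}.

    resource_ready_cycle maps a resource name to the first cycle in
    which that resource is available (an assumption of this sketch).
    """
    queue = deque(uops)
    done = {}
    cycle = 0
    while queue and cycle < max_cycles:
        head = queue[0]
        ready_at = resource_ready_cycle.get(head.needs, 0) if head.needs else 0
        if cycle >= ready_at:
            done[head.name] = cycle
            queue.popleft()
        # Otherwise the head stalls, and every uop behind it waits too,
        # regardless of which thread it belongs to.
        cycle += 1
    return done


# Thread 0's first uop needs a resource; thread 1's uops need nothing.
t0 = [Uop(0, "A0", needs="resource_x"), Uop(0, "A1")]
t1 = [Uop(1, "B0"), Uop(1, "B1")]

stalled = run_stalling_pipeline(interleave(t0, t1), {"resource_x": 5})
free = run_stalling_pipeline(interleave(t0, t1), {"resource_x": 0})

print(stalled["B0"], free["B0"])  # prints: 6 1
```

In this sketch, μop `B0` of the second thread requires no resource, yet it leaves the sub-pipeline in cycle 6 rather than cycle 1 because it sits behind the stalled μop `A0` of the first thread.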