In a computing environment, a pipeline is a series of functional units, or processors, which perform a task in several steps. Each processor takes inputs and produces outputs, which are stored in a buffer. One processor's output buffer is therefore the next processor's input buffer. This arrangement allows the processors in a pipeline to work in parallel, giving greater throughput than if each input had to pass through the whole pipeline before the next input could enter. The processor outputting to the buffer is usually termed the “producer” and the processor receiving input from the buffer is usually termed the “consumer”. A pipeline will generally have a plurality of, and usually many, producer/consumer processor pairs.
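The throughput benefit described above can be illustrated with a minimal cycle-by-cycle sketch. The function name `run_pipeline` and the three example stages are hypothetical and serve only as illustration; the model assumes that every stage takes exactly one cycle and that the buffers are unbounded.

```python
from collections import deque


def run_pipeline(stages, inputs):
    """Cycle-by-cycle model of a pipeline.

    buffers[i] is the input buffer of stages[i]; buffers[i + 1] is its
    output buffer, which is also the input buffer of the next stage.
    Returns the results and the number of cycles taken.
    """
    buffers = [deque(inputs)] + [deque() for _ in stages]
    cycles = 0
    while any(buffers[:-1]):
        # Visit the last stage first so that an item advances by only
        # one stage per cycle, as it would in parallel hardware.
        for i in reversed(range(len(stages))):
            if buffers[i]:
                item = buffers[i].popleft()
                buffers[i + 1].append(stages[i](item))
        cycles += 1
    return list(buffers[-1]), cycles


# Three hypothetical one-cycle stages.
stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
results, cycles = run_pipeline(stages, [1, 2, 3, 4])
print(results, cycles)
```

Under these assumptions, four inputs pass through three stages in 4 + 3 - 1 = 6 cycles, whereas passing each input through the whole pipeline before admitting the next would take 4 × 3 = 12 cycles.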
Pipelines are used for many algorithms, including particularly (but not exclusively) algorithms used in the imaging, audio/video and wireless domains. These algorithms usually have to be implemented for very high performance. Pipeline processors for high-performance operation are often implemented in hardware, and each of these hardwired processors is known as a non-programmable processor.
All processors in a pipeline system are scheduled and hardwired to work in synchrony with each other (termed synchronous scheduling) to meet the data dependencies between them. Intra-processor controllers handle the execution within each processor. A single inter-processor or pipeline controller handles the system-level control, such as starting tasks on the processors at specified times and reporting the readiness of the pipeline to accept the next task.
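As a rough illustration of what "starting tasks on the processors at specified times" could look like, the following sketch computes a fixed table of start cycles. The function `static_schedule`, the per-processor latencies and the initiation interval are all assumptions made for this example and are not taken from any particular controller design.

```python
def static_schedule(latencies, num_tasks, interval):
    """Return start[task][processor] cycle numbers for a fixed schedule.

    latencies: cycles each processor needs per task (assumed constant).
    interval:  cycles between successive tasks entering the pipeline;
               assumed to be at least max(latencies), so that every
               processor finishes one task before starting the next.
    """
    # A consumer may start only once its producer has finished, so each
    # processor's offset is the sum of the latencies upstream of it.
    offsets = [0]
    for latency in latencies[:-1]:
        offsets.append(offsets[-1] + latency)
    return [[task * interval + off for off in offsets]
            for task in range(num_tasks)]


for task, starts in enumerate(static_schedule([3, 2, 4], num_tasks=2, interval=4)):
    print(f"task {task}: start cycles {starts}")
```

Because every start time is fixed in advance, the data dependencies are met only for as long as each processor takes exactly its assumed number of cycles.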
At runtime, operations may cause some processors to stall. A stall takes a processor out of sync with the other processors, resulting in incorrect execution. Such synchronisation variations are a significant problem for pipelines.
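A small sketch can show how a single stall leads to incorrect execution under a fixed schedule. The model is a deliberate simplification and its names are hypothetical: one producer writes one value per cycle into a one-slot buffer, and one consumer reads that slot exactly one cycle later, with no handshake between them.

```python
def run_fixed_schedule(values, producer_stall_cycles=()):
    """Simulate a producer/consumer pair driven by a fixed schedule.

    The consumer reads the buffer slot on every cycle from cycle 1 on,
    whether or not the producer managed to write a new value.
    """
    slot = None
    pending = list(values)
    consumed = []
    for cycle in range(len(values) + 1):
        # The consumer fires at its scheduled time regardless of the producer.
        if cycle >= 1:
            consumed.append(slot)
        # The producer writes the next value unless it is stalled this cycle.
        if pending and cycle not in producer_stall_cycles:
            slot = pending.pop(0)
    return consumed


print(run_fixed_schedule([10, 20, 30]))        # no stall
print(run_fixed_schedule([10, 20, 30], {1}))   # producer stalled in cycle 1
```

With no stall the consumer sees 10, 20, 30. With the producer stalled in cycle 1, the consumer reads the stale value 10 a second time and never sees 30 within its scheduled window.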
One solution is for the pipeline controller to operate a common, or single, stall domain for the pipeline, in which a stall to any one processor stalls every processor. This has the effect of halting the whole pipeline while the processors are stalled and brought back into sync with each other. This approach has a severe negative impact on the throughput of the pipeline, and can drastically reduce overall performance. A single stall domain solution also requires a special “flush” command and circuitry to flush the last tasks from the pipeline.
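The cost of a single stall domain can be sketched with the same kind of simplified model: one hypothetical producer and one hypothetical consumer sharing a one-slot buffer. The function `run_single_stall_domain` and the stall cycles passed to it are assumptions for illustration, and the flush mechanism is not modelled.

```python
def run_single_stall_domain(values, stalls_by_processor):
    """Simulate a producer/consumer pair under a single stall domain.

    stalls_by_processor maps a processor name to the set of cycles in
    which that processor stalls. A stall to any one processor freezes
    both, so the shared schedule advances only on stall-free cycles.
    Returns the consumed values and the total number of cycles.
    """
    stalled_cycles = set().union(*stalls_by_processor.values())
    slot = None
    pending = list(values)
    consumed = []
    steps_needed = len(values) + 1  # schedule steps, excluding stalls
    step = 0
    cycle = 0
    while step < steps_needed:
        if cycle not in stalled_cycles:
            if step >= 1:
                consumed.append(slot)   # consumer reads the slot
            if pending:
                slot = pending.pop(0)   # producer writes the next value
            step += 1
        cycle += 1
    return consumed, cycle


print(run_single_stall_domain([10, 20, 30], {}))
print(run_single_stall_domain([10, 20, 30], {"producer": {1}, "consumer": {3}}))
```

In this sketch the data stays correct, but each processor loses a cycle for every stall anywhere in the pipeline: the stall-free run takes 4 cycles, while one stall per processor stretches the run to 6 cycles for both processors.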