Complex digital processors conventionally include one or more processing pipelines in which data is processed at an architecturally appropriate clock speed. A processing pipeline may implement certain operations over several pipeline stages to beneficially achieve relatively high clock speeds and processing throughput. A given pipeline stage may need to stall inbound data if the pipeline stage is not ready to receive the inbound data. The same pipeline stage may need to stall itself if a subsequent pipeline stage is not ready to receive new data. Such stalling behavior may arise for many reasons. For example, a pipeline may need to stall while waiting to access a shared resource, such as an external memory.
A “Ready” signal and a “Valid” signal conventionally implement a flow-control protocol for data being transmitted from a source pipeline stage to a destination pipeline stage. The data is transmitted as a separate signal from the Ready signal and the Valid signal. The data is allowed to progress from the source pipeline stage to the destination pipeline stage if the Ready signal generated by the destination pipeline stage is true and the Valid signal generated by the source pipeline stage is true. If the Ready signal is false, then the destination pipeline stage is stalled and not able to receive the data. When the destination pipeline stage is stalled, the source pipeline stage needs to stall and hold the data when the Valid signal is true. If the Valid signal is false, then corresponding data held in the source pipeline stage is not valid data. This condition is referred to as a bubble. A bubble may be collapsed at the output of the source pipeline stage when the Valid signal is false and the source pipeline stage generates a true Ready signal to accept Valid data at the input of the source pipeline stage.
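The handshake described above can be illustrated with a minimal Python sketch of one clock edge for a single pipeline stage. The function name `stage_step`, its parameters, and the rule that a stage asserts Ready whenever it holds a bubble or its output is being accepted are illustrative assumptions about a conventional Ready/Valid stage, not details taken from any particular design.

```python
def stage_step(held_valid, held_data, in_valid, in_data, out_ready):
    """Model one clock edge of a single pipeline stage (hypothetical sketch).

    held_valid / held_data: what the stage currently drives at its output.
    in_valid / in_data:     Valid signal and data offered by the upstream stage.
    out_ready:              Ready signal generated by the downstream stage.

    Returns (new_valid, new_data, in_ready), where in_ready is the Ready
    signal this stage generates toward the upstream stage.
    """
    # Assumed rule: the stage can accept input if it holds a bubble
    # (nothing worth keeping) or if its held data is taken this cycle.
    in_ready = (not held_valid) or out_ready

    if in_ready:
        # Data progresses only when Ready and Valid are both true;
        # otherwise the stage ends the cycle holding a bubble.
        if in_valid:
            return True, in_data, in_ready
        return False, None, in_ready

    # Stalled: downstream is not ready and the held data is valid,
    # so the stage holds its data unchanged.
    return held_valid, held_data, in_ready
```

For example, under these assumptions `stage_step(False, None, True, "b", False)` returns `(True, "b", True)`: the stage holds a bubble, so it accepts valid input even though the downstream stage is stalled, which collapses the bubble.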
Data conventionally progresses through all pipeline stages of a given processing pipeline based on a common clock signal in accordance with synchronous design principles. For example, data may progress through all stages of a processing pipeline in lock-step on every positive edge of the clock signal when each pipeline stage generates a true Ready signal. An on-chip power distribution network supplies power to each circuit element within the pipeline. The power distribution network may be characterized as having both distributed inductance and distributed capacitance interposed between an external power source and each circuit element.
In certain scenarios, the processing pipeline is stalled at the output by an interface unit, which may be waiting to access a particular resource. When multiple pipeline stages in the processing pipeline each generate true Valid signals and the interface unit is stalled, a false Ready signal propagates back through each pipeline stage as each pipeline stage generates a false Ready signal until a bubble is reached at the output of a pipeline stage. Therefore, multiple pipeline stages become idle in the same clock cycle and the processing pipeline circuitry consumes less power compared with the previous clock cycle when the processing pipeline was active.
When the interface unit is ready to accept data from the processing pipeline, each of the pipeline stages becomes active and computes new results. A true Ready signal propagates back through each pipeline stage as each pipeline stage generates a true Ready signal. Therefore, multiple pipeline stages become active in the same clock cycle and the processing pipeline circuitry consumes more power compared with the previous clock cycle when the processing pipeline was idle. The sudden change from an idle processing pipeline to an active processing pipeline can cause a relatively sharp spike in current demanded from the power distribution network. Because each pipeline stage operates synchronously to the clock signal, the spike in current is highly correlated over all pipeline stages, which may lead to a transient voltage droop in the power distribution network. The voltage droop may degrade the reliable operating frequency for circuitry within the processing pipeline, leading to reduced system performance.
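The correlated stall and restart behavior can be illustrated with a small Python simulation. This is a sketch under stated assumptions: the function `simulate` and its parameters are hypothetical, the Ready signal is modeled as propagating back through all stages within a single clock cycle, the upstream source always offers valid data, and the number of stages that load new data on a clock edge is used only as a rough proxy for switching activity, not as a power model.

```python
def simulate(sink_ready_per_cycle, initial_valid):
    """Count the stages that load new data on each clock edge.

    sink_ready_per_cycle: Ready values generated by the interface unit
                          at the pipeline output, one per clock cycle.
    initial_valid:        initial Valid state of each stage (False = bubble),
                          ordered from first stage to last stage.
    """
    valid = list(initial_valid)
    num_stages = len(valid)
    activity = []

    for sink_ready in sink_ready_per_cycle:
        # Ready propagates backward from the interface unit. A stage is
        # ready if it holds a bubble or if the stage after it is ready.
        ready = [False] * num_stages
        downstream_ready = sink_ready
        for i in reversed(range(num_stages)):
            ready[i] = (not valid[i]) or downstream_ready
            downstream_ready = ready[i]

        # On the clock edge, every ready stage captures its upstream state.
        new_valid = list(valid)
        loads = 0
        for i in range(num_stages):
            # Assumption: the source feeding stage 0 always has valid data.
            upstream_valid = True if i == 0 else valid[i - 1]
            if ready[i]:
                new_valid[i] = upstream_valid
                if upstream_valid:
                    loads += 1
        valid = new_valid
        activity.append(loads)

    return activity


# A full four-stage pipeline that is stalled for two cycles, then released.
print(simulate([True, True, False, False, True, True], [True] * 4))
```

With a full four-stage pipeline, this sketch prints `[4, 4, 0, 0, 4, 4]`: all four stages go idle in the same cycle when the interface unit stalls, and all four resume in the same cycle when it becomes ready, which is the abrupt step in activity that the paragraph above associates with a current spike. Starting with a bubble in the second stage, `simulate([False, False], [True, False, True, True])` returns `[2, 0]`, because the false Ready signal stops propagating at the bubble for one cycle before the whole pipeline stalls.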
Thus, there is a need for addressing this issue and/or other issues associated with the prior art.