In long pipelined Digital Signal Processing (DSP) processors, load-store and pointer-arithmetic operations are executed by address modules, and in particular by pipeline stages of the address module that precede the execution stage of a data module that includes an arithmetic logic unit (ALU).
If a load-store operation depends on an ALU-product (for example, on a condition that is represented by a predicator-bit, such as the result of a compare operation), many stall cycles might be inserted in order to delay the execution of the load-store and pointer-arithmetic operations until after the ALU-product is provided. In many cases, multiple load-store and pointer-arithmetic instructions are conditioned on the result of an ALU operation.
For example, the SC3400 DSP processor of Freescale of Austin, Tex., USA inserts five stall cycles between an ALU-compare instruction and a conditional memory access. In the example below, five stall cycles are inserted between instructions I1 and I2:
I1 cmp d1,d2 {compare the values of data registers d1 and d2}
I2 iff adda r2,r3 {if d1 differs from d2 then sum the values of address registers r2 and r3 and store the result in r3}
I3 ift move r3,($1000) {if d1 equals d2 then move the value of address register r3 to the memory at address $1000}
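The timing effect of the listing above can be illustrated with a minimal sketch. It is a simplified model and not a description of the actual SC3400 pipeline: it assumes in-order issue of one instruction per cycle, and that a conditional address-module instruction may issue no earlier than five stall cycles after the compare that produces its predicate. The names `Instr`, `schedule`, `sets_predicate` and `uses_predicate` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass

STALL_CYCLES = 5  # stall count quoted in the text for the SC3400


@dataclass
class Instr:
    label: str
    sets_predicate: bool = False  # ALU compare that produces the predicate bit
    uses_predicate: bool = False  # conditional address-module operation


def schedule(program, stall_cycles=STALL_CYCLES):
    """Return {label: issue_cycle} for an in-order, one-issue-per-cycle model."""
    issue = {}
    cycle = 0
    last_producer = None  # issue cycle of the most recent compare
    for ins in program:
        if ins.uses_predicate and last_producer is not None:
            # Assumption: the consumer waits until stall_cycles empty
            # cycles have elapsed after the producing compare.
            cycle = max(cycle, last_producer + 1 + stall_cycles)
        issue[ins.label] = cycle
        if ins.sets_predicate:
            last_producer = cycle
        cycle += 1
    return issue


program = [
    Instr("I1", sets_predicate=True),   # cmp d1,d2
    Instr("I2", uses_predicate=True),   # iff adda r2,r3
    Instr("I3", uses_predicate=True),   # ift move r3,($1000)
]
timeline = schedule(program)
total_cycles = max(timeline.values()) + 1
stalls = total_cycles - len(program)
print(timeline, "stall cycles:", stalls)
```

Under these assumptions the stall penalty is paid once, between I1 and I2; I3 issues directly after I2 because the predicate is already available by then.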