As computer speeds increased from 33 MHz to 1.2 GHz and beyond, computer operations could no longer be completed in a single cycle. As a result, the technique of pipelining was adopted to make the most efficient use of the higher processor performance and to improve throughput. Presently, deep pipelines use as many as 25 stages or more. Generally, a pipelined computing system has several building blocks working in parallel, each handling a different part of the whole process: for example, a compute unit that performs the computation; an address unit, including a data address generator (DAG), that fetches and stores data in memory according to the selected address modes; and a sequencer or control circuit that decodes and distributes the instructions. The DAG is the only component that can address the memory. Thus, in a deeply pipelined system, if one instruction depends on the result of another, a pipeline stall occurs: the pipeline stops and waits for the earlier instruction to finish before resuming work. For example, if the output of the compute unit is needed by the DAG for the next data fetch, it cannot be delivered directly to the DAG to be conditioned for that fetch; it must propagate through the pipeline before the DAG can process it and perform the next data fetch and computation. This is because only the DAG has access to the memory and can convert the compute result into an address pointer to locate the desired data. In multi-tasking general purpose computers such a stall may not be critical, but in real-time computer systems, such as those used in cell phones and digital cameras, these stalls are a problem.
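The cost of such a stall can be illustrated with a minimal sketch. The model below is an assumption made purely for illustration and does not describe any particular processor: instructions issue in order, one per cycle; a result becomes available only after it has propagated through all stages; and an instruction that depends on the previous result cannot issue until that result is available. The function name `total_cycles` and its parameters are hypothetical.

```python
def total_cycles(depends_on_previous, depth):
    """Count cycles needed to finish a sequence of instructions on an
    idealized in-order pipeline with `depth` stages.

    depends_on_previous[i] is True when instruction i needs the result of
    instruction i-1 (for example, a DAG fetch that uses a computed address).
    Assumption: a result is usable only after passing through all stages,
    so there is no direct path from the compute unit to the DAG.
    """
    prev_issue = None
    for dependent in depends_on_previous:
        if prev_issue is None:
            issue = 0                    # first instruction issues at once
        elif dependent:
            issue = prev_issue + depth   # stall until the producer finishes
        else:
            issue = prev_issue + 1       # normal back-to-back issue
        prev_issue = issue
    # The last instruction still needs `depth` cycles to complete.
    return 0 if prev_issue is None else prev_issue + depth


if __name__ == "__main__":
    depth = 25  # stage count taken from the "25 stages" figure in the text
    independent = [False, False, False, False]
    one_dependency = [False, True, False, False]
    print(total_cycles(independent, depth))     # 28
    print(total_cycles(one_dependency, depth))  # 52
```

Under these assumptions, a single dependency in a 25-stage pipeline adds 24 idle cycles to a four-instruction sequence, which is why such stalls matter in real-time systems.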