Pipeline data processing systems are well known in the art, e.g., see U.S. Pat. Nos. 4,594,655 and 4,646,236 and the references discussed therein. Pipelining techniques are especially well known in digital signal processor (DSP) technology. Typically, a pipeline processor enhances execution speed by separating the instruction processing function into three pipeline phases. This phase division allows an instruction to be fetched (F), while a previous instruction is decoded (D), and an instruction before that is executed (E). As shown in FIG. 1, the total elapsed time to process (i.e., fetch, decode and execute) a single instruction is three machine cycles. However, the average throughput is one instruction per machine cycle because of the overlapped operations of three pipelined phases. This processing speed improvement is the motivation for using pipelined architectures, but along with this benefit come a number of limitations.
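By way of illustration only, the overlap of the three phases can be tabulated with a short sketch. The function names `pipeline_schedule` and `total_cycles` are hypothetical, and the model assumes an ideal pipeline with no stalls and one instruction entering per machine cycle.

```python
def pipeline_schedule(num_instructions, phases=("F", "D", "E")):
    """Map each machine cycle (1-based) to the (instruction, phase) pairs active in it.

    Assumes an ideal pipeline: instruction i is fetched in cycle i + 1 and
    advances one phase per cycle with no stalls.
    """
    schedule = {}
    for i in range(num_instructions):
        for p, phase in enumerate(phases):
            cycle = i + p + 1
            schedule.setdefault(cycle, []).append((i, phase))
    return schedule


def total_cycles(num_instructions, num_phases=3):
    """Cycles needed to finish all instructions in the ideal pipeline."""
    return num_instructions + num_phases - 1


if __name__ == "__main__":
    for cycle, active in sorted(pipeline_schedule(4).items()):
        print(cycle, active)
    # One instruction alone takes three cycles; a long run approaches
    # one instruction per cycle.
    print(total_cycles(1), total_cycles(100))
```

In this model a single instruction occupies three machine cycles, while 100 instructions complete in 102 cycles, which is the sense in which average throughput approaches one instruction per machine cycle.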
In particular, pipelined processors are more difficult to program due to constraints forced by the hardware. By way of example, the following restrictions are typical.
(1) In a standard serial program, the instruction immediately following a branch instruction is always executed, because it is fetched before the branch instruction has been decoded as a branch (refer to FIG. 1, wherein D is assumed to decode a branch instruction). Because of the phase division, the instruction designated F* is brought into the pipelined processor contemporaneously with the decoding of branch instruction D. In the next two machine cycles, this instruction F* is decoded (D*) and then executed (E*), unless otherwise overridden. Sometimes a useful instruction can be placed after a branch instruction, but finding such an instruction can be difficult and can damage the readability of the program. Often, this position is simply padded with a no-operation (NOP) instruction.
(2) For the same reason explained above with respect to a branch instruction, an interrupt typically forces a NOP instruction after the interrupt vector is processed.
(3) If an index register is to be altered (incremented or loaded) by an instruction, it cannot be used to generate an address for the following instruction. This is because the register is updated during phase three of the altering instruction and is therefore unavailable in the same machine cycle (i.e., at the beginning of phase two of the following instruction) to generate an address. Again, a NOP instruction may be required to ensure that the correct index is used.
(4) Conditional branch operations can have problems and limitations similar to those noted above, since conditions are typically the result of phase-three ALU operations.
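Restrictions (1) and (3) can be made concrete with a toy model. The following is a minimal sketch, not a model of any particular processor: the mnemonics `BR`, `ADD`, `SUB`, `END`, `LOAD_IDX`, `USE_IDX` and `NOP`, and both function names, are hypothetical. It assumes a single delay slot after a branch and an index-register write that becomes visible one instruction late.

```python
def run_with_delay_slot(program, max_steps=20):
    """Return the trace of executed instructions for a toy program.

    program is a list of (op, arg) pairs. A ("BR", target) takes effect only
    after the instruction that follows it has executed, because that
    instruction was already fetched while the branch was being decoded.
    """
    trace = []
    pc = 0
    pending_target = None  # branch target waiting for the delay slot to finish
    while pc < len(program) and len(trace) < max_steps:
        op, arg = program[pc]
        trace.append(op if arg is None else f"{op} {arg}")
        next_pc = pc + 1
        if pending_target is not None:  # this instruction was the delay slot
            next_pc = pending_target
            pending_target = None
        if op == "BR":
            pending_target = arg
        pc = next_pc
    return trace


def generated_addresses(program, index=0):
    """Return the index values actually used for address generation.

    A ("LOAD_IDX", value) writes the index register in phase three, so the
    instruction immediately after it still sees the old value in phase two.
    """
    addresses = []
    pending = None  # write issued by the previous instruction, not yet visible
    for op, arg in program:
        if op == "USE_IDX":
            addresses.append(index)  # read happens before the pending write lands
        if pending is not None:
            index = pending
            pending = None
        if op == "LOAD_IDX":
            pending = arg
    return addresses


if __name__ == "__main__":
    print(run_with_delay_slot([("BR", 3), ("ADD", None), ("SUB", None), ("END", None)]))
    print(generated_addresses([("LOAD_IDX", 5), ("USE_IDX", None)]))
    print(generated_addresses([("LOAD_IDX", 5), ("NOP", None), ("USE_IDX", None)]))
```

Under these assumptions the `ADD` after the branch still executes, and the `USE_IDX` that directly follows `LOAD_IDX` generates its address from the stale index; inserting a `NOP` between the two yields the updated value.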
In each example discussed, the processing problem arises from a need for a time delay of one machine cycle in order to stabilize the result of a prior instruction before that instruction result can be used by a subsequent instruction. Pursuant to the present invention, all of the above-noted constraints are advantageously eliminated by structuring the pipelined processor to alternately handle multiple instruction pathlengths substantially simultaneously in a time division manner.
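The effect of time-division handling can be suggested by a simple sketch. It assumes, purely for illustration, two equal-length instruction streams issued in strict alternation; the names `interleave` and `min_same_stream_gap` are hypothetical and are not taken from the disclosure.

```python
def interleave(streams):
    """Issue one instruction per machine cycle, rotating through the streams.

    Assumes all streams have the same length. Returns a list of
    (stream_id, instruction) pairs in issue order.
    """
    issue_order = []
    for slot in zip(*streams):
        for stream_id, instr in enumerate(slot):
            issue_order.append((stream_id, instr))
    return issue_order


def min_same_stream_gap(issue_order):
    """Smallest number of cycles between two instructions of the same stream."""
    last_seen = {}
    gaps = []
    for cycle, (stream_id, _) in enumerate(issue_order):
        if stream_id in last_seen:
            gaps.append(cycle - last_seen[stream_id])
        last_seen[stream_id] = cycle
    return min(gaps) if gaps else None


if __name__ == "__main__":
    order = interleave([["A0", "A1", "A2"], ["B0", "B1", "B2"]])
    print(order)
    print(min_same_stream_gap(order))
```

With two streams, successive instructions of the same stream are issued two machine cycles apart, so the one-cycle delay needed to stabilize a prior result is occupied by an instruction of the other stream rather than by a NOP.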