Microprocessor designers have sought to improve performance by increasing clock speeds and adding parallelism. Complex data manipulations require the execution of many instructions, often over several iterations, and branch instructions are commonly used to control those iterations. Typically, a branch instruction requires one or more clock cycles, or "delay slots," to resolve the branch address and to fetch the target instruction at that address. A delayed branch instruction allows another instruction to be executed during the delay slot(s) of the branch instruction. Microprocessors having pipelined instruction execution circuitry may provide a delayed branch instruction in order to reduce the number of execution cycles that would otherwise be lost when a branch is taken or not taken within the instruction execution sequence. If a second branch instruction is encountered before the target instruction of the first branch instruction is executed, however, the instruction execution pipeline is stalled in order to preserve the order of execution of instructions.
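By way of illustration only, the delayed branch behavior described above may be sketched with a toy model. The instruction encoding, the function names `execution_order` and `branches_in_delay_slots`, and the default of a single delay slot are all hypothetical and are not taken from any particular microprocessor. The sketch shows the order in which instructions execute around one delayed branch, and how the stall-causing case (a second branch located in the delay slot(s) of a first branch) might be detected.

```python
# Toy model (hypothetical encoding): a program is a list of (kind, operand).
#   ("op", name)    an ordinary instruction identified by name
#   ("br", index)   an unconditional delayed branch to program[index]

def execution_order(program, delay_slots=1, max_steps=1000):
    """Return the names of ordinary instructions in the order they execute.

    Assumes no branch sits inside another branch's delay slots; that case
    is only detected (see branches_in_delay_slots), not simulated.
    """
    order = []
    pc = 0
    steps = 0
    while pc < len(program):
        steps += 1
        if steps > max_steps:
            raise RuntimeError("step limit reached (possible backward-branch loop)")
        kind, operand = program[pc]
        if kind == "br":
            # The instructions in the delay slot(s) execute while the
            # branch address is resolved and the target is fetched.
            for slot_kind, slot_operand in program[pc + 1 : pc + 1 + delay_slots]:
                if slot_kind == "br":
                    raise ValueError("branch in a delay slot is not simulated")
                order.append(slot_operand)
            pc = operand  # control then transfers to the branch target
        else:
            order.append(operand)
            pc += 1
    return order


def branches_in_delay_slots(program, delay_slots=1):
    """Return sorted indices of branches located in an earlier branch's delay slot(s)."""
    hits = set()
    for i, (kind, _) in enumerate(program):
        if kind != "br":
            continue
        for j in range(i + 1, min(i + 1 + delay_slots, len(program))):
            if program[j][0] == "br":
                hits.add(j)
    return sorted(hits)


program = [
    ("op", "a"),        # 0
    ("br", 4),          # 1: delayed branch to index 4
    ("op", "slot"),     # 2: delay slot, still executes
    ("op", "skipped"),  # 3: never executes
    ("op", "target"),   # 4: branch target
]
print(execution_order(program))          # ['a', 'slot', 'target']

nested = [("br", 3), ("br", 4), ("op", "x"), ("op", "t1"), ("op", "t2")]
print(branches_in_delay_slots(nested))   # [1]
```

In this sketch the instruction in the delay slot executes before the branch target, so no cycle is wasted on the branch. The second program contains a branch at index 1 within the delay slot of the branch at index 0, which is the situation in which the pipeline described above would stall.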
An object of the present invention is to overcome the performance delay caused by stalling an instruction execution pipeline when a second branch instruction occurs in the delay slot(s) of a first branch instruction.