Pipelining is a computer architecture implementation technique in which the execution of adjacent machine instructions is overlapped in time. Pipelining has been and continues to be the main technique for creating fast central processing units (CPUs), since it requires no change to compilers, linkers or programming techniques and yields a tremendous performance improvement.
A pipelined CPU divides the execution of machine instructions into small, discrete steps or stages, where each stage takes only one clock cycle to execute. Each stage performs a fixed task or set of tasks on the machine instructions. Each stage expects that the tasks done by the preceding stage have been correctly and completely performed. The time required to do the tasks at each stage is less than the time it would take to completely execute the machine instruction. Therefore, dividing execution into more stages generally allows a shorter cycle time and thus a higher-speed CPU.
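The benefit of overlapping stages can be illustrated with a minimal sketch. It assumes an idealized pipeline in which every stage takes exactly one cycle, no stalls occur, and an unpipelined CPU would spend one cycle per stage on each instruction in turn. The function names and the example figures are illustrative assumptions, not part of the design described here.

```python
def unpipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed when each instruction finishes all stages before the next starts."""
    if num_instructions <= 0:
        return 0
    return num_instructions * num_stages


def pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed in an ideal pipeline that never stalls.

    The first instruction needs num_stages cycles to pass through every stage;
    each later instruction completes one cycle after its predecessor.
    """
    if num_instructions <= 0:
        return 0
    return num_stages + (num_instructions - 1)


if __name__ == "__main__":
    n, k = 100, 5  # assumed example: 100 instructions, 5 stages
    slow = unpipelined_cycles(n, k)
    fast = pipelined_cycles(n, k)
    print(f"unpipelined: {slow} cycles, pipelined: {fast} cycles")
    print(f"speedup: {slow / fast:.2f}x")
```

Under these assumptions the speedup approaches the number of stages as the instruction count grows, but only while every stage stays full.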
To make pipelining work efficiently, it is necessary to keep all the stages full. However, a branch prediction problem arises whenever an instruction is encountered that alters the sequential flow of control in the program. IF statements, loop statements, and procedure statements are BRANCH or JUMP instructions that require one or more results computed by the preceding instruction(s), and they therefore cause problems for the pipeline. Consider the following code:
IF (condition A)
    Execute procedure B
ELSE
    Execute procedure C

It is noted that a pipelined CPU must execute procedure B or C based on the result of condition A. But the pipelined CPU cannot wait for the result of condition A before fetching the instructions of procedure B or C into the pipeline stages. Thus, the pipelined CPU must "predict" the result of condition A and read the instructions of either procedure B or procedure C into the pipeline; otherwise many stages in the pipeline will be idle while the CPU waits for the result of condition A. If the CPU predicts that condition A is true and reads the procedure B instructions into the pipeline, and condition A is indeed true, then the CPU "hits" the result and finishes the above IF-ELSE procedure quickly. But if the CPU predicts that condition A is true and reads the procedure B instructions into the pipeline while condition A is actually false, then the CPU "misses" the result: it must stop the execution of procedure B in the pipeline and fetch the procedure C instructions into the pipeline to be executed, which wastes a lot of time. Consequently, although pipelining is the main technique for creating fast CPUs, a pipelined CPU does not achieve its full potential performance when branches are mispredicted.
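The hit and miss cases described above can be sketched as a simple count. The sketch assumes a static predictor that always guesses the same result for condition A, and a fixed number of cycles lost each time the pipeline has to discard the wrongly fetched instructions. The name `flush_penalty`, its value of 3 cycles, and the sample outcomes are hypothetical values chosen only for illustration.

```python
from typing import Iterable, Tuple


def count_prediction_cost(
    outcomes: Iterable[bool],
    predicted: bool = True,
    flush_penalty: int = 3,
) -> Tuple[int, int, int]:
    """Return (hits, misses, wasted_cycles) for a static branch predictor.

    outcomes      -- actual results of condition A, one per executed branch
    predicted     -- the result the CPU always guesses (True means "fetch procedure B")
    flush_penalty -- assumed cycles lost whenever the guess is wrong
    """
    hits = 0
    misses = 0
    for actual in outcomes:
        if actual == predicted:
            hits += 1    # the fetched procedure was the right one
        else:
            misses += 1  # fetched instructions are discarded and the other path is refetched
    return hits, misses, misses * flush_penalty


if __name__ == "__main__":
    # Assumed sample run: condition A is true three times and false once.
    hits, misses, wasted = count_prediction_cost([True, True, False, True])
    print(f"hits={hits} misses={misses} wasted_cycles={wasted}")
```

Every miss adds its penalty on top of the ideal cycle count, which is why a pipeline with frequent mispredictions falls short of its nominal speed.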
Therefore, there is a need to improve branch execution in a pipelined CPU by using multiple pipelines.