In typical hardware implementations of a computer instruction set, conditional branch instructions consume a relatively high number of processing cycles, which makes them expensive to use.
Branch processing overhead is especially evident when a sequence of tests (herein termed a multi-test) is performed. Additional overhead occurs because each test is logically associated with a separate block of code to be executed if the result of the test is TRUE. When the result is TRUE, a conditional branch is issued to a label preceding the block of code. At the end of the block of code an additional branch is issued to a label associated with the end of the code for the multi-test.
This can be seen in the following example.
Given the following pseudo-code:
If (a>b) & (a>c) & (a>d) then do block 1
If (a>b) & (a>c) & (a<=d) then do block 2
If (a>b) & (a<=c) & (a>d) then do block 3
If (a>b) & (a<=c) & (a<=d) then do block 4
If (a<=b) & (a>c) & (a>d) then do block 5
If (a<=b) & (a>c) & (a<=d) then do block 6
If (a<=b) & (a<=c) & (a>d) then do block 7
If (a<=b) & (a<=c) & (a<=d) then do block 8
Table I is a listing of an assembly implementation of this pseudo-code, typical of the prior art:
TABLE I
        Test A,B
        Jmp if greater to label1
        Test A,C
        Jmp if greater to label2
        Test A,D
        Jmp if greater to label3
        ; place code for case of !(A>B) and !(A>C) and !(A>D) here
        :
        :
        jmp end
label1:
        Test A,C
        Jmp if greater to label11
        Test A,D
        Jmp if greater to label12
        ; place code for case of (A>B) and !(A>C) and !(A>D) here
        :
        :
        jmp end
label11:
        Test A,D
        Jmp if greater to label111
        ; place code for case of (A>B) and (A>C) and !(A>D) here
        :
        :
        jmp end
label111:
        ; place code for case of (A>B) and (A>C) and (A>D) here
        :
        :
        jmp end
label12:
        ; place code for case of (A>B) and !(A>C) and (A>D) here
        :
        :
        jmp end
label2:
        Test A,D
        Jmp if greater to label21
        ; place code for case of !(A>B) and (A>C) and !(A>D) here
        :
        :
        jmp end
label21:
        ; place code for case of !(A>B) and (A>C) and (A>D) here
        :
        :
        jmp end
label3:
        ; place code for case of !(A>B) and !(A>C) and (A>D) here
        :
        :
end:
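For readers who find the assembly hard to follow, the control flow of Table I can be sketched in a higher-level language. The Python below is only an illustration and is not part of the listing: the function name multi_test and the convention of returning the block number from the pseudo-code are assumptions made here. Each nested "if" stands in for one "Test" followed by a "Jmp if greater".

```python
def multi_test(a, b, c, d):
    """Return the block number (1-8) chosen by the pseudo-code.

    The nesting follows the test order of Table I: A is compared with B,
    then with C, then with D, and every comparison is a conditional branch.
    """
    if a > b:                 # first test: A against B
        if a > c:             # second test: A against C
            if a > d:         # third test: A against D
                return 1      # (a>b) & (a>c) & (a>d)
            return 2          # (a>b) & (a>c) & (a<=d)
        if a > d:
            return 3          # (a>b) & (a<=c) & (a>d)
        return 4              # (a>b) & (a<=c) & (a<=d)
    if a > c:
        if a > d:
            return 5          # (a<=b) & (a>c) & (a>d)
        return 6              # (a<=b) & (a>c) & (a<=d)
    if a > d:
        return 7              # (a<=b) & (a<=c) & (a>d)
    return 8                  # (a<=b) & (a<=c) & (a<=d)


# Example call with hypothetical values: a exceeds b and c but not d.
print(multi_test(5, 1, 2, 9))
```

The sketch contains seven "if" statements, one for each conditional branch in the listing, and any single call evaluates exactly three of them.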
This sample code listing requires seven conditional branch instructions. A conditional branch instruction that is not taken typically uses one cycle, while a taken branch instruction typically uses three cycles, making branch instructions relatively expensive in terms of processor cycles. A multi-test typically leads to a high number of branch instructions, so a system that reduces the number of branches would yield significant processing savings.
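The cycle figures above can be turned into a rough per-path cost for the Table I listing. The Python sketch below is illustrative only. It uses the one-cycle and three-cycle figures quoted in the text, ignores the cost of the "Test" instructions themselves, and assumes (this is not stated in the text) that the unconditional "jmp end" costs the same as a taken branch. The function and constant names are hypothetical.

```python
from itertools import product

# Figures quoted in the text for a conditional branch.
NOT_TAKEN_CYCLES = 1
TAKEN_CYCLES = 3


def conditional_branch_cycles(a_gt_b, a_gt_c, a_gt_d):
    """Cycles spent on conditional branches along one path through Table I.

    Every path performs three tests (against B, then C, then D), and each
    "Jmp if greater" is taken exactly when its comparison is true.
    """
    outcomes = (a_gt_b, a_gt_c, a_gt_d)
    return sum(TAKEN_CYCLES if taken else NOT_TAKEN_CYCLES for taken in outcomes)


def path_cycles(a_gt_b, a_gt_c, a_gt_d, jump_cycles=TAKEN_CYCLES):
    """Conditional-branch cycles plus the closing "jmp end".

    jump_cycles is an assumed cost. In Table I only the final block,
    reached when A>D is the sole true comparison, falls through to "end"
    without an unconditional jump.
    """
    falls_through = (not a_gt_b) and (not a_gt_c) and a_gt_d
    extra = 0 if falls_through else jump_cycles
    return conditional_branch_cycles(a_gt_b, a_gt_c, a_gt_d) + extra


# Tabulate all eight outcomes of (A>B, A>C, A>D).
for outcome in product((True, False), repeat=3):
    print(outcome, conditional_branch_cycles(*outcome), path_cycles(*outcome))
```

Under these assumptions the conditional branches alone cost between 3 and 9 cycles per pass through the multi-test, or 6 cycles on average if all eight outcomes are taken to be equally likely.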
Furthermore, the problem of large numbers of branches is particularly acute in processors that implement pipelining, wherein multiple instructions are processed in parallel. While pipelining offers significant performance advantages for most of the instruction set, performing branch instructions in pipelines can degrade performance.