Conditional execution of instructions is a conventional feature of processing systems. An example is a conditional instruction, such as a conditional branch instruction, where the direction taken by the conditional branch instruction depends on how a condition is resolved. For example, a conditional branch instruction may be represented as “if <condition1> jump1,” wherein, if condition1 evaluates to true, the operational flow of instruction execution jumps to a target address specified by the jump1 label (this scenario may also be referred to as the branch instruction (jump1) being “taken”). On the other hand, if condition1 evaluates to false, the operational flow continues with the next sequential instruction after the conditional branch instruction, without jumping to the target address (this scenario is also referred to as the branch instruction not being taken, or being “not-taken”). Under certain instruction set architectures (ISAs), instructions other than branch instructions may also be conditional, where the behavior of the instruction depends on the related condition.
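The taken and not-taken outcomes described above can be illustrated with a minimal sketch. The function name `next_pc` and its parameters are hypothetical and chosen only for illustration; they do not correspond to any particular ISA.

```python
def next_pc(pc, instr_size, condition, target):
    """Return the address of the instruction that follows 'if <condition> jump'.

    pc          -- address of the conditional branch instruction (illustrative)
    instr_size  -- size of the branch instruction in bytes (illustrative)
    condition   -- how the condition resolved (True or False)
    target      -- address specified by the jump label
    """
    if condition:
        return target           # branch is "taken"
    return pc + instr_size      # branch is "not-taken": fall through


# Same branch, two possible resolutions of the condition.
taken_address = next_pc(0x100, 4, True, 0x200)
not_taken_address = next_pc(0x100, 4, False, 0x200)
```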
In general, how the condition of a conditional instruction will be resolved is unknown until the conditional instruction is executed. Waiting until the conditional instruction is executed to determine the condition can impose undesirable delays in modern processors, which are configured for parallel and out-of-order execution. The delays are particularly disruptive in the case of conditional branch instructions, because the direction in which the branch instruction is resolved determines the operational flow of the instructions which follow the branch instruction.
In order to improve instruction level parallelism (ILP) and minimize delays, modern processors may include mechanisms to predict the resolution of the condition of conditional instructions prior to their execution. For example, branch prediction mechanisms are implemented to predict whether a conditional branch instruction will be taken or not-taken before the conditional branch instruction is executed. If the prediction turns out to be erroneous, the instructions which were incorrectly executed based on the incorrect prediction are flushed. This results in a penalty known as the branch misprediction penalty. If the prediction turns out to be correct, no branch misprediction penalty is incurred.
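As an illustration of how a prediction is made before execution and checked afterwards, the following is a minimal sketch of a two-bit saturating counter, one widely known dynamic prediction scheme. The text above does not name any specific scheme, so this choice, the class name `TwoBitPredictor`, and the initial counter value are all assumptions made for the example.

```python
class TwoBitPredictor:
    """Two-bit saturating counter (an assumed scheme, not taken from the text)."""

    def __init__(self):
        # Counter values 0 and 1 predict not-taken; 2 and 3 predict taken.
        # Starting at 2 ("weakly taken") is an arbitrary choice.
        self.counter = 2

    def predict(self):
        return self.counter >= 2

    def update(self, taken):
        # Move toward 3 on a taken branch, toward 0 on a not-taken branch.
        if taken:
            self.counter = min(3, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)


def count_mispredictions(outcomes):
    """Count how often the prediction disagrees with the actual outcome.

    Each disagreement stands for one flush, i.e. one misprediction penalty.
    """
    predictor = TwoBitPredictor()
    mispredictions = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            mispredictions += 1
        predictor.update(taken)
    return mispredictions


# A loop branch taken nine times and then not taken once at loop exit.
loop_mispredictions = count_mispredictions([True] * 9 + [False])
```

In this sketch only the final loop exit is mispredicted, while a strictly alternating branch would be mispredicted on every not-taken outcome.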
Branch prediction mechanisms may be static or dynamic. Branch prediction itself adds latency to a pipeline; this latency is known as the branch prediction penalty. When an instruction is fetched from an instruction cache and processed in an instruction pipeline, branch prediction mechanisms must determine whether the fetched instruction is a conditional instruction and whether it is a branch instruction, and then make a prediction on the likely direction of the conditional branch instruction. It is desirable to minimize stalls or bubbles related to the process of branch prediction in an instruction execution pipeline. Therefore, branch prediction mechanisms strive to make a prediction as early in an instruction pipeline as possible. Sometimes, pre-decode bits or metadata related to branch instructions are stored in the instruction cache, which accelerates branch prediction. Such pre-decode bits may include information pertaining to the branch type (e.g., how the branch relates to a program counter (PC) value, whether it is a direct or indirect branch, whether it is a return from a subroutine, etc.). Pre-decode bits can also include information about the conditionality of branch instructions.
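The role of pre-decode bits can be sketched as a small set of flags stored alongside each instruction in the instruction cache. The flag names and bit positions below are hypothetical and chosen only for illustration; actual encodings are implementation-specific.

```python
from enum import IntFlag


class PreDecode(IntFlag):
    """Hypothetical pre-decode flags kept with each cached instruction."""
    IS_BRANCH = 1      # instruction is a branch
    PC_RELATIVE = 2    # branch relates to the program counter value
    INDIRECT = 4       # indirect branch (otherwise direct)
    IS_RETURN = 8      # return from a subroutine
    CONDITIONAL = 16   # branch carries its own condition


def needs_direction_prediction(bits):
    """Decide from the pre-decode bits alone, without a full decode,
    whether a taken/not-taken prediction is required."""
    return bool(bits & PreDecode.IS_BRANCH) and bool(bits & PreDecode.CONDITIONAL)


conditional_branch = PreDecode.IS_BRANCH | PreDecode.PC_RELATIVE | PreDecode.CONDITIONAL
subroutine_return = PreDecode.IS_BRANCH | PreDecode.INDIRECT | PreDecode.IS_RETURN
```

Because the flags are available as soon as the cache line is read, the check can happen early in the pipeline, which is the acceleration the paragraph above refers to.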
While the above prediction mechanisms exist for conditional instructions, such as conditional branch instructions, whose conditionality is provided within the conditional instruction itself, there is another class of instructions which is harder to predict. This class includes a block of one or more dependent instructions whose behavior is controlled by a conditionality-imposing control instruction. For example, some processor ISAs include a so-called If-Then (IT) class of instructions. The IT instructions control the behavior of an IT block of one or more dependent instructions by imposing conditionality on the one or more dependent instructions. The dependent instructions in the IT block follow the IT control instruction. More specifically, the IT control instruction may have an “If” condition, and the behavior of the one or more dependent “Then” instructions is determined by how that condition is resolved. For example, an “ITTTT” block may include an “If” instruction with a condition, followed by four “Then” instructions whose behavior depends on how the conditionality-imposing “If” control instruction evaluates. In this manner, programming efficiency may be achieved for cases where a block of one or more instructions is dependent on the same condition.
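The way a single “If” condition governs a block of “Then” instructions can be sketched with a simplified model. The function `run_it_block`, the register dictionary, and the two example instructions are hypothetical; the model only captures the idea that one condition controls every dependent instruction in the block.

```python
def add_one(regs):
    """Example dependent instruction: r0 = r0 + 1."""
    regs["r0"] += 1


def double(regs):
    """Example dependent instruction: r0 = r0 * 2."""
    regs["r0"] *= 2


def run_it_block(condition_holds, dependents, regs):
    """Model of an IT block: each dependent 'Then' instruction takes
    effect only if the 'If' condition holds. Returns how many
    dependent instructions actually took effect."""
    executed = 0
    for instruction in dependents:
        if condition_holds:
            instruction(regs)
            executed += 1
    return executed


# The same two dependent instructions under both resolutions of the condition.
regs_true = {"r0": 3}
count_true = run_it_block(True, [add_one, double], regs_true)

regs_false = {"r0": 3}
count_false = run_it_block(False, [add_one, double], regs_false)
```

Note that `add_one` and `double` carry no condition of their own; their effect depends entirely on the condition supplied by the control instruction.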
These dependent instructions are difficult to predict using the above-described prediction mechanisms for conventional conditional instructions, because the behavior of the dependent instructions is controlled by the conditionality-imposing control instruction. If the same instructions that constitute the dependent instructions are not preceded by a conditionality-imposing control instruction, then their behavior is unconditional. Thus, the likely behavior of a dependent instruction cannot be stored in pre-decode bits of the dependent instruction itself. For example, prediction of the likely behavior of a dependent instruction which is an unconditional branch instruction is difficult because the branch instruction, by itself, is unconditional and should always be predicted as “taken.” However, the actual direction of the branch instruction depends on the conditionality-imposing control instruction, and thus the behavior of the branch instruction may effectively be either “taken” or “not-taken.”
Moreover, it is sometimes not possible to know in advance whether a particular instruction is a dependent instruction of a conditionality-imposing control instruction. This is because the code block containing the conditionality-imposing control instruction and the corresponding dependent instructions may straddle cache line boundaries in instruction memories. In addition, a conditionality-imposing control instruction may come in many types and affect a varying number of dependent instructions based on the block size (i.e., the number of dependent instructions in the code block) of the conditionality-imposing control instruction. ISAs which support the ARM architecture, for example, include a class of instructions known as THUMB instructions. THUMB instructions may be 32 bits or 16 bits long. Since THUMB instructions come in multiple instruction lengths, it is not possible to know, when processing the conditionality-imposing control instruction (e.g., the IT instruction), whether the corresponding dependent instructions will be contained within the same cache line, since the number of bytes in the code block of the IT instruction depends on the length of each dependent instruction.
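The dependence of the block's byte length on the individual instruction lengths can be shown with a short calculation. The function name, the 64-byte cache line size, the 2-byte control instruction, and the example addresses are assumptions made for illustration only.

```python
def block_straddles_line(control_addr, control_len, dependent_lens, line_size=64):
    """Return True if a control instruction and its dependent instructions
    do not all fit within the cache line that holds the control instruction.

    control_addr   -- byte address of the control instruction (illustrative)
    control_len    -- length of the control instruction in bytes
    dependent_lens -- lengths in bytes of the dependent instructions
    line_size      -- assumed cache line size in bytes
    """
    end = control_addr + control_len + sum(dependent_lens)  # first byte past the block
    first_line = control_addr // line_size
    last_line = (end - 1) // line_size
    return last_line != first_line


# Same control instruction address and same block size (four dependents);
# only the lengths of the dependent instructions differ.
all_16_bit = block_straddles_line(0x34, 2, [2, 2, 2, 2])
all_32_bit = block_straddles_line(0x34, 2, [4, 4, 4, 4])
```

With four 16-bit dependents the block ends at byte 61 and stays within the first 64-byte line, whereas with four 32-bit dependents it ends at byte 69 and crosses into the next line. The outcome therefore cannot be determined from the control instruction alone.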
Conventional methods of handling execution of such dependent instructions tend to be inefficient, complex, and time-consuming. Some conventional methods reduce the execution frequency of the processor in order to allow sufficient time to ascertain the conditionality of the dependent instructions from the conditionality-imposing control instruction. Others introduce pipeline stalls in order to resolve the condition before executing the dependent instructions, or move the resolution of the conditionality to a later pipeline stage, thus increasing the branch prediction penalty. Thus, there is a need in the art to avoid the aforementioned drawbacks of conventional methods.