When executing a software program, a processor typically fetches instructions from memory and then executes them. The processor generally begins at a starting instruction and executes instructions sequentially, from lower linear memory addresses to higher linear memory addresses, until it encounters an instruction indicating that the next instruction to be executed is not the next sequential instruction. Such instructions are referred to herein as change-of-flow (COF) instructions. Examples of COF instructions may include, but are not limited to, jumps, conditional branches, calls, returns, and interrupt instructions. When a COF instruction indicates that the next instruction to be executed is not the next sequential instruction, the COF instruction typically indicates, either explicitly or implicitly, the address of the next instruction to be executed. The address of the non-sequential instruction to be executed after a COF instruction is called the COF instruction's “target”. In certain instances, a COF instruction's target may be the next sequential instruction.
Conditional COF instructions, such as conditional branches, may be predicted as either “taken” or “not taken”. If a COF instruction is predicted as “not taken” (i.e., presumed to not branch), then the instruction executed after the COF instruction is the instruction at the next sequential address. Conversely, if a COF instruction is predicted as “taken” (i.e., presumed to branch), then the instruction executed after the COF instruction is the “target” of the COF instruction. Unconditional COF instructions are always taken.
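The next-address rule described above may be illustrated by the following minimal sketch. The names Instr and next_fetch_address, the instruction lengths, and the addresses are hypothetical and chosen only for illustration; they do not correspond to any particular processor architecture. The sketch also assumes that the target address is already known when the prediction is made, which is a simplification for COF instructions whose target is indicated implicitly (e.g., returns).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    """Hypothetical decoded instruction, used only for this illustration."""
    address: int                  # linear address of the first byte
    length: int                   # length in bytes (variable-length encoding assumed)
    target: Optional[int] = None  # COF target address; None if not a COF instruction
    conditional: bool = False     # True for conditional COF instructions


def next_fetch_address(instr: Instr, predicted_taken: bool = False) -> int:
    """Return the address of the instruction expected to execute next."""
    sequential = instr.address + instr.length
    if instr.target is None:
        return sequential          # not a COF instruction: fall through
    if not instr.conditional:
        return instr.target        # unconditional COF instruction: always taken
    # Conditional COF instruction: follow the prediction.
    return instr.target if predicted_taken else sequential
```

For example, under these assumptions a two-byte conditional branch at address 0x108 with target 0x120 yields 0x120 when predicted taken and 0x10A when predicted not taken.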
In the absence of COF instructions, the processor typically requests consecutive addresses from an instruction cache and sends the resulting instruction data from the instruction cache directly to an instruction pipeline. However, if one or more COF instructions are present, the processor typically attempts to predict the instruction to be executed following each COF instruction and then provides the instruction pipeline with the instruction data resulting from that prediction. Various mechanisms may be implemented to detect the existence of COF instructions that are predicted to be taken. The transfer of instruction data from an instruction cache to the processing pipeline may be delayed for any of a variety of reasons, such as delays caused by linear-to-physical address translation, memory fetches, and the like. Accordingly, many processor architectures utilize an instruction data buffer to buffer data received from the instruction cache prior to providing it to the processing pipeline for decoding and execution. However, in many processor architectures, COF instructions may have variable lengths, which complicates aligning a COF instruction with its “target” instruction in the instruction buffer. Data segments received from an instruction cache generally cannot be placed directly into the instruction buffer when a predicted-taken COF instruction is present within the data segment. Accordingly, conventional techniques have been developed to attempt to align data segments in the presence of COF instructions. To implement these conventional techniques, the starting address of the COF instruction is tracked and at least two instruction buffers are typically used with each buffer entry. When a predicted-taken COF instruction is detected, the instruction stream starting with the “target” of the COF instruction is stored in an instruction buffer separate from the buffer containing the COF instruction.
Additional information is required to know when to switch from one buffer to another. These conventional techniques typically result in a delay (i.e., a “bubble”) in the pipeline before the corresponding target instruction is fetched, thereby diminishing the performance of processors implementing these conventional techniques. Accordingly, a technique for improved COF instruction detection and alignment within an instruction buffer would be advantageous.
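The conventional two-buffer arrangement described above may be sketched roughly as follows. This is an illustrative model only, not a description of any actual implementation: the eight-byte segment width, the names fetch_segment and two_buffer_fill, and the use of a byte offset as the switching information are all assumptions made for the example, and the sketch does not model timing, so the pipeline bubble itself is not represented.

```python
SEGMENT_BYTES = 8  # assumed fetch width; actual designs vary


def fetch_segment(memory: bytes, address: int) -> bytes:
    """Return the aligned data segment containing `address` (hypothetical cache read)."""
    base = address - (address % SEGMENT_BYTES)
    return memory[base:base + SEGMENT_BYTES]


def two_buffer_fill(memory: bytes, start: int, cof_start: int,
                    cof_length: int, target: int):
    """Model the two-buffer scheme for a single predicted-taken COF instruction.

    start      : address at which sequential fetching begins
    cof_start  : tracked starting address of the COF instruction
    cof_length : length of the COF instruction in bytes (assumed known)
    target     : predicted target address of the COF instruction
    Returns (buffer_a, buffer_b, switch_offset).
    """
    cof_end = cof_start + cof_length  # first byte past the COF instruction

    # Buffer A: sequential stream up to and including the segment holding the COF.
    buffer_a = bytearray()
    addr = start
    while addr < cof_end:
        base = addr - (addr % SEGMENT_BYTES)
        buffer_a += fetch_segment(memory, addr)[addr - base:]
        addr = base + SEGMENT_BYTES

    # Bytes in buffer A past this offset follow the COF and are not on the
    # predicted path; this offset is the extra switching information.
    switch_offset = cof_end - start

    # Buffer B: separate buffer holding the stream that starts at the target.
    buffer_b = fetch_segment(memory, target)[target % SEGMENT_BYTES:]
    return bytes(buffer_a), bytes(buffer_b), switch_offset
```

In this model, the pipeline would consume buffer_a up to switch_offset and then continue from buffer_b. Because the COF instruction rarely ends on a segment boundary, buffer_a generally holds trailing bytes that must be skipped, which is why the segment cannot simply be appended to a single buffer.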