A processor, also referred to as a central processing unit (CPU), is the hardware within a computer that carries out the instructions of a computer program by performing the basic arithmetic, logical, and input/output operations of the system. Conventional processors can have a variety of architectural features that can include, but are not limited to, wide architectures and pipelined architectures.
Processors that have wide architectures are capable of fetching and decoding multiple cache lines of instructions in parallel. To optimally support such wide architectures, the processor front end must be capable of supplying multiple cache lines of instructions to the processor scheduler and execution units during each clock cycle.
In addition, processors can encounter a variety of branch instruction types that, because of complex program control flows, make it challenging to supply multiple cache lines of instructions to the processor's scheduler and execution units during each cycle. Such instructions can include what are termed “far branch” instructions and “near branch” instructions (e.g., loop instructions). Far branch instructions are instructions that can alter the flow of instruction execution in a program such that instruction execution jumps outside of a cache line. Loop instructions are instructions that include a sequence of statements that are specified only once but that are carried out several times in succession before the loop is exited (and can involve jumps within a cache line).
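The distinction drawn above between far and near branches can be illustrated with a minimal sketch. It assumes a 64-byte cache line and byte addresses, neither of which is specified in the text, and the names `CACHE_LINE_BYTES`, `cache_line_index`, and `classify_branch` are hypothetical. It also simplifies by treating every branch whose target stays in the same cache line as near, whereas the text only says that near branches can involve such jumps.

```python
CACHE_LINE_BYTES = 64  # assumed line size; the text does not specify one


def cache_line_index(address: int, line_bytes: int = CACHE_LINE_BYTES) -> int:
    """Return the index of the cache line that contains the given byte address."""
    return address // line_bytes


def classify_branch(branch_addr: int, target_addr: int,
                    line_bytes: int = CACHE_LINE_BYTES) -> str:
    """Label a taken branch 'near' if its target lies in the same cache line
    as the branch itself, and 'far' if the target lies in another line."""
    same_line = (cache_line_index(branch_addr, line_bytes)
                 == cache_line_index(target_addr, line_bytes))
    return "near" if same_line else "far"


if __name__ == "__main__":
    # A loop back-edge whose target is in the same 64-byte line as the branch.
    print(classify_branch(0x1038, 0x1004))
    # A jump whose target is in a different line.
    print(classify_branch(0x1038, 0x2400))
```

Under these assumptions, a front end handling a near branch can keep working from a line it has already fetched, while a far branch requires a different line to be located and supplied.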
In pipelined architectures, the execution of multiple sequential instructions is overlapped so that several instructions are in progress simultaneously. However, the pipeline can only be fully utilized if the processor is able to read a next instruction from memory on every cycle. Importantly, the processor must know which instruction is to be read next in order to read that instruction. When a far branch instruction is encountered, the processor may not know ahead of time the path that will be taken, and thus which instruction is to be read next. In such instances, the processor has to stall until the branch is resolved. Such stalls can degrade pipeline utilization and negatively impact processor performance, especially in high-performance processors, where a high-throughput supply of instructions from the front end of the device is important.
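The cost of such stalls can be estimated with a simple back-of-the-envelope model. The sketch below is illustrative only: it assumes an idealized in-order pipeline that otherwise completes one instruction per cycle, a fixed stall penalty per unresolved branch, and hypothetical names (`pipeline_cycles`, `utilization`) and example numbers that do not come from the text.

```python
def pipeline_cycles(instructions: int, depth: int,
                    unresolved_branches: int, stall_cycles: int) -> int:
    """Estimate total cycles for an idealized pipeline.

    The first instruction needs `depth` cycles to pass through all stages;
    each later instruction completes one cycle after its predecessor. Every
    unresolved branch adds `stall_cycles` cycles in which no new instruction
    enters the pipeline (an assumed, fixed penalty).
    """
    if instructions == 0:
        return 0
    return instructions + (depth - 1) + unresolved_branches * stall_cycles


def utilization(instructions: int, depth: int,
                unresolved_branches: int, stall_cycles: int) -> float:
    """Fraction of cycles in which an instruction completes."""
    cycles = pipeline_cycles(instructions, depth, unresolved_branches, stall_cycles)
    return instructions / cycles if cycles else 0.0


if __name__ == "__main__":
    # Hypothetical workload: 1000 instructions on a 5-stage pipeline.
    print(pipeline_cycles(1000, 5, 0, 3))    # no stalls: 1004 cycles
    print(pipeline_cycles(1000, 5, 100, 3))  # 100 stalling branches: 1304 cycles
    print(round(utilization(1000, 5, 100, 3), 3))
```

In this model, stalling on one instruction in ten with a three-cycle penalty raises the cycle count from 1004 to 1304, which is the kind of utilization loss the paragraph above describes.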