A central processing unit (CPU), also referred to as a processor, is the hardware within a computer that carries out the instructions of a computer program by performing the basic arithmetic, logical, and input/output operations of the system. Conventional processors can have a variety of architectural features, which can include, but are not limited to, wide architectures, pipelined architectures, and emulated architectures.
Processors that have wide architectures are capable of fetching and decoding multiple cache lines of instructions in parallel. In order to optimally support such wide architectures, the processor frontend must be capable of supplying multiple cache lines of instructions to the processor scheduler and execution units during each clock cycle.
Processors that feature emulation allow software applications and operating systems written for other computer processor architectures to be run on the processors. Such processors have the capacity to duplicate (or emulate) the functions of another computer system (the guest architecture) such that the emulated behavior closely resembles the behavior of that other computer system.
In emulated architectures, both native-branch and guest-branch instructions can be encountered. Native-branch instructions are branch instructions whose target is an address in native-space. Guest-branch instructions are branch instructions whose target is an address in guest-space. Accordingly, a hardware structure such as a conversion-lookaside-buffer (CLB) is required to maintain the mapping of guest addresses to native addresses.
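The guest-to-native mapping described above can be illustrated with a minimal software sketch. It is not the hardware structure itself: the class name, the fixed capacity, and the least-recently-used eviction policy are all assumptions made for illustration, since the text does not specify how a CLB is organized or how entries are replaced.

```python
from collections import OrderedDict
from typing import Optional


class ConversionLookasideBuffer:
    """Toy model of a CLB: a small cache of guest-to-native address mappings.

    Capacity and LRU replacement are illustrative assumptions only.
    """

    def __init__(self, capacity: int = 4) -> None:
        self.capacity = capacity
        self._entries: "OrderedDict[int, int]" = OrderedDict()

    def insert(self, guest_addr: int, native_addr: int) -> None:
        """Record that guest_addr has been converted to native_addr."""
        if guest_addr in self._entries:
            self._entries.move_to_end(guest_addr)
        self._entries[guest_addr] = native_addr
        if len(self._entries) > self.capacity:
            # Evict the least recently used mapping (assumed policy).
            self._entries.popitem(last=False)

    def lookup(self, guest_addr: int) -> Optional[int]:
        """Return the native address for a guest branch target, or None on a miss."""
        native_addr = self._entries.get(guest_addr)
        if native_addr is not None:
            self._entries.move_to_end(guest_addr)  # mark as recently used
        return native_addr


clb = ConversionLookasideBuffer(capacity=2)
clb.insert(0x1000, 0x8000)
clb.insert(0x2000, 0x9000)
hit = clb.lookup(0x1000)      # hit: the mapping is still resident
clb.insert(0x3000, 0xA000)    # exceeds capacity, so 0x2000 is evicted
miss = clb.lookup(0x2000)     # miss: the target would need to be converted again
```

In this sketch a miss simply returns `None`; what a real processor does on a miss (for example, invoking a conversion step) is outside what the text describes.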
In addition to native and guest branches, processors can encounter a variety of branch instruction types that, because of complex program control flows, can make it challenging to supply multiple cache lines of instructions to the processors' scheduler and execution units during each clock cycle. Such instructions can include what are termed “far branch” instructions and “near branch” instructions (e.g., loop instructions). Far branch instructions are instructions that alter the flow of instruction execution in a program such that execution jumps outside of the current cache line. Loop instructions are instructions that include a sequence of statements that is specified only once but carried out several times in succession before the loop is exited (and can involve jumps within a cache line).
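The far/near distinction drawn above depends on whether a branch target falls in the same cache line as the branch itself. A minimal sketch of that test follows; the 64-byte line size and the function names are assumptions for illustration and are not taken from the text.

```python
CACHE_LINE_BYTES = 64  # assumed cache line size; real processors vary


def cache_line_index(addr: int, line_bytes: int = CACHE_LINE_BYTES) -> int:
    """Return the index of the cache line that contains addr."""
    return addr // line_bytes


def classify_branch(branch_addr: int, target_addr: int,
                    line_bytes: int = CACHE_LINE_BYTES) -> str:
    """Classify a branch as 'near' (target in the same cache line) or 'far'."""
    same_line = (cache_line_index(branch_addr, line_bytes)
                 == cache_line_index(target_addr, line_bytes))
    return "near" if same_line else "far"


# A short backward jump, as at the end of a small loop, stays in one line.
loop_branch = classify_branch(branch_addr=0x130, target_addr=0x104)
# A jump to a distant address lands in a different line.
distant_branch = classify_branch(branch_addr=0x104, target_addr=0x240)
```

Here `loop_branch` evaluates to `"near"` and `distant_branch` to `"far"`, because 0x104 and 0x130 share a 64-byte line while 0x240 does not.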
In pipelined architectures, multiple sequential instructions are executed simultaneously. However, the pipeline can only be fully utilized if the processor is able to read a next instruction from memory on every cycle. To do so, the processor must know which instruction is to be read next. When a far branch instruction is encountered, the processor may not know ahead of time which path will be taken, and thus which instruction is to be read next. In such instances, the processor has to stall until the branch is resolved. Such stalls can degrade pipeline utilization and negatively impact processor performance.
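The cost of such stalls can be illustrated with a simplified cycle-count model. This is a textbook-style idealization, not a description of any particular processor: it assumes one instruction issued per cycle, a fixed stall penalty per unresolved branch, and no other hazards. The function names, parameters, and numbers are hypothetical.

```python
def pipeline_cycles(num_instructions: int, pipeline_depth: int,
                    unresolved_branches: int, stall_cycles_per_branch: int) -> int:
    """Estimate total cycles for an idealized single-issue pipeline.

    Assumes one instruction enters the pipeline per cycle except while
    stalled on an unresolved branch (illustrative model only).
    """
    if num_instructions == 0:
        return 0
    fill_cycles = pipeline_depth - 1  # cycles before the first instruction completes
    stall_cycles = unresolved_branches * stall_cycles_per_branch
    return num_instructions + fill_cycles + stall_cycles


def utilization(num_instructions: int, total_cycles: int) -> float:
    """Fraction of cycles in which an instruction completes."""
    return num_instructions / total_cycles if total_cycles else 0.0


no_stalls = pipeline_cycles(100, pipeline_depth=5,
                            unresolved_branches=0, stall_cycles_per_branch=3)
with_stalls = pipeline_cycles(100, pipeline_depth=5,
                              unresolved_branches=10, stall_cycles_per_branch=3)
```

Under these assumed numbers, 100 instructions take 104 cycles without stalls and 134 cycles when ten branches each stall the pipeline for three cycles, so utilization drops from about 96% to about 75%.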
Additionally, in some conventional processors, when a loop is encountered, instructions of the loop that are required to be repeated in successive iterations of the loop may need to be accessed in different clock cycles. This requirement can limit the number of instructions that can be forwarded per cycle. Accordingly, such processors can exhibit unsatisfactory latencies attributable to the delays in reading instructions from memory.
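The effect on instructions forwarded per cycle can be sketched with a rough model. It assumes, purely for illustration, that a frontend restricted in this way can deliver instructions from only one loop iteration per clock cycle, and compares that with an idealized frontend that always fills its full fetch width. The function names and the specific numbers are hypothetical.

```python
import math


def fetch_cycles_one_iteration_per_cycle(body_len: int, iterations: int,
                                         fetch_width: int) -> int:
    """Cycles needed when each cycle can supply instructions from one iteration only."""
    cycles_per_iteration = math.ceil(body_len / fetch_width)
    return iterations * cycles_per_iteration


def fetch_cycles_ideal(body_len: int, iterations: int, fetch_width: int) -> int:
    """Cycles needed if every cycle could be filled to the full fetch width."""
    return math.ceil(body_len * iterations / fetch_width)


# A 3-instruction loop body, 8 iterations, and a frontend 8 instructions wide.
restricted = fetch_cycles_one_iteration_per_cycle(body_len=3, iterations=8, fetch_width=8)
ideal = fetch_cycles_ideal(body_len=3, iterations=8, fetch_width=8)
```

With these assumed values the restricted frontend needs 8 cycles (3 instructions per cycle) while the idealized one needs 3 cycles (8 instructions per cycle), which illustrates how small loops can leave most of a wide frontend idle.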