The present invention relates to the parallel decoding of multiple instructions, and in particular to the use of a branch prediction cache in connection with such parallel decoding.
A typical processing circuit fetches the next instruction block from memory in order to execute the next instruction in a program. Since instructions may vary in length, a block of instructions is typically fetched into an instruction buffer which is larger than the largest instruction length. The processor must then determine which of the bits in the fetched block belong to the instruction. In other words, the instruction length must be determined. If the instruction is short enough that the second sequential instruction is also within the same block, the length of that second instruction must also be determined, as well as its starting bit. The second instruction must then be aligned before being presented to the decoding logic for execution of the instruction.
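The length determination and alignment steps described above can be illustrated with a minimal sketch. It assumes a purely hypothetical encoding in which the low two bits of an instruction's first byte give its length in bytes minus one, and a hypothetical 8-byte instruction buffer; neither assumption comes from any particular instruction set, and the function names are illustrative only.

```python
from typing import Optional, Tuple

# Hypothetical buffer size in bytes, chosen to exceed the longest instruction (4 bytes here).
BLOCK_SIZE = 8


def instruction_length(first_byte: int) -> int:
    """Hypothetical encoding: the low two bits of the first byte hold (length - 1)."""
    return (first_byte & 0b11) + 1


def split_block(block: bytes) -> Tuple[bytes, Optional[bytes]]:
    """Return the first instruction and, if it lies wholly in the block, the second.

    Each returned instruction starts at offset 0 of its own byte string,
    which models the alignment step performed before decoding.
    """
    first_len = instruction_length(block[0])
    first = block[:first_len]

    second_start = first_len
    if second_start >= len(block):
        return first, None  # no room left for a second instruction

    second_len = instruction_length(block[second_start])
    if second_start + second_len > len(block):
        return first, None  # second instruction spills past the block

    second = block[second_start:second_start + second_len]
    return first, second
```

For example, with a block whose first byte encodes a 2-byte instruction and whose third byte encodes a 3-byte instruction, `split_block` returns both instructions, each aligned to its own starting byte. Note that the second instruction's starting position cannot be known until the first instruction's length has been determined.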
The instruction is examined, and if it is a jump or branch, the processor will fetch the instruction branched to. This is typically done even before it is determined whether the branch is to be taken, so as to reduce the delay. This involves fetching an instruction other than the next sequential instruction, so the instruction buffer must be reloaded. Often, a branch target cache is used for this purpose. The cache stores the instructions branched to (the target instructions) for the most recently executed branches.
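A branch target cache of the kind described above can be sketched as a small table keyed by the address of the branch instruction. The sketch below is an illustration under stated assumptions, not a description of any particular hardware: the class name, the capacity, and the least-recently-used replacement policy are all hypothetical choices made to approximate "the most recently executed branches."

```python
from collections import OrderedDict
from typing import Optional


class BranchTargetCache:
    """Toy model: maps a branch address to the bytes of its target instruction."""

    def __init__(self, capacity: int = 4) -> None:
        # Capacity is a hypothetical parameter; real caches are sized in hardware.
        self.capacity = capacity
        self._entries: "OrderedDict[int, bytes]" = OrderedDict()

    def lookup(self, branch_addr: int) -> Optional[bytes]:
        """Return the cached target instruction bytes, or None on a miss."""
        target = self._entries.get(branch_addr)
        if target is not None:
            # Mark this branch as recently used.
            self._entries.move_to_end(branch_addr)
        return target

    def record(self, branch_addr: int, target_bytes: bytes) -> None:
        """Store the target instruction for a branch that has just executed."""
        self._entries[branch_addr] = target_bytes
        self._entries.move_to_end(branch_addr)
        if len(self._entries) > self.capacity:
            # Evict the least recently used entry (assumed policy).
            self._entries.popitem(last=False)
```

On a hit, the target instruction is available without reloading the instruction buffer from memory; on a miss, the target must be fetched and can then be recorded for the next time the same branch is encountered.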
One way to increase the processing speed is to decode two instructions in parallel. To do so, the lengths of the two instructions and their starting addresses must be determined, and then the instructions can be separated and aligned for decoding with parallel decoding circuitry. A problem arises when branch instructions are encountered, since the branch will cause a delay, or bubble, in the pipeline between the instruction buffer and the decoding circuitry. This delay is caused by the need to determine the length of the target instruction and to align the next sequential instruction after the target instruction.
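The cost of the bubble can be made concrete with a toy cycle-counting model. It assumes, purely for illustration, that two instructions are decoded per cycle and that each taken branch costs one extra cycle while the target instruction's length is determined and the instruction following the target is aligned. The function name and the one-cycle penalty are hypothetical and are not taken from any specific processor.

```python
from typing import List


def count_decode_cycles(is_taken_branch: List[bool], bubble_cycles: int = 1) -> int:
    """Count decode cycles for an executed instruction trace.

    is_taken_branch[i] is True when the i-th executed instruction is a
    taken branch. Up to two instructions are decoded per cycle. A taken
    branch ends its decode group, and the next group waits bubble_cycles
    (an assumed penalty) for length determination and alignment.
    """
    cycles = 0
    i = 0
    n = len(is_taken_branch)
    while i < n:
        cycles += 1
        if is_taken_branch[i]:
            # The branch decodes alone; its target is not yet aligned.
            cycles += bubble_cycles
            i += 1
        elif i + 1 < n:
            # Two instructions decode together in this cycle.
            if is_taken_branch[i + 1]:
                cycles += bubble_cycles
            i += 2
        else:
            # A single trailing instruction.
            i += 1
    return cycles
```

In this model, four non-branch instructions decode in two cycles, while the same four-instruction trace with a taken branch in the second position takes three, which is the delay that parallel decoding would otherwise have avoided.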