Many modern pipelined microprocessors include a branch target address cache (BTAC) that caches target addresses of previously executed branch instructions. When a cache line is fetched from the microprocessor's instruction cache, the fetch address is provided to the BTAC and the BTAC uses the fetch address to predict whether there is a branch instruction present in the cache line, and whether the BTAC contains a valid target address for the branch instruction. If the branch instruction is predicted taken, the processor branches to the valid target address supplied by the BTAC. Since each cache line can store multiple instructions, the instruction cache line may contain more than one branch instruction. Consequently, some BTACs statically dedicate storage for caching two target addresses per cache line. This allows the BTAC to more accurately predict program flow since it is possible that one of the branch instructions in the cache line will be taken and the other not taken.
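By way of illustration only, the conventional arrangement described above can be sketched as a small software model. The sketch is not taken from any particular microprocessor. The names `StaticBtac`, `BtacEntry`, and `TargetSlot`, the 16-byte line size, the direct-mapped organization, and the single "last seen direction" bit used in place of a real direction predictor are all assumptions made for the example. What it is meant to show is that each entry statically reserves two target address slots whether or not both are needed.

```python
from dataclasses import dataclass, field
from typing import List, Optional

LINE_BYTES = 16  # assumed cache line size; the text does not specify one


@dataclass
class TargetSlot:
    valid: bool = False
    offset: int = 0      # byte offset of the branch within its cache line
    target: int = 0      # cached target address
    taken: bool = False  # last seen direction (stand-in for a real predictor)


@dataclass
class BtacEntry:
    tag: Optional[int] = None
    # Two slots are always allocated, even if the line has one branch or none.
    slots: List[TargetSlot] = field(
        default_factory=lambda: [TargetSlot(), TargetSlot()]
    )


class StaticBtac:
    """Hypothetical direct-mapped BTAC with two static target slots per entry."""

    def __init__(self, num_entries: int = 64):
        self.entries = [BtacEntry() for _ in range(num_entries)]

    def _index_tag(self, addr: int):
        line = addr // LINE_BYTES
        return line % len(self.entries), line // len(self.entries)

    def update(self, branch_addr: int, target: int, taken: bool) -> None:
        """Record an executed branch and its target address."""
        idx, tag = self._index_tag(branch_addr)
        entry = self.entries[idx]
        if entry.tag != tag:
            # A different cache line maps here: replace the whole entry.
            entry = self.entries[idx] = BtacEntry(tag=tag)
        offset = branch_addr % LINE_BYTES
        # Reuse the slot already holding this branch, else the first free
        # slot, else overwrite slot 0 (an arbitrary replacement choice).
        slot = next((s for s in entry.slots if s.valid and s.offset == offset), None)
        if slot is None:
            slot = next((s for s in entry.slots if not s.valid), entry.slots[0])
        slot.valid, slot.offset, slot.target, slot.taken = True, offset, target, taken

    def predict(self, fetch_addr: int) -> Optional[int]:
        """Return a predicted target for the fetched line, or None."""
        idx, tag = self._index_tag(fetch_addr)
        entry = self.entries[idx]
        if entry.tag != tag:
            return None  # no branch is predicted in this cache line
        start = fetch_addr % LINE_BYTES
        candidates = [
            s for s in entry.slots if s.valid and s.taken and s.offset >= start
        ]
        if not candidates:
            return None
        # The first predicted-taken branch at or after the fetch offset wins.
        return min(candidates, key=lambda s: s.offset).target
```

For example, if a not-taken branch at 0x1004 and a taken branch at 0x100C are both recorded with `update`, then `predict(0x1000)` skips the first branch and returns the target cached for the second. This is the case in which holding two target addresses for the same cache line is useful.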
In conventional BTACs, the storage for the two target addresses is fixed. That is, the space is statically dedicated regardless of whether the cache line contains two branch instructions or only one. In fact, in one conventional BTAC that is integrated into the instruction cache, the space is statically dedicated even if the cache line contains no branch instructions. However, it has been observed that only approximately 20% of the cache lines that contain a branch instruction contain two or more branch instructions. Consequently, the extra space in the BTAC statically dedicated for the second target address is wasted for approximately 80% of those cache lines. For example, in a BTAC that is a 2-way set associative cache that statically dedicates storage for two target addresses per entry, about 80% of the entries hold one valid target address and about 20% hold two, so only about 60% of the target address storage space is used to store valid target addresses.
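The 60% figure follows from a simple weighted average, which the short sketch below reproduces. The function name `target_slot_utilization` is hypothetical, and the calculation assumes that every entry corresponds to a cache line containing at least one branch instruction and that lines with two or more branches fill both slots.

```python
def target_slot_utilization(frac_two_or_more: float, slots_per_entry: int = 2) -> float:
    """Fraction of statically dedicated target slots that hold valid targets."""
    frac_one = 1.0 - frac_two_or_more
    # Entries for single-branch lines use one slot; the rest use both.
    used_slots_per_entry = frac_one * 1 + frac_two_or_more * slots_per_entry
    return used_slots_per_entry / slots_per_entry


# With about 20% of branch-containing lines holding two or more branches:
# (0.8 * 1 + 0.2 * 2) / 2 = 1.2 / 2 = 0.6
print(f"{target_slot_utilization(0.20):.0%}")  # prints 60%
```

Under the same assumptions, the remaining 40% of the target address storage is reserved but holds no valid target address.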
Therefore, what is needed is a more space-efficient scheme for predicting multiple branch instructions in a fetched cache line.