1. Technical Field
This disclosure relates to processors, and more specifically to preventing branch predictor corruption in out-of-order processors.
2. Description of the Related Art
Modern superscalar microprocessors achieve high performance by executing multiple instructions in parallel and out of program order. Control transfer instructions (CTIs) such as branches, calls, and returns, which are highly prevalent in programs, can cause pipelined microprocessors to stall because the instructions to be executed after a CTI are not known until the CTI is executed. These stalls can result in significant loss of performance.
Modern microprocessors employ branch prediction techniques to speculatively fetch and execute instructions beyond CTIs. Branch prediction involves predicting the direction and the target of the CTI. If the CTI is mispredicted either due to the direction prediction or the target prediction being incorrect, then all instructions speculatively fetched beyond the CTI are thrown away (flushed), and new instructions are fetched by the Instruction Fetch Unit (IFU) from the correct path. Also, upon detection of a mispredicted CTI, a branch predictor is typically updated using the actual results of the CTI to enhance its future prediction accuracy. In some microprocessors, the branch predictor is updated with the results of every CTI.
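For illustration only, the direction-prediction and update steps described above can be sketched in software. The sketch assumes a table of 2-bit saturating counters, which is one common textbook scheme and is not stated here to be the predictor of any particular microprocessor. The names `TwoBitPredictor` and `run_branch`, the table size, and the 4-byte instruction assumption are hypothetical. Target prediction and the flush itself are omitted.

```python
class TwoBitPredictor:
    """Direction predictor: one 2-bit saturating counter per table entry."""

    def __init__(self, entries=1024):
        self.entries = entries
        # Counter values 0-1 predict not taken; 2-3 predict taken.
        # Every entry starts at 1 (weakly not taken).
        self.counters = [1] * entries

    def _index(self, pc):
        # Assumption: 4-byte instructions, so the low 2 PC bits are dropped.
        return (pc >> 2) % self.entries

    def predict(self, pc):
        """Return True if the CTI at pc is predicted taken."""
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        """Train the counter with the actual direction of the CTI."""
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def run_branch(predictor, pc, actual_taken):
    """Predict, resolve, and train on one CTI; return True on a mispredict."""
    mispredicted = predictor.predict(pc) != actual_taken
    # This sketch trains on every CTI, as some microprocessors do.
    predictor.update(pc, actual_taken)
    return mispredicted
```

In this sketch, a CTI that is taken twice in a row moves its counter from weakly not taken to strongly taken, so a single later not-taken outcome does not immediately reverse the prediction.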
If a CTI is mispredicted, the instructions speculatively fetched beyond it may include further CTIs that are younger than the mispredicted CTI. Before flushing speculative instructions beyond the CTI, processors must execute instructions older than the CTI. While waiting for such older instructions to execute, younger CTIs from a mispredicted speculative execution path may degrade branch predictor accuracy if allowed to update the branch predictor. Such younger CTIs may also cause spurious instruction flushes and/or incorrect updates to the IFU.
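The hazard described above can be pictured with a small model. The following is a minimal sketch under stated assumptions: each CTI carries a hypothetical program-order sequence number `seq` (smaller means older), predictor state is a list of 2-bit saturating counters, and the optional age-based filter is an illustrative assumption rather than a mechanism described in this section. The function name `apply_updates` and the sample `stream` are likewise hypothetical.

```python
def apply_updates(counters, resolved_ctis, mispredict_seq=None):
    """Apply resolved CTI outcomes to a table of 2-bit counters.

    resolved_ctis: list of (seq, index, taken) tuples in execution order;
    a smaller seq means an older instruction in program order.
    mispredict_seq: seq of a CTI already known to be mispredicted, or None
    to model a design that applies every update unconditionally.
    """
    for seq, index, taken in resolved_ctis:
        # Hypothetical filter: CTIs younger than the mispredicted CTI were
        # fetched from the wrong path, so their outcomes are ignored.
        if mispredict_seq is not None and seq > mispredict_seq:
            continue
        step = 1 if taken else -1
        counters[index] = max(0, min(3, counters[index] + step))
    return counters


# CTI 10 is mispredicted (predicted taken, actually not taken). CTIs 11 and
# 12 are younger wrong-path CTIs that execute before the flush occurs.
stream = [(10, 0, False), (11, 1, False), (12, 1, False)]

unfiltered = apply_updates([3, 3], stream)                   # [2, 1]
filtered = apply_updates([3, 3], stream, mispredict_seq=10)  # [2, 3]
```

Without the filter, the two wrong-path outcomes move entry 1 from strongly taken to weakly not taken even though those CTIs never architecturally executed. With the filter, only the mispredicted CTI's own outcome trains the table.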
Modern microprocessors commonly implement chip-level multithreading (CMT) to improve performance. In CMT processors, multiple software threads are concurrently active in the processor, and each active thread has dedicated hardware resources to store its state. Efficient execution of instructions from multiple software threads requires the ability to predict CTIs from different threads. Execution of multiple threads on CMT processors may cause execution of CTIs from different threads to be interleaved.