1. Field of the Invention
This invention relates to multithreaded processors of the type having a hardware scheduling mechanism for interleaving execution of program instructions from a plurality of program threads. More particularly, this invention relates to the efficient provision of a branch prediction mechanism within such multithreaded processors.
2. Description of the Prior Art
It is known to provide multithreaded processors in which program instructions from a plurality of program threads are interleaved for execution by a hardware scheduling mechanism. Such techniques are useful in improving the overall performance of a processor: while each thread may execute more slowly than if it had exclusive use of the processor resources, the combined processing performed by all threads normally exceeds that which could be achieved by a single thread. When one thread is stalled (such as due to a data interlock or a memory abort), another thread can continue processing and utilise what would otherwise be unused processor cycles.
Another technique used within high performance processors is a branch prediction mechanism. In a highly pipelined processor, program instructions are fetched from memory and start progressing along the instruction pipeline before it is determined whether or not a conditional branch instruction will be taken. A taken conditional branch changes the program flow and accordingly the sequence of instructions which should be fetched following that conditional branch instruction. In order to reduce the probability of incorrect instructions being fetched, it is known to provide mechanisms which seek to predict whether or not a particular conditional branch instruction will result in the branch being taken. Various techniques exist for performing such branch prediction.
One known technique of branch prediction is to use a history register which stores a pattern indicating the behaviour of previously encountered conditional branch instructions, i.e. whether those branch instructions were taken or not taken. That stored pattern can then be used as an index into a history table which stores a prediction associated with each pattern of preceding branch behaviour. It is found that there is a strong correlation between preceding branch behaviour and the outcome of a newly encountered conditional branch instruction. A particular path through a program will have a distinctive pattern of preceding branch behaviour, and the branch outcome is strongly correlated across successive traversals of that path. Previous branch behaviour can accordingly be noted and used to generate a prediction associated with that pattern of branch behaviour, as represented by the history register value.
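By way of illustration only, the history register and history table arrangement described above can be sketched in software as follows. The class name `HistoryPredictor`, the 4-bit history width and the use of 2-bit saturating counters as the stored prediction are assumptions made for this sketch; the text above does not specify them.

```python
class HistoryPredictor:
    """Illustrative history-register predictor (a sketch, not a definitive design)."""

    def __init__(self, history_bits=4):
        # history_bits is an assumed width; the source text gives none.
        self.mask = (1 << history_bits) - 1
        # History register: most recent outcome in bit 0, 1 = taken.
        self.history = 0
        # One entry per history pattern. Each entry is an assumed 2-bit
        # saturating counter: 0-1 predict not taken, 2-3 predict taken.
        self.table = [1] * (1 << history_bits)

    def predict(self):
        # The history register value is used directly as the table index.
        return self.table[self.history] >= 2

    def update(self, taken):
        # Train the entry selected by the current history pattern...
        counter = self.table[self.history]
        self.table[self.history] = min(3, counter + 1) if taken else max(0, counter - 1)
        # ...then shift the actual outcome into the history register.
        self.history = ((self.history << 1) | int(taken)) & self.mask


def run(predictor, outcomes):
    """Feed a sequence of outcomes (True = taken); return the number predicted correctly."""
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct
```

With a repeating taken, taken, not taken sequence, each 4-bit history pattern is always followed by the same outcome, so after a short warm-up such a predictor would be expected to predict every branch in the sequence correctly.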
In the context of multithreaded processors, the behaviour of the different program threads with respect to their preceding branch behaviour and branch prediction will be substantially independent. A particular pattern of preceding branch behaviour for one thread will have one predicted behaviour associated with it, whereas the same preceding pattern of branch behaviour for another thread could have a quite different and independent predicted behaviour. One solution to this problem would be to provide separate history tables for storing the predicted behaviour, indexed by separate history values representing preceding branch behaviour. However, the provision of separate history tables is inefficient in terms of gate count, circuit area, power consumption, cost etc.
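The effect of two threads attaching opposite outcomes to the same history pattern can be illustrated with a small simulation. This is a sketch under stated assumptions: the function name `simulate`, the round-robin interleaving, the 2-bit history width and the 2-bit saturating counters are all hypothetical choices for illustration, and the `single_table` flag simply selects between one table per thread and one table indexed by every thread.

```python
def simulate(streams, history_bits=2, single_table=False):
    """Count mispredictions for interleaved threads.

    streams: one list of outcomes per thread (True = taken), all of equal length.
    single_table: if True every thread indexes the same table; otherwise each
    thread has its own table. Each thread always keeps its own history register.
    """
    size = 1 << history_bits
    mask = size - 1
    table_count = 1 if single_table else len(streams)
    # Assumed 2-bit saturating counters, initialised to weakly not taken.
    tables = [[1] * size for _ in range(table_count)]
    histories = [0] * len(streams)
    mispredictions = 0
    for step in range(len(streams[0])):
        # Assumed round-robin interleaving: one branch per thread per step.
        for tid, stream in enumerate(streams):
            table = tables[0] if single_table else tables[tid]
            index = histories[tid]
            taken = stream[step]
            if (table[index] >= 2) != taken:
                mispredictions += 1
            table[index] = min(3, table[index] + 1) if taken else max(0, table[index] - 1)
            histories[tid] = ((index << 1) | int(taken)) & mask
    return mispredictions


# Thread A always takes its branch; thread B repeats taken, taken, not taken.
# After two taken branches (history 11), A is taken next while B is not taken.
thread_a = [True] * 60
thread_b = [True, True, False] * 20
separate = simulate([thread_a, thread_b], single_table=False)
combined = simulate([thread_a, thread_b], single_table=True)
```

In this example the per-thread tables mispredict only during warm-up, whereas with a single table the entry for history 11 is repeatedly pulled in opposite directions by the two threads, so `combined` comes out larger than `separate`. The price of the per-thread arrangement is that table storage grows in proportion to the number of threads.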
Another solution would be to make the different threads share a common global history table and rely upon the history register values for one thread being unlikely to correspond to the history register values for another thread, such that the predictions for those two threads do not compete for the same prediction value storage location within the shared global history table. This might seem a reasonable approach, since branch predictions are in any case not perfect, significant numbers of mispredictions do arise, and mechanisms for recovering from such mispredictions are consequently already provided. A further problem, however, is that in practice some forms of preceding branch behaviour are statistically more common than others, e.g. it has been observed that taken branches represent approximately 70% of the real life total with not taken branches representing approximately 30% of the real life total. Accordingly, multiple program threads in practice compete to use the more popular index locations within such a shared global history table, making the undesired overwriting of one prediction with a different prediction from a different thread more common than might be expected purely from the size of the global history table.
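A rough feel for this skew can be obtained with a short calculation. The sketch below assumes, purely for illustration, that each branch is independently taken with probability 0.7 and that the history register is 4 bits wide; neither assumption comes from the text above, and real branch outcomes are correlated. The function names are hypothetical.

```python
from itertools import product


def pattern_probability(pattern, p_taken=0.7):
    """Probability of a history pattern such as "1101" (1 = taken),
    assuming independent branches each taken with probability p_taken."""
    probability = 1.0
    for bit in pattern:
        probability *= p_taken if bit == "1" else 1.0 - p_taken
    return probability


def index_distribution(history_bits=4, p_taken=0.7):
    """Map every possible history pattern to its probability under the same assumption."""
    patterns = ("".join(bits) for bits in product("01", repeat=history_bits))
    return {pattern: pattern_probability(pattern, p_taken) for pattern in patterns}


distribution = index_distribution(4)
hottest = max(distribution, key=distribution.get)      # the all-taken pattern "1111"
uniform_share = 1.0 / len(distribution)                 # 1/16 if all indices were equally likely
```

Under these assumptions the all-taken pattern accounts for 0.7^4, or about 24%, of table lookups, against 6.25% if the sixteen index locations were used uniformly. Threads sharing the table would therefore be expected to collide on that location far more often than the table size alone suggests.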