Temporal multithreading is known in the art as a technique that uses one set of execution resources to execute multiple “programs,” or “threads.” These execution resources often include an array of pipeline execution units. Instructions for a program thread are processed through the pipeline until they stall; in a “switch” event, the stalled instructions are then removed and instructions from another thread are injected into the same pipeline so that the execution resources are used efficiently.
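By way of illustration only, the switch-on-stall behavior described above may be modeled in software. The following is a minimal sketch, not a description of any particular hardware: the function name `run_temporal_multithreading`, the `STALL` marker, and the round-robin choice of the next thread are all assumptions introduced for this example. The sketch also assumes that a stall has resolved by the time the stalled thread is scheduled again.

```python
from collections import deque

# Hypothetical marker for an instruction that stalls the pipeline.
STALL = "STALL"


def run_temporal_multithreading(threads):
    """Model one pipeline shared by several threads.

    `threads` maps a thread name to its list of instructions. The active
    thread issues instructions until it reaches a STALL marker; a switch
    event then removes it from the pipeline and the next ready thread
    (round-robin, an assumption of this sketch) is injected.

    Returns (trace, switches): the issue order as (thread, instruction)
    pairs and the number of switch events.
    """
    ready = deque((name, deque(instrs)) for name, instrs in threads.items())
    trace = []
    switches = 0
    while ready:
        name, instrs = ready.popleft()
        stalled = False
        while instrs and not stalled:
            op = instrs.popleft()
            if op == STALL:
                stalled = True          # the thread leaves the pipeline here
            else:
                trace.append((name, op))
        if stalled:
            switches += 1               # one switch event per stall
        if instrs:
            ready.append((name, instrs))  # resume later, after other threads
    return trace, switches
```

For example, with thread A holding `["a1", STALL, "a2"]` and thread B holding `["b1", "b2", STALL, "b3"]`, the single pipeline issues a1, b1, b2, a2, b3 in that order, with two switch events.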
Temporal multithreading thus gives the appearance of multiple central processing units (“CPUs”). Each thread is processed through the execution units as if its program had sole control of them; hardware control logic activates and deactivates the various threads in response to switch events in an attempt to maximize utilization of the execution units.
Each switch event carries a performance penalty. Accordingly, the prior art has developed certain objective criteria for triggering a switch event. In one example, a cache miss triggers a switch event because the processor must acquire data from main memory. In another example, a time-out counter counts the cycles of a thread's execution and forces an automatic switch when the thread's execution duration exceeds a set bound.
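The two prior-art criteria named above, a cache miss and a time-out counter, may be sketched as a per-cycle decision. This is a hedged illustration under stated assumptions: the class `SwitchLogic`, its fields `timeout_limit` and `cycles_run`, the helper `total_penalty`, and the fixed per-switch cost `switch_penalty` are hypothetical names chosen for the example, and giving the cache miss priority over the time-out is likewise an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SwitchLogic:
    """Decide, cycle by cycle, whether a switch event should occur."""
    timeout_limit: int   # hypothetical bound on cycles one thread may run
    cycles_run: int = 0  # time-out counter for the currently active thread

    def step(self, cache_miss: bool) -> Optional[str]:
        """Advance one cycle; return the switch reason, or None."""
        self.cycles_run += 1
        if cache_miss:
            reason = "cache_miss"   # data must come from main memory
        elif self.cycles_run >= self.timeout_limit:
            reason = "timeout"      # thread exceeded its allowed duration
        else:
            return None
        self.cycles_run = 0         # the next thread starts a fresh count
        return reason


def total_penalty(events: List[Optional[str]], switch_penalty: int) -> int:
    """Total cycles lost to switching, assuming a fixed cost per switch."""
    return switch_penalty * sum(event is not None for event in events)
```

With `timeout_limit=3` and a cache miss on the second of six cycles, this sketch reports a cache-miss switch on cycle two and a time-out switch on cycle five; at an assumed cost of four cycles per switch, eight cycles are lost to switching.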
There is a need to further reduce the negative effects of switch events in high-performance processors. Reducing the number of switch events, or improving how they are processed, increases processor performance by improving instruction processing efficiency across multiple threads. One feature of the invention is therefore to provide a processor with intelligent logic for efficiently processing and switching multi-threaded programs through the processor. Several objects and other features of the invention are apparent within the description that follows.