Advances in semiconductor processing and logic design have permitted an increase in the amount of logic that may be present on integrated circuit devices. As a result, computer system configurations have evolved from a single or multiple integrated circuits in a system to multiple cores and multiple logical processors present on individual integrated circuits. A processor or integrated circuit typically comprises a single processor die, where the processor die may include any number of processing elements, such as cores, threads, and/or logical processors.
Increasingly, multithreading is supported in hardware. For instance, in one approach, processors in a multi-processor system, such as a chip multiprocessor (“CMP”) system, may each act on one of multiple software threads concurrently. In another approach, referred to as simultaneous multithreading (“SMT”), a single physical processor is made to appear as multiple logical processors to operating systems and user programs. For SMT, multiple software threads can be active and execute simultaneously on a single processor without switching. That is, each logical processor maintains a complete set of the architecture state, but many other resources of the physical processor, such as caches, execution units, branch predictors, control logic, and buses, are shared. For SMT, the instructions from multiple software threads thus execute concurrently on each logical processor.
Processors have to deal with a variety of events, such as, for example, faults, traps, assists, and interrupts, and dedicate a good amount of logic to handling them. That logic becomes more complicated if the processor is an out-of-order processor and supports SMT. In every cycle, a core may have to process a significant number of events. These events are either internal (usually related to execution of instructions) or external (e.g., interrupts). A conventional multi-threaded processor supports 2-way SMT and has deployed solutions for these problems, but those solutions were based on logic that either was not directly scalable to a larger number of threads or incurred significant logic replication. For example, event evaluation and prioritization were done for all threads in parallel, which requires replication of the event logic for each of the threads. In addition, the logic looks at “what the other thread is doing” and assumes only 2 threads are present in the system (e.g., a computer system or electronic device). For example, when a thread is sleeping, the logic decides whether to initiate an event process by checking whether the other thread has finished its exclusive access to certain processing resources (e.g., global registers).
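The scaling problem with the “other thread” check can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the logic of any actual processor: the names ThreadState, holds_exclusive, can_start_event_2way, and can_start_event_nway are hypothetical, and the hardware behavior is reduced to a few boolean flags. It contrasts a check that consults only the single sibling thread with one that consults every other thread.

```python
from dataclasses import dataclass


@dataclass
class ThreadState:
    # All fields are hypothetical stand-ins for per-thread hardware state.
    sleeping: bool = False
    pending_event: bool = False
    holds_exclusive: bool = False  # thread has exclusive access to global registers


def can_start_event_2way(threads, tid):
    """2-way style check: only 'the other thread' (tid ^ 1) is consulted."""
    me = threads[tid]
    other = threads[tid ^ 1]
    return me.sleeping and me.pending_event and not other.holds_exclusive


def can_start_event_nway(threads, tid):
    """N-way style check: every other thread must have released exclusive access."""
    me = threads[tid]
    others_clear = all(
        not t.holds_exclusive for i, t in enumerate(threads) if i != tid
    )
    return me.sleeping and me.pending_event and others_clear


# Four hardware threads: thread 0 sleeps with an event pending,
# while thread 2 still holds exclusive access to the global registers.
threads = [ThreadState() for _ in range(4)]
threads[0].sleeping = True
threads[0].pending_event = True
threads[2].holds_exclusive = True

two_way = can_start_event_2way(threads, 0)  # consults thread 1 only
n_way = can_start_event_nway(threads, 0)    # consults threads 1, 2, and 3
```

In this model the 2-way check never looks at thread 2, so it would allow the event process to start even though another thread still holds the shared resource, whereas the N-way check would not. In hardware, generalizing the check in this manner is one source of the replication and complexity described above.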