The present invention relates generally to the field of multithreaded processors and, more specifically, to a method and apparatus for disabling a clock signal within a multithreaded (MT) processor.
Multithreaded (MT) processor design has recently come to be regarded as an increasingly attractive means of improving processor performance. Multithreading within a processor, inter alia, provides the potential for more effective utilization of various processor resources, and particularly for more effective utilization of the execution logic within a processor. Specifically, by feeding multiple threads to the execution logic of a processor, clock cycles that would otherwise have been idle due to a stall or other delay in the processing of a particular thread may be utilized to service a further thread. A stall in the processing of a particular thread may result from a number of occurrences within a processor pipeline. For example, a cache miss or a branch misprediction (i.e., a long-latency operation) for an instruction included within a thread typically results in the processing of the relevant thread stalling. The negative effect of long-latency operations on execution logic efficiencies is exacerbated by the recent increases in execution logic throughput that have outstripped advances in memory access and retrieval rates.
Multithreaded computer applications are also becoming increasingly common in view of the support provided to such multithreaded applications by a number of popular operating systems, such as the Windows NT and Unix operating systems. Multithreaded computer applications are particularly efficient in the multimedia arena.
Multithreaded processors may broadly be classified into two categories (i.e., fine or coarse designs) according to the thread interleaving or switching scheme employed within the relevant processor. Fine multithreaded designs support multiple active threads within a processor and typically interleave two different threads on a cycle-by-cycle basis. Coarse multithreaded designs typically interleave the instructions of different threads on the occurrence of some long-latency event, such as a cache miss. A coarse multithreaded design is discussed in Eickemeyer, R.; Johnson, R.; et al., "Evaluation of Multithreaded Uniprocessors for Commercial Application Environments", The 23rd Annual International Symposium on Computer Architecture, pp. 203-212, May 1996. The distinctions between fine and coarse designs are further discussed in Laudon, J.; Gupta, A., "Architectural and Implementation Tradeoffs in the Design of Multiple-Context Processors", Multithreaded Computer Architectures: A Summary of the State of the Art, edited by R. A. Iannucci et al., pp. 167-200, Kluwer Academic Publishers, Norwell, Mass., 1994. Laudon further proposes an interleaving scheme that combines the cycle-by-cycle switching of a fine design with the full pipeline interlocks of a coarse design (or blocked scheme). To this end, Laudon proposes a "back off" instruction that makes a specific thread (or context) unavailable for a specific number of cycles. Such a "back off" instruction may be issued upon the occurrence of predetermined events, such as a cache miss. In this way, Laudon avoids having to perform an actual thread switch by simply making one of the threads unavailable.
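By way of illustration only, the three interleaving schemes discussed above may be sketched as simple cycle-by-cycle schedules. The sketch below is a simplified software model and not a description of any cited design; the function names, the round-robin selection order, and the representation of long-latency events as cycle numbers are all assumptions made for the example.

```python
def fine_schedule(num_threads, cycles):
    """Fine design: switch to the next thread on every cycle."""
    return [cycle % num_threads for cycle in range(cycles)]


def coarse_schedule(num_threads, cycles, miss_cycles):
    """Coarse design: keep issuing from one thread and switch only after
    a long-latency event. `miss_cycles` (hypothetical) is the set of
    cycles in which the running thread incurs such an event."""
    current = 0
    order = []
    for cycle in range(cycles):
        order.append(current)
        if cycle in miss_cycles:
            current = (current + 1) % num_threads
    return order


def backoff_schedule(num_threads, cycles, backoffs):
    """Cycle-by-cycle switching with a "back off": the thread issued in a
    cycle listed in `backoffs` (hypothetical dict: cycle -> count) is made
    unavailable for that many following cycles instead of being switched."""
    unavailable_until = [0] * num_threads
    order = []
    last = -1
    for cycle in range(cycles):
        chosen = None  # None models a cycle in which no thread is available
        for offset in range(1, num_threads + 1):
            candidate = (last + offset) % num_threads
            if unavailable_until[candidate] <= cycle:
                chosen = candidate
                break
        order.append(chosen)
        if chosen is not None:
            last = chosen
            if cycle in backoffs:
                unavailable_until[chosen] = cycle + 1 + backoffs[cycle]
    return order


print(fine_schedule(2, 6))               # [0, 1, 0, 1, 0, 1]
print(coarse_schedule(2, 6, {2}))        # [0, 0, 0, 1, 1, 1]
print(backoff_schedule(2, 6, {1: 3}))    # [0, 1, 0, 0, 0, 1]
```

In the last schedule, thread 1 backs off in cycle 1 for three cycles, so thread 0 is serviced in cycles 2 through 4 and normal alternation resumes in cycle 5.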
A multithreaded architecture for a processor presents a number of further challenges in the context of an out-of-order, speculative execution processor architecture. More specifically, the handling of events (e.g., branch instructions, exceptions or interrupts) that may result in an unexpected change in the flow of an instruction stream is complicated when multiple threads are considered. In a processor where resource sharing between multiple threads is implemented (i.e., there is limited or no duplication of functional units for each thread supported by the processor), the handling of event occurrences pertaining to a specific thread is complicated in that further threads must be considered in the handling of such events.
Where resource sharing is implemented within a multithreaded processor, it is further desirable to increase utilization of the shared resources in response to changes in the state of the threads being serviced within the multithreaded processor.
According to the present invention, there is provided a method that includes maintaining an indication of a pending event with respect to each of multiple threads supported within a multithreaded processor. An indication of an active or inactive state is maintained for each of the multiple threads supported within the multithreaded processor. A clock disable condition, indicated by an indication of no pending events with respect to each of the multiple threads and an inactive state for each of the multiple threads, is detected. A clock signal, if enabled, is disabled with respect to at least one functional unit within the multithreaded processor responsive to the detection of the clock disable condition.
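The clock disable condition summarized above may be modeled, purely for illustration, as a predicate over per-thread state. The following is a minimal software sketch and not the claimed hardware; the names `ThreadState`, `clock_disable_condition` and `next_clock_enable` are hypothetical, and because this summary does not address re-enabling the clock, the sketch leaves the clock state unchanged whenever the condition is not met.

```python
from dataclasses import dataclass


@dataclass
class ThreadState:
    """Per-thread indications (hypothetical names for illustration)."""
    active: bool = False         # active or inactive state of the thread
    pending_event: bool = False  # whether an event is pending for the thread


def clock_disable_condition(threads):
    """True only when every thread is inactive and has no pending event."""
    return all(not t.active and not t.pending_event for t in threads)


def next_clock_enable(clock_enabled, threads):
    """Disable an enabled clock when the disable condition is detected.
    Re-enabling is outside this sketch, so other cases pass through."""
    if clock_enabled and clock_disable_condition(threads):
        return False
    return clock_enabled


# Both threads inactive with nothing pending: the clock is disabled.
idle = [ThreadState(), ThreadState()]
print(next_clock_enable(True, idle))      # False

# One inactive thread still has a pending event: the clock stays enabled.
waiting = [ThreadState(), ThreadState(pending_event=True)]
print(next_clock_enable(True, waiting))   # True
```

The second case shows why both indications are consulted: a thread that is inactive may still have an event pending that must be serviced, so inactivity alone is not treated as sufficient to disable the clock.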
Other features of the present invention will be apparent from the accompanying drawings and from the detailed description which follows.