As used herein, “instructions” denote basic processor commands and other operations such as floating point operations. The following patents provide useful background for the invention, and are thus incorporated herein by reference: U.S. Pat. No. 5,938,755; U.S. Pat. No. 5,903,768; U.S. Pat. No. 5,898,235; U.S. Pat. No. 5,884,061; U.S. Pat. No. 5,751,984; U.S. Pat. No. 5,684,422; U.S. Pat. No. 5,557,531; U.S. Pat. No. 5,521,834; and U.S. Pat. No. 5,452,215.
Modern processors, like the PA-8000 microprocessor by Hewlett-Packard, use “pipelining” to increase throughput at relatively low cost. Pipelining is a technique whereby the processor begins executing a second instruction before the first instruction is completed. Specifically, a pipelined processor partitions a process with “m” steps into “m” hardware stages separated by registers, which hold intermediate results. Each hardware stage thus has a stage execution circuit that performs the actual step or operation. Each pipeline stage performs one step of the process, and the stages are connected in the order in which the steps are performed. By permitting each of the “m” stages to operate concurrently, the pipelined processor can operate at substantially “m” times the rate of a processor without pipelining. When any stage completes its operation, the result is passed to the next stage; and final results emerge at the end of the pipeline.
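By way of illustration only, the throughput gain described above can be sketched numerically. The sketch below assumes an idealized pipeline in which every stage takes one clock cycle and no stalls occur; the function names are hypothetical and are not taken from any of the referenced patents. Under those assumptions, “n” instructions require n × m cycles without pipelining and m + (n − 1) cycles with pipelining, so the ratio approaches “m” as “n” grows.

```python
def unpipelined_cycles(n: int, m: int) -> int:
    """Cycles to run n instructions when each occupies all m steps before the next starts."""
    return n * m


def pipelined_cycles(n: int, m: int) -> int:
    """Cycles to run n instructions through an ideal m-stage pipeline (one issue per cycle).

    The first instruction needs m cycles to fill the pipeline; every later
    instruction completes one cycle after its predecessor.
    """
    if n == 0:
        return 0
    return m + (n - 1)


def speedup(n: int, m: int) -> float:
    """Ratio of unpipelined to pipelined cycle counts under the ideal assumptions above."""
    return unpipelined_cycles(n, m) / pipelined_cycles(n, m)


if __name__ == "__main__":
    for n in (1, 10, 1000):
        print(n, round(speedup(n, 6), 2))
```

For a single instruction the ratio is 1, since the pipeline offers no overlap; the benefit appears only when many instructions are in flight.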
Pipelines are used to accelerate execution by operating on multiple computer instructions at once. FIG. 1 shows relevant structure within one illustrative prior art pipelined processor 10; and FIG. 2 shows an exemplary six-stage pipeline 20. Consistent with later-generation processors, processor 10 issues and retires more than a single instruction 22 per clock cycle, as illustrated in FIG. 2. In the first stage, the fetch (F) stage, processor 10 tells its cache 12 which instruction 22 to next put into register pipeline 14, containing separate register columns 14a-14f and stage execution circuits 20a-20e. Pipeline stages are separated by register columns 14a-f, each holding intermediate results for respective stages of the pipeline. The many outputs of register pipeline 14 are illustratively shown as transferred to a results section 16 within processor 10 for use in further operations.
For illustrative purposes, register columns 14a-f are shown with only three registers each; while the typical length of register columns 14a-14f has many more distinct registers. A particular pipeline process transpires across a particular row of register columns 14a-14f, such as the row of registers 15b. Stage execution circuits 20a-e execute the step or operation between respective register columns 14a-f. Each stage execution circuit 20 dissipates heat associated with the step or instruction being processed at that stage. Certain steps or instructions such as floating point operations dissipate more heat in circuits 20 than other steps or instructions in circuits 20.
The next stage after the F stage is the instruction decode (ID) stage, which might, for example, indicate an “add” or “subtract” or floating point (“FP”) calculation. The ID stage also starts to acquire the operand values from the appropriate register columns 14a-14e.
Instructions are executed at the EX stage, here shown with two separate stages EX1 and EX2. Associated stage execution circuits 20c, 20d serve to process operations associated with these stages.
The memory stage (M) corresponds to a memory operation, if any; and the write stage (W) operates to write the result or float value at the sixth stage of the register pipeline 14. Results that emerge from register pipeline 14 are available to processor 10, illustratively, at result section 16.
Note that as shown in FIG. 2, two instructions 22 are clocked simultaneously for a given cycle. Thus, for example, the first two instructions 22 start at cycle 1 and complete simultaneously at cycle 6.
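The timing just described can be sketched as a small schedule calculation. This is an illustrative model only: it assumes the six stages named above (F, ID, EX1, EX2, M, W), one cycle per stage, two instructions issued per cycle, and no stalls. The name `schedule` and its parameters are hypothetical.

```python
STAGES = ("F", "ID", "EX1", "EX2", "M", "W")


def schedule(num_instructions: int, issue_width: int = 2) -> list[dict[str, int]]:
    """Return, for each instruction, the clock cycle in which it occupies each stage.

    Instructions are issued in order, issue_width per cycle, starting at cycle 1.
    """
    table = []
    for i in range(num_instructions):
        start_cycle = 1 + i // issue_width  # cycle in which the F stage occurs
        table.append({stage: start_cycle + k for k, stage in enumerate(STAGES)})
    return table


if __name__ == "__main__":
    for index, row in enumerate(schedule(4)):
        print(index, row)
```

In this model the first two instructions occupy F at cycle 1 and W at cycle 6, and the next pair follows one cycle later.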
Those skilled in the art should appreciate that other forms of pipeline processing are known. For example, Hewlett-Packard's PA-8000 processor has a two-level process, with one pipeline for instructions and a separate pipeline for floating point operations. Furthermore, the number of stages in a pipeline also varies. However, the maximum throughput of a single pipeline process is one instruction per cycle.
The aforementioned processors are typically at the heart of all personal computers, work stations and servers, i.e., computing “systems”. Often, it is desirable to have more than one such processor within a single system. However, one difficulty with adding additional processors within computing systems is in compensating for power dissipation: pipelined processors generate heat, particularly within stage execution circuits 20 and register columns 14a-14f, FIG. 1; and this heat must be dissipated by the system's cooling capabilities or the processor will fail. In the prior art, power dissipation in a pipelined processor is based upon the instantaneous dissipation of specific instructions and pipeline length integrated over time. However, the clock frequency and pipeline lengths are such that instantaneous power is not a good indicator of average power dissipation; and yet this calculated average power dissipation is used to determine the cooling requirements of the prior art system. Accordingly, this calculated average power is essentially a “worst case” power evaluation (i.e., an estimate based on maximal utilization of execution circuit resources) that unnecessarily (a) limits the number of processors which can be installed within a system or (b) over-specifies the cooling requirements of a system, adding cost, weight and unnecessary structure to a system. Other prior art methods for controlling pipeline processor power dissipation are also problematic. By way of example, control based on a current pipeline snapshot is too reactive for the entire computing system. Control based on extended pipeline information requires significant additional hardware.
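The gap between a “worst case” evaluation and the dissipation of an actual workload can be illustrated with a simple calculation. The sketch below is not a method taken from the prior art or from the referenced patents; the instruction classes, the relative cost figures (in arbitrary units) and the instruction mix are all made-up assumptions chosen only to show the arithmetic.

```python
# Hypothetical relative dissipation per instruction class, in arbitrary units.
# These values are illustrative assumptions, not measurements.
RELATIVE_COST = {"int": 1.0, "load_store": 1.5, "fp": 3.0}


def worst_case_estimate(costs: dict[str, float]) -> float:
    """Assume every issued instruction is of the most expensive class."""
    return max(costs.values())


def mix_weighted_estimate(costs: dict[str, float], mix: dict[str, int]) -> float:
    """Average per-instruction cost for an observed instruction mix (counts per class)."""
    total = sum(mix.values())
    if total == 0:
        return 0.0
    return sum(costs[name] * count for name, count in mix.items()) / total


if __name__ == "__main__":
    # Hypothetical workload: mostly integer work with occasional floating point.
    workload = {"int": 70, "load_store": 20, "fp": 10}
    print("worst case:", worst_case_estimate(RELATIVE_COST))
    print("mix weighted:", mix_weighted_estimate(RELATIVE_COST, workload))
```

With these assumed figures the worst-case estimate is more than twice the mix-weighted average, which is the kind of margin that leads to an over-specified cooling requirement.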
The prior art is also familiar with on-die thermal sensors, used to monitor heat dissipation; however, such sensors are complex and difficult to use in meaningful calculations.
It is, therefore, one object of the invention to provide a pipelined processor which variably dissipates processor power according to the actual processing needs of the computing system. Another object of the invention is to provide methods of controlling power dissipation of a pipelined microprocessing system in a manner that is correlated to the types of operations under process. Still another object of the invention is to provide a method of throttling instructions to a pipeline within a processor in a manner functionally related to the physical heat generated by the processor. These and other objects will become apparent in the description that follows.