Multi-issue processors provide a large amount of parallel hardware to enable the concurrent execution of multiple operations in a single processor cycle, thereby exploiting instruction-level parallelism in programs. Examples of multi-issue processors are VLIW (Very Long Instruction Word) processors and superscalar processors. In the case of a VLIW processor, the software program contains full information regarding which operations should be executed in parallel, and these operations are packed into one very long instruction. The compiler ensures that all dependencies between operations are respected and that no resource conflicts can occur. Apart from this program information, the hardware does not require any additional information to correctly execute the program, which results in relatively simple hardware. In the case of a superscalar processor, the software to be executed is presented as a program composed of a sequential series of operations. The processor hardware itself determines at runtime which operation dependencies exist and decides, based on these dependencies, which operations to execute in parallel, while ensuring that no resource conflicts occur. A relatively simple compiler suffices for translating a high-level programming language to sequential code, but the processor hardware is very complex.
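By way of illustration only, the decision of which sequential operations may be executed in parallel can be sketched as follows. This is a greatly simplified model and not the method of any particular processor: the function name `pack_issue_groups`, the tuple encoding of an operation as (destination, source, source) and the register names are assumptions made for this example, and resource conflicts and operation latencies are ignored. In a VLIW processor a comparable grouping would be performed by the compiler, whereas in a superscalar processor it is performed by the hardware at runtime.

```python
def pack_issue_groups(ops, width):
    """Pack a sequential list of operations, in order, into groups that may issue together.

    Each operation is a tuple (dest, src, ...) of register names (hypothetical encoding).
    An operation starts a new group when it reads or writes a register written by an
    operation already in the current group, or when the current group is full.
    """
    groups, current = [], []
    for op in ops:
        dest, srcs = op[0], op[1:]
        written = {earlier[0] for earlier in current}
        depends = dest in written or any(src in written for src in srcs)
        if current and (depends or len(current) == width):
            groups.append(current)
            current = []
        current.append(op)
    if current:
        groups.append(current)
    return groups


# Three sequential operations; the second reads r1, which the first writes.
program = [("r1", "r2", "r3"), ("r4", "r1", "r5"), ("r6", "r7", "r8")]
groups = pack_issue_groups(program, width=2)
```

In this example the second operation cannot be grouped with the first, but the third operation is independent of the second and may share a cycle with it.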
In multi-issue processors, the parallel hardware responsible for executing these operations is organized in issue slots. Each issue slot contains one or more functional units that perform the actual operations. Commonly, in every processor cycle a single operation is started on one functional unit in every issue slot. In some processors, more than one functional unit is put in an issue slot as a trade-off between the maximum available parallelism and either the instruction width cost, in the case of a VLIW processor, or the hardware complexity, in the case of a superscalar processor.
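The organization in issue slots described above can be sketched as a small model. The class name `IssueSlot`, the function `issue_instruction` and the unit names `alu`, `mul` and `lsu` are hypothetical and serve only to illustrate that at most one operation is started per issue slot per cycle, even when the slot contains several functional units.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class IssueSlot:
    """An issue slot containing one or more functional units (illustrative model)."""
    units: Tuple[str, ...]

    def issue(self, unit: Optional[str]) -> Optional[str]:
        """Start at most one operation this cycle; None means the slot stays idle."""
        if unit is not None and unit not in self.units:
            raise ValueError(f"unit {unit!r} is not part of this issue slot")
        return unit


def issue_instruction(slots, instruction):
    """Issue one VLIW-style instruction holding one field (unit name or None) per slot."""
    if len(instruction) != len(slots):
        raise ValueError("instruction must have exactly one field per issue slot")
    return [slot.issue(unit) for slot, unit in zip(slots, instruction)]


# Two issue slots with two functional units each; the second slot is idle this cycle.
slots = [IssueSlot(("alu", "mul")), IssueSlot(("alu", "lsu"))]
started = issue_instruction(slots, ["mul", None])
```

Because each slot accepts a single field per instruction, adding a functional unit to an existing slot does not widen the instruction, which reflects the trade-off mentioned above.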
Since in each clock cycle at most one operation can be started on one functional unit in each issue slot, power may be wasted by the functional units in that issue slot that are not being used in a given processor cycle. If the inputs of these functional units change while the units are not being used, the units still consume power comparable to that consumed when they are being used, even though their output is irrelevant.
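The wasted switching activity can be made concrete with a rough sketch. It assumes, for illustration only, that all functional units in an issue slot see the same input values, and it uses the number of bit flips at the inputs of an unused unit as a crude proxy for dynamic power. The function name `idle_input_toggles` and the example values are hypothetical.

```python
def idle_input_toggles(inputs, schedule, unit):
    """Count bit flips at the inputs of `unit` in the cycles where `unit` is not selected.

    inputs[c] is the input value presented to the slot in cycle c, and schedule[c]
    is the name of the functional unit on which an operation is started in cycle c.
    """
    toggles = 0
    for cycle in range(1, len(inputs)):
        if schedule[cycle] != unit:
            toggles += bin(inputs[cycle - 1] ^ inputs[cycle]).count("1")
    return toggles


# Four cycles in one issue slot; the multiplier is used only in the third cycle.
inputs = [0b0000, 0b1111, 0b0000, 0b1010]
schedule = ["alu", "alu", "mul", "alu"]
mul_idle_toggles = idle_input_toggles(inputs, schedule, "mul")
```

Every flip counted here propagates into logic whose result is discarded, which is the waste described above.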
This waste of power can be eliminated by putting holdable registers, i.e. registers whose state can be kept unchanged even when their input changes, at the inputs of all functional units within an issue slot. These holdable registers leave the inputs of the functional units unchanged when these functional units are not being used. Since the inputs of these functional units remain unchanged, no combinatorial gates are switched and no dynamic power dissipation occurs. These holdable registers can be implemented, for example, by means of clock gating. Another advantage of these registers is that the additional pipeline stage they form allows the processor to run at a higher clock frequency. A disadvantage of adding registers to all functional unit inputs is that it increases the amount of state that must be saved during interrupts. An interrupt allows a processor to respond quickly to external events; it causes the processor to temporarily postpone the further execution of the current program trace and to execute another trace instead. The state of the postponed trace must be saved such that, when the interrupt has been serviced, the processor can restore its original state and correctly proceed with the original trace. In order to obtain a predictable and short interrupt latency, it must be possible to interrupt the processor at any time. This is especially important in real-time applications. Interrupting a processor at an arbitrary point in the program can imply that a significant amount of state must be saved.
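The behaviour of a holdable register, and the way such registers add to the state that must be saved on an interrupt, can be sketched as follows. This is a behavioural model only; the class name `HoldableRegister`, the `enable` signal and the helper functions `save_state` and `restore_state` are assumptions made for this example and do not describe a particular hardware implementation.

```python
class HoldableRegister:
    """Register that captures its input only when enabled and otherwise holds its value."""

    def __init__(self, value=0):
        self.value = value

    def clock(self, data, enable):
        """Model one clock edge; with enable low (e.g. a gated clock) the value is held."""
        if enable:
            self.value = data
        return self.value


def save_state(registers):
    """Collect the register contents that would have to be saved on an interrupt."""
    return [reg.value for reg in registers]


def restore_state(registers, saved):
    """Write previously saved contents back after the interrupt has been serviced."""
    for reg, value in zip(registers, saved):
        reg.value = value


# One issue slot with two functional units of two inputs each: four input registers.
registers = [HoldableRegister() for _ in range(4)]
registers[0].clock(5, enable=True)            # unit in use: new operand is captured
held = registers[0].clock(9, enable=False)    # unit idle: operand 9 is ignored

saved = save_state(registers)                 # state grows with every input register
registers[0].clock(0, enable=True)            # interrupt routine overwrites the register
restore_state(registers, saved)
```

The number of values in `saved` equals the number of input registers, so placing a register at every functional unit input directly enlarges the state to be saved and restored.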
The non-prepublished European patent application 00203591.3 [attorneys' docket PHNL000576], filed on 18 Oct. 2000, provides a solution for decreasing the amount of state that must be saved during interrupts. A second, compact instruction set is applied, which is used in an interrupt service routine and uses only a limited set of processor resources. In case of an interrupt, it is sufficient to save the state of only the limited set of processor resources used by the second compact instruction set, while the state in all other resources is simply frozen. However, when registers are put at all the inputs of each functional unit in this limited set of resources, the resources used by the second compact instruction set still have a considerable amount of state that must be saved and restored during interrupts.
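The idea of saving only the state of the limited set of resources, while freezing all other state, can be sketched as follows. The function name `state_to_save` and the resource names are hypothetical and are not taken from the cited application; the sketch only illustrates how the amount of saved state scales with the number of input registers in the limited set.

```python
def state_to_save(resources, isr_resources):
    """Return the state of only those resources the interrupt service routine may use.

    State of all other resources is left in place (frozen) and is not copied.
    """
    return {name: value for name, value in resources.items() if name in isr_resources}


# Input registers of two issue slots; only slot 0 is used by the interrupt service routine.
resources = {
    "slot0.alu.in0": 3,
    "slot0.alu.in1": 4,
    "slot1.mul.in0": 7,
    "slot1.mul.in1": 9,
}
isr_resources = {"slot0.alu.in0", "slot0.alu.in1"}
saved = state_to_save(resources, isr_resources)
```

Even in this reduced case, every input register of every functional unit in the limited set contributes an entry to `saved`, which corresponds to the remaining drawback noted above.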