Processor architectures that use multiple functional units are suitable alternatives for exploiting instruction-level parallelism (ILP) in programs, that is, for executing more than one basic (primitive) instruction at a time, thereby allowing for faster microprocessors. One such architecture is the Very-Long Instruction Word (VLIW) processor. A VLIW processor fetches a very-long instruction word, which contains several primitive instructions, from an instruction cache and dispatches the entire VLIW for parallel execution by the functional units. These capabilities are exploited by compilers, which generate code in which independent primitive instructions that can execute in parallel are grouped together. VLIW processors have relatively simple control logic because, unlike most contemporary superscalar processors, they do not perform any dynamic scheduling or reordering of operations. In hardware terms, a VLIW processor consists simply of a collection of functional units (adders, multipliers, branch units, etc.) connected by a bus, plus some registers and caches.
The instruction set for a VLIW architecture tends to consist of simple instructions. The compiler must assemble many primitive operations into a single “instruction word” such that the multiple functional units are kept relatively busy, which requires enough instruction-level parallelism (ILP) in a code sequence to fill the available operation slots. The compiler uncovers such parallelism through techniques such as scheduling code speculatively across basic blocks, performing software pipelining, and reducing the number of operations executed.
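The packing task described above can be illustrated with a minimal sketch. It assumes a simplified machine in which every operation takes one cycle and any operation can occupy any slot; the function name `pack_vliw`, the operation labels, and the dependency table are hypothetical and serve only as illustration. It is a simple greedy packer, not the speculative scheduling or software pipelining that a production compiler would perform.

```python
from typing import Dict, List, Set


def pack_vliw(ops: List[str], deps: Dict[str, Set[str]], slots: int) -> List[List[str]]:
    """Greedily pack operations, in program order, into instruction words.

    An operation may enter a word only if every operation it depends on
    was placed in an EARLIER word, so all operations sharing a word are
    mutually independent and can execute in parallel.
    """
    done: Set[str] = set()          # operations placed in earlier words
    remaining = list(ops)           # operations not yet placed
    words: List[List[str]] = []
    while remaining:
        word: List[str] = []
        for op in remaining:
            if len(word) == slots:  # every operation slot is filled
                break
            if deps.get(op, set()) <= done:
                word.append(op)
        if not word:
            raise ValueError("cyclic or unsatisfiable dependencies")
        done.update(word)
        remaining = [op for op in remaining if op not in done]
        words.append(word)
    return words


# Hypothetical code sequence: c needs a and b, d needs c, e is independent.
example_ops = ["a", "b", "c", "d", "e"]
example_deps = {"c": {"a", "b"}, "d": {"c"}}
example_words = pack_vliw(example_ops, example_deps, slots=3)
```

With three slots per word, only the first word of this example is full; the dependent operations `c` and `d` each occupy a word by themselves, which shows how insufficient ILP leaves operation slots empty.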
As with all computer architectures, it is important to design VLIW processors to have low power consumption. Electronic systems with lower power consumption have various advantages, such as reduced operational costs, less heat dissipation, and longer operational lives for systems running on batteries. One conventional method for saving power has been simply to discontinue the supply of power to entire functional units, for example, by discontinuing the flow of current to these units. Another conventional method for saving power in VLIW processors selectively discontinues the supply of power to sections of a functional unit that perform specific functions. For example, a floating-point functional unit generally has “functional sections” for addition, subtraction, division, multiplication, square root, etc. Under this method, each “functional section” that is not required to perform its specific function has its supply of power or clocking discontinued until that section of the functional unit is needed again. Even though these conventional approaches reduce the amount of power required by VLIW processors, they leave considerable room for improvement with respect to power conservation. For example, these conventional approaches require that an entire “functional section” or functional unit be supplied with power even though only a small portion of the “functional section” or pipe stage of the functional unit is being utilized.
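The conventional section-level approach can be sketched as follows. This is a rough illustration under stated assumptions: the section names, the per-cycle schedule, and the helper names `powered_sections` and `gated_fraction` are hypothetical, and each section is modeled as either fully powered or fully gated in a given cycle, which is exactly the granularity limit noted above.

```python
from typing import List, Set

# Hypothetical functional sections of a floating-point functional unit.
SECTIONS = ("add", "sub", "mul", "div", "sqrt")


def powered_sections(schedule: List[List[str]]) -> List[Set[str]]:
    """For each cycle, return the sections that must remain powered."""
    return [set(cycle_ops) & set(SECTIONS) for cycle_ops in schedule]


def gated_fraction(schedule: List[List[str]]) -> float:
    """Fraction of section-cycles in which power or clocking is discontinued."""
    total = len(schedule) * len(SECTIONS)
    active = sum(len(sections) for sections in powered_sections(schedule))
    return (total - active) / total if total else 0.0


# Hypothetical four-cycle schedule for the floating-point unit.
example_schedule = [["add", "mul"], ["div"], [], ["add"]]
example_powered = powered_sections(example_schedule)
example_gated = gated_fraction(example_schedule)
```

In this model a section that is used at all in a cycle counts as fully powered, even if only one of its pipe stages is doing useful work.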
In view of the foregoing, there is a need for pipelined processors capable of operating with reduced power consumption.