1. Field of the Invention
The present invention relates generally to instruction decoding for computer processors, and more specifically to pipelined instruction decoders for microprocessors.
2. Background Information
Basic instruction decoders and instruction decoding techniques used in central processors and microprocessors are well known. With advancements in design, instruction decoders have become more sophisticated, including not only pipeline registers to process instructions in sequence but also buffers to temporarily store preliminarily decoded instructions while other instructions continue to be processed. However, buffers have limited depth and can become full, so that further instructions can no longer be stored in them. In prior art processors, when a buffer became full, the entire instruction decode pipeline would stall. Stalls can also occur for other reasons in a microprocessor, such as when a subsystem cannot handle the data throughput provided by preceding subsystems and the pipeline is stalled so that data is not lost. Essentially, an instruction decode pipeline is stalled when no further instructions can be decoded in it.
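The stall behavior described above can be illustrated with a toy cycle-level model. This is a minimal sketch for illustration only, not circuitry taken from any prior art processor: the function name `run_decode`, the parameters `buffer_depth` and `drain_every`, and the choice to drain the buffer before filling it within a cycle are all assumptions.

```python
from collections import deque


def run_decode(instructions, buffer_depth, drain_every):
    """Toy model of a decoder feeding a bounded buffer (all names hypothetical).

    One instruction may enter the buffer per cycle. A downstream consumer
    removes one entry every `drain_every` cycles (must be >= 1).
    Returns (total_cycles, stall_cycles).
    """
    pending = deque(instructions)  # instructions still waiting to be decoded
    buffer = deque()               # decoded instructions awaiting the consumer
    cycles = 0
    stalls = 0
    while pending or buffer:
        cycles += 1
        # Assumption: the consumer drains before the decoder fills in a cycle.
        if buffer and cycles % drain_every == 0:
            buffer.popleft()
        if pending:
            if len(buffer) < buffer_depth:
                buffer.append(pending.popleft())
            else:
                # Buffer full: no instruction can advance, so the whole
                # decode pipeline stalls for this cycle.
                stalls += 1
    return cycles, stalls
```

With a slow consumer, for example `run_decode(list("abcd"), buffer_depth=1, drain_every=2)`, the model counts stall cycles; with a consumer that drains every cycle, it counts none.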
Also in prior art processors, if an instruction became stale or invalid in the instruction decode pipeline, such as from a cache coherency problem, it required clearing. Clearing essentially invalidates instructions so that they can be disregarded and overwritten with valid instructions. In prior art processors, all instructions within the instruction decode pipeline, including valid instructions, are cleared (i.e., invalidated) on a global basis. In that case, valid instructions which have been cleared need to be input back into the beginning of the instruction decode pipeline to start the decoding process again. Global clearing such as this tends to delay the execution process whenever a stale or invalid instruction is present in the pipeline of prior art processors.
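The cost of global clearing can be sketched as follows. This is an illustrative model only: the function name `global_clear` and the dictionary fields `instr` and `stale` are hypothetical and are not drawn from any particular prior art design.

```python
def global_clear(pipeline):
    """Model a global clear of a decode pipeline (names are hypothetical).

    `pipeline` is a list of stages; each stage is None (empty) or a dict
    with an `instr` name and a `stale` flag. Every stage is invalidated,
    and the instructions that were still good are returned so they can be
    fed back into the start of the pipeline and decoded again.
    """
    # Good instructions are lost along with the stale one and must restart.
    refetch = [e["instr"] for e in pipeline if e is not None and not e["stale"]]
    cleared = [None] * len(pipeline)
    return cleared, refetch
```

For a four-stage pipeline holding one stale and two good instructions, both good instructions appear in the re-fetch list, which is the delay the paragraph above describes.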
In processors, reducing power consumption is an important consideration. In order to conserve power in prior art processors, entire functional blocks of synchronous circuitry within the execution unit have their clocks turned OFF. That is, their clock signals are set to a stable state throughout entire functional blocks. In order to accomplish this, prior art power down control logic was used to determine when an entire functional block is idle and can have its clocks shut OFF. When the clocks to synchronous circuits are shut OFF, signals, including the clock signal, do not change state. In that case, transistors are not required to charge or discharge the capacitance associated with the signal lines, and power is therefore conserved. However, because the clocks are shut OFF throughout entire functional blocks, the prior art processor has to wait until all functions are completed within such blocks. As a result, the prior art processor rarely shuts OFF clocks to the functional blocks, so that little power is conserved over time.
It is desirable to overcome these and other limitations of the prior art processors.
The present invention includes a method, apparatus and system as described in the claims.
Briefly, in one embodiment, a microprocessor includes an instruction decoder of the present invention to decode multiple threads of instructions. The instruction decoder has an instruction decode pipeline. The instruction decode pipeline decodes each input instruction associated with each thread. The instruction decode pipeline additionally maintains a thread identification and a valid indicator in parallel with each instruction being decoded in the instruction decode pipeline.
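As a software analogy only, the idea of carrying a thread identification and a valid indicator alongside each instruction might be sketched as below. The names `Slot`, `DecodePipeline`, `step` and `clear_thread` are hypothetical, and the per-thread clearing shown is one plausible use of the tags assumed here for illustration; it is not a description of the claimed circuitry.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Slot:
    """One pipeline stage: an instruction plus its tags (hypothetical layout)."""
    instr: Optional[str] = None
    thread_id: int = 0
    valid: bool = False


class DecodePipeline:
    def __init__(self, depth: int):
        # Every stage starts empty, i.e. marked not valid.
        self.stages: List[Slot] = [Slot() for _ in range(depth)]

    def step(self, instr: str, thread_id: int) -> Slot:
        """Advance one cycle: insert a new tagged instruction at the first
        stage and return the slot leaving the last stage."""
        out = self.stages[-1]
        self.stages = [Slot(instr, thread_id, True)] + self.stages[:-1]
        return out

    def clear_thread(self, thread_id: int) -> None:
        """Assumed use of the tags: invalidate only one thread's instructions,
        leaving instructions of other threads untouched."""
        for slot in self.stages:
            if slot.valid and slot.thread_id == thread_id:
                slot.valid = False
```

Because the tags move through the stages together with the instruction, any stage can be inspected to tell which thread its instruction belongs to and whether it should still be honored.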
Other embodiments are shown, described and claimed herein.