Instruction execution circuits within microprocessors include address generation units that decode addresses encoded within microprocessor instructions. The decoded addresses specify the locations in memory containing instructions to be executed or data to be accessed. Many present microprocessors feature advanced architectures that allow parallel processing and pipelined instruction execution. Such architectures allow microprocessors to decode, dispatch, and complete execution of (retire) multiple instructions in a single clock cycle. For example, in the Pentium® Pro microprocessor produced by Intel Corporation, a three-way superscalar, pipelined architecture allows for retirement of as many as three instructions per clock cycle. "Pentium" and "Pentium Pro" are registered trademarks of Intel Corporation of Santa Clara, Calif.
Parallel processing techniques and the use of fast temporary memory, such as caches for instructions and data, require extensive decoding of address information to generate the proper memory locations from which to fetch instructions and data. For example, code containing multiple levels of branches and procedure calls that allow for out-of-order instruction execution often produces complex address relationships that must be resolved correctly for the instructions to execute properly. Traditional fixed addresses are often insufficient for executing modern complex code in such processing environments. Accordingly, most present microprocessors use dynamic addressing schemes in which address components are derived and combined to produce linear address values.
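The combination of address components into a linear address can be illustrated with a minimal sketch. The text above does not name the components, so the sketch assumes the familiar x86-style set: a segment base, a base register, a scaled index register, and a displacement, all confined to 32 bits. The function and parameter names are hypothetical and chosen only for illustration.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit address arithmetic


def effective_address(base, index, scale, displacement):
    """Combine the offset components: base + index * scale + displacement."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    # The sum wraps around at 32 bits, as a 32-bit hardware adder would.
    return (base + index * scale + displacement) & MASK32


def linear_address(segment_base, base, index, scale, displacement):
    """Add the segment base to the effective address to form the linear address."""
    offset = effective_address(base, index, scale, displacement)
    return (segment_base + offset) & MASK32


# Example: segment base 0x10000, base 0x2000, index 3 scaled by 4, displacement 0x10
print(hex(linear_address(0x10000, 0x2000, 3, 4, 0x10)))  # 0x1201c
```

Each linear address in this sketch takes several additions, which is why the speed of the adders in the address generation unit directly limits how quickly addresses become available.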
The generation of dynamic addresses requires extensive logic circuitry to decode addresses encoded within the processor instructions. Such circuitry includes adder circuits within the instruction execution units that calculate memory locations based on the encoded address information. As the speed of microprocessors increases, the speed of these adder circuits must also increase: gate delays must be minimized so that addresses are generated fast enough to maintain high instruction cycle rates.
Present adder circuits typically use static combinatorial logic and multiplexer circuits to perform addition operations on address information. With present microprocessor speeds exceeding 200 MHz and approaching 1000 MHz, these static logic output circuits introduce gate delays that often prevent the execution of multiple instructions within a single clock cycle.
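The cost of gate delay in a static adder can be illustrated with a bit-level model. The sketch below assumes a simple ripple-carry structure, which the text above does not specify; it is used here only because it makes carry propagation easy to see. The function name and the returned chain-length measure are hypothetical and are not a timing model of any real circuit.

```python
def ripple_carry_add(a, b, width=32):
    """Add two unsigned integers one bit at a time, as a chain of full adders.

    Returns (sum, carry_out, longest_chain), where longest_chain is the longest
    run of consecutive bit positions that produce a carry out. It serves as a
    rough proxy for how far a carry must travel through the chain.
    """
    carry = 0
    result = 0
    chain = 0
    longest_chain = 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        s = x ^ y ^ carry                      # sum bit of a full adder
        carry = (x & y) | (carry & (x ^ y))    # carry out of a full adder
        result |= s << i
        chain = chain + 1 if carry else 0
        longest_chain = max(longest_chain, chain)
    return result, carry, longest_chain


# Worst case: the carry generated at bit 0 travels through all 32 stages.
print(ripple_carry_add(0xFFFFFFFF, 1))  # (0, 1, 32)
# No carries at all: every stage settles independently.
print(ripple_carry_add(5, 10))          # (15, 0, 0)
```

Because the clock period must accommodate the worst case, a longer carry path through static logic reduces the number of dependent address calculations that fit in one cycle.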