This invention relates generally to central processing unit architectures and more particularly to a configurable processor.
The architecture of a central processing unit is known to include an instruction cache, a fetch module, an instruction decoder, an instruction issuance module, an arithmetic logic unit (ALU), a load/store module, and a data cache. The instruction cache and data cache are used to temporarily store instructions and data, respectively. Once an instruction is cached, the fetch module retrieves it and provides it to the decoder. Alternatively, the fetch module may retrieve an instruction directly from main memory and provide it to the decoder, and may further store the instruction in the instruction cache. The decoder decodes the instruction into microcode and, via the instruction issuance module, provides it to the ALU. The ALU performs a plurality of operations and includes an address calculation module, a plurality of integer operation modules, a plurality of floating point modules, and a plurality of multi-media operation modules. The integer modules may include two arithmetic/logic modules, shift modules, one multiply module, and one divide module. The floating point modules include a floating point adder and a floating point multiplier. The multi-media modules include two multi-media arithmetic/logic modules, one multi-media shift module, and one multi-media multiplier. Note that an arithmetic function is an addition operation or a subtraction operation. Further note that a logic function is an AND, NAND, compare, OR, NOR, or XOR operation. Further note that the multi-media modules are configurable to process packed data having 8-bit, 16-bit, 32-bit, or 64-bit data elements.
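By way of illustration only, the packed-data configurability noted above may be sketched in software as follows. The sketch assumes a 64-bit packed word and wraparound (non-saturating) addition within each data element; neither assumption is stated in the description above, and the function name packed_add is hypothetical.

```python
# Illustrative sketch only: element-wise addition on a packed word.
# Assumptions (not from the description): the packed word is 64 bits wide
# and each element wraps around on overflow rather than saturating.

WORD_BITS = 64  # assumed width of the packed word


def packed_add(a: int, b: int, elem_bits: int) -> int:
    """Add two packed words element by element for the given element width."""
    if elem_bits not in (8, 16, 32, 64):
        raise ValueError("element width must be 8, 16, 32, or 64 bits")
    elem_mask = (1 << elem_bits) - 1
    result = 0
    for shift in range(0, WORD_BITS, elem_bits):
        # Extract the element at this position from each operand.
        elem_a = (a >> shift) & elem_mask
        elem_b = (b >> shift) & elem_mask
        # Masking the sum discards any carry, so it cannot spill
        # into the neighboring element.
        result |= ((elem_a + elem_b) & elem_mask) << shift
    return result
```

The same pair of operands thus yields different results depending on the configured element width, because a carry out of one element is discarded rather than propagated into the next.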
When the ALU receives an instruction (some processors allow two or three instructions to be processed simultaneously), it provides the instruction to the appropriate module based on the operation to be performed. For example, a load/store operation is processed by the address calculation module, such that the correct data is loaded from, or stored to, the data cache or main memory.
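By way of illustration only, the routing of an operation to the appropriate module may be modeled as a simple lookup. The operation mnemonics below are hypothetical, and the table covers only the address calculation and integer modules named above; a complete model would also cover the floating point and multi-media modules.

```python
# Illustrative sketch only: routing an operation to an ALU module.
# The mnemonics ("load", "add", ...) are hypothetical labels chosen for
# this example; the module names follow the description above.

MODULE_FOR_OP = {
    # Load/store operations go to the address calculation module.
    "load": "address calculation",
    "store": "address calculation",
    # Arithmetic functions: addition and subtraction.
    "add": "integer arithmetic/logic",
    "sub": "integer arithmetic/logic",
    # Logic functions: AND, NAND, compare, OR, NOR, XOR.
    "and": "integer arithmetic/logic",
    "nand": "integer arithmetic/logic",
    "cmp": "integer arithmetic/logic",
    "or": "integer arithmetic/logic",
    "nor": "integer arithmetic/logic",
    "xor": "integer arithmetic/logic",
    # Remaining integer modules.
    "shift": "integer shift",
    "mul": "integer multiply",
    "div": "integer divide",
}


def route(op: str) -> str:
    """Return the name of the module that would process the operation."""
    try:
        return MODULE_FOR_OP[op]
    except KeyError:
        raise ValueError(f"no module is modeled for operation {op!r}") from None
```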
When such a CPU is fabricated as an integrated circuit, it requires a large die area, yielding a large integrated circuit. As is generally known, the smaller the die, the less expensive the resulting integrated circuit will typically be. Therefore, a need exists for a central processing unit that has a relatively small integrated circuit footprint to contain costs.