In modern low-power central processing units (CPUs), program instructions are executed in highly specialized execution units in order to achieve low energy consumption. Each execution unit is optimized for the instruction group it executes, so that only a minimum number of gates toggle during the execution of an instruction while the other data paths of the CPU remain quiet. One such special instruction group addresses signal-conditioning operations that use vector computations (FFT, FIR filtering, IIR filtering, etc.). Such vector computations can be implemented using an application-specific instruction-set processor (ASIP) targeted at signal-conditioning algorithms.
To achieve low power, the arithmetic supported by certain signal-conditioning ASIPs targets a broad sub-class of applications that mostly require only 16-bit arithmetic. The accelerator hardware defined by the instruction set of such ASIPs supports only the basic operators of 16-bit multiplication and 32-bit addition. It would be beneficial to support full 32-bit arithmetic, such as a 32-bit multiply, without adding significant overhead to the basic 16-bit arithmetic operators and instructions.
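For illustration only, the following C sketch shows the textbook way a 32-bit multiply can be composed from 16-bit multiplies and 32-bit additions: each operand is split into 16-bit halves, four 16x16 partial products are formed, and the partial products are summed with explicit carry handling. The function names mul16 and mul32x32 are hypothetical, the operands are assumed to be unsigned, and the sketch is not presented as the method of any particular ASIP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the basic hardware operator:
 * a 16-bit x 16-bit unsigned multiply producing a 32-bit result. */
static uint32_t mul16(uint16_t a, uint16_t b)
{
    /* Cast first so the multiplication is done in unsigned 32-bit. */
    return (uint32_t)a * (uint32_t)b;
}

/* Unsigned 32-bit x 32-bit multiply built only from mul16 and
 * 32-bit additions. The 64-bit product is returned as two 32-bit
 * words: *hi holds bits 63..32 and *lo holds bits 31..0. */
static void mul32x32(uint32_t a, uint32_t b, uint32_t *hi, uint32_t *lo)
{
    assert(hi != NULL && lo != NULL);

    /* Split each operand into 16-bit halves. */
    uint16_t a0 = (uint16_t)(a & 0xFFFFu);
    uint16_t a1 = (uint16_t)(a >> 16);
    uint16_t b0 = (uint16_t)(b & 0xFFFFu);
    uint16_t b1 = (uint16_t)(b >> 16);

    /* Four 16x16 partial products:
     * a*b = p00 + ((p01 + p10) << 16) + (p11 << 32). */
    uint32_t p00 = mul16(a0, b0);
    uint32_t p01 = mul16(a0, b1);
    uint32_t p10 = mul16(a1, b0);
    uint32_t p11 = mul16(a1, b1);

    /* Sum the contributions to bits 31..16. Each term is at most
     * 0xFFFF, so the sum cannot overflow 32 bits; its upper half
     * is the carry into bit 32. */
    uint32_t mid = (p00 >> 16) + (p01 & 0xFFFFu) + (p10 & 0xFFFFu);

    *lo = (p00 & 0xFFFFu) | (mid << 16);
    *hi = p11 + (p01 >> 16) + (p10 >> 16) + (mid >> 16);
}
```

The sketch uses four 16-bit multiplies and a handful of 32-bit additions per 32-bit product, which gives a sense of the overhead that a software-only composition would incur on hardware that natively supports only the 16-bit operators. If only the low 32 bits of the product are needed, the p11 term and the high word can be dropped.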