Modern chip-based systems (known as System-on-Chip, SoC) need to execute multiplication and division operations in large numbers. It is therefore of great significance to implement these operations with as little complexity as possible in such systems. In this context, it should be remembered that signals in such chip-based systems often pass through several signal-processing phases, which are performed in different circuit areas. Typically, an analogue signal-processing phase is followed by a digital signal-processing phase, and the digital signal processing can be divided further into hardware-based and software-based functionality implementations.
Analogue signal processing is inherently implemented by analogue circuits. By way of example, a terminal in a wireless communication system receives the transmitted signal via an antenna; this signal is transferred to baseband in the “radio-frequency front end” using analogue circuits and is subjected to analogue/digital conversion. The subsequent digital signal processing then initially includes data demodulation and decoding, which is often implemented largely on a hardware basis for reasons of efficiency. Subsequent signal-processing phases are then usually programmed on a digital signal processor (DSP) or on a microcontroller (i.e. in a firmware or software implementation), since this form of implementation is more efficient and/or more flexible than hardware implementation.
The text below considers only the digital signal-processing area of chip-based systems. A fundamental design problem is finding a functionality implementation which meets the critical demands on power consumption and chip area requirement as well as possible. A crucial factor in this context is whether a prescribed functionality is implemented in hardware or software/firmware or on a distributed basis.
Hardware-based multiplication operations or division operations using real numbers arise, by way of example, when normalizing hardware-implemented accumulation operations of summands in different quantities. The real numbers (multiplier or divisor) are dependent on the quantity of the accumulated summands and vary in a time interval which is known a priori. By way of example, in the case of UMTS (Universal Mobile Telecommunications System), the variable data rates demanded in the UMTS specification mean that some parts of the receiver are faced with the difficulty that accumulation operations produce a greatly varying quantity of summands, which gives rise to the need to normalize the accumulation results using variable normalization factors.
FIG. 1 shows a conventional implementation for performing normalization operations using a variable normalization factor 1/K, where K is an integer. The circuit is implemented partly in hardware (“hardware area”) and partly in firmware (“firmware area”). A preceding hardware block 1 provides data values a_k from a data stream, where k denotes the discrete time. This data stream a_k is supplied to a hardware circuit 2 for the purpose of averaging. The hardware circuit 2 calculates the mean value

  b = \frac{1}{K} \sum_{k=1}^{K} a_k.

For this, an accumulator 3 is used to accumulate (sum) the desired number K of data values a_k. The sum \sum_{k=1}^{K} a_k has a word length Ba. It is supplied to a multiplication unit 4 having the required word length Ba. The second factor of the multiplication (the normalization factor 1/K) is calculated partly in firmware and partly in hardware. Inversion 5 in firmware is used to ascertain the multiplier 1/K from the number K. This multiplier 1/K has a maximum word length Bk. It is transferred to the hardware area via a bus 6 of word length Bk and is stored in a register 7 there. The multiplication unit 4 accesses the register 7 and calculates the value b.
The value b has a maximum word length Bp. Since the numbers are in fixed-point notation, a scaler 8 typically performs subsequent word length reduction from Bp to Bb (Bp>Bb). The scaled mean value b is then transferred to a subsequent hardware block 9.
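The data path of FIG. 1 can be illustrated with a minimal Python sketch using integer (fixed-point) arithmetic. It is not taken from the circuit itself. The function names, the unsigned number format, the rounding and clamping in the firmware inversion, the relation Bp = Ba + Bk, and the example word lengths are all assumptions made for illustration.

```python
def firmware_inverse(K, Bk):
    """Inversion 5 (firmware): quantize 1/K to an unsigned word with Bk fractional bits.

    Hypothetical helper; the rounding and clamping rules are assumptions.
    """
    if K < 1:
        raise ValueError("K must be a positive integer")
    rounded = ((1 << Bk) + K // 2) // K   # 2**Bk / K, rounded to nearest
    return min(rounded, (1 << Bk) - 1)    # clamp: K = 1 would otherwise need Bk + 1 bits


def hardware_mean(samples, inv_k, Bp, Bb):
    """Hardware circuit 2: accumulate, multiply by the stored 1/K, then scale."""
    acc = 0
    for a in samples:                     # accumulator 3
        acc += a
    product = acc * inv_k                 # multiplication unit 4, word length up to Bp
    return product >> (Bp - Bb)           # scaler 8: keep the Bb most significant bits


# Example word lengths (assumed values, with Bp = Ba + Bk).
Ba, Bk, Bb = 8, 8, 8
Bp = Ba + Bk
inv_k = firmware_inverse(4, Bk)           # value written to register 7 via bus 6
b = hardware_mean([10, 20, 30, 40], inv_k, Bp, Bb)
```

With these example values the quantized multiplier is 64 (that is, 1/4 with 8 fractional bits) and the scaled mean b is 25. For values of K that are not powers of two, the quantized 1/K is inexact, so the result can deviate from the true mean unless Bk is chosen large enough.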
The circuit shown in FIG. 1 has the following drawbacks:
- The register 7 for storing the multiplier (in this case 1/K) and the multiplication unit 4 need to be designed for the respective maximum word lengths Bk.
- The firmware/hardware transfer of the multiplier (in this case 1/K) via the bus 6 needs to be designed for the maximum word length of the multiplier and its maximum frequency of change. For small values of K, the frequency of change may be in the order of magnitude of the data rate of the data stream a_k, that is to say very large.
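The second drawback can be made concrete with a rough estimate. The Python sketch below assumes that a new value of K can apply to every averaging window of K data values, so that the multiplier changes at most once per window; this assumption, the function name, and the example figures are illustrative only and do not come from the circuit description.

```python
def bus_load_bits_per_second(data_rate_hz, K, Bk):
    """Worst-case firmware-to-hardware traffic for transferring 1/K over bus 6.

    Assumes at most one multiplier update per averaging window of K data values.
    """
    updates_per_second = data_rate_hz / K   # window rate = data rate / K
    return updates_per_second * Bk          # each update transfers one Bk-bit word


# Assumed example: 1 MHz data stream, 16-bit multiplier.
load_small_k = bus_load_bits_per_second(1_000_000, 2, 16)
load_large_k = bus_load_bits_per_second(1_000_000, 256, 16)
```

Under these assumptions the transfer load falls in proportion to 1/K, which is why small values of K dominate the design of the bus 6.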