Digital multipliers are in widespread use in Digital Signal Processors (DSPs) for rapid multiplication of binary numbers. Many fundamental DSP algorithms, such as FIR filters, IIR filters, convolution and the Fast Fourier Transform (FFT), depend heavily on the multiply-accumulate performance of the DSP, rendering the digital multiplier a vital component of the DSP. The digital multiplier is typically accompanied by an adder to form a fast multiply-accumulate (so-called MAC) computational structure. The binary numbers can be represented in various binary number formats, such as two's complement or signed magnitude, and in either fixed-point or floating-point format. The number of bits N of the multiplicand (Y) and the number of bits M of the multiplier (X) can vary widely depending on the format and the requirements of a particular application; each typically lies between 8 and 56. These traditional MAC structures are well adapted to provide fast multiplication and addition of the input operands or variables of the above-mentioned fundamental DSP algorithms.
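To illustrate why such algorithms depend on multiply-accumulate performance, the following minimal sketch writes a direct-form FIR filter as a sequence of MAC steps. The function names `mac` and `fir_filter` are hypothetical and chosen only for this illustration, and plain Python integers stand in for the N-bit and M-bit hardware operands.

```python
def mac(acc, x, y):
    """One multiply-accumulate step: returns acc + x * y."""
    return acc + x * y


def fir_filter(samples, coeffs):
    """Direct-form FIR filter: one MAC step per tap per output sample."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, c in enumerate(coeffs):
            # Samples before the start of the input are treated as zero.
            if n - k >= 0:
                acc = mac(acc, c, samples[n - k])
        out.append(acc)
    return out
```

Each output sample costs one MAC step per filter tap, so the throughput of the MAC structure directly bounds the throughput of the filter.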
However, a significant number of signal processing algorithms require base math functions which are computationally hard; examples are logarithms, exponentials, division and square roots. These computationally hard functions share the common feature that they are either difficult to map to binary math, e.g. logarithm functions, or possess a non-deterministic property, e.g. division operations. In this context, non-deterministic means that it is highly difficult or impractical to predict the resulting mathematical sequence in advance. This leads to arithmetic circuit designs that have to search for a correct solution. As an example, one can compare the predictability of the mathematical sequence of a digital multiplier with that of a digital divider. In the design of traditional signal processing algorithms and programmable digital signal processors, rapid and efficient calculation of these computationally hard mathematical functions has largely been overlooked. The approach has been to compute these functions either by software routines exploiting the traditional MAC structure of the programmable DSP or, in the alternative, by building a customized digital state machine or customized data path exclusively adapted to compute one specific type of hard mathematical function, for example a logarithm.
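The multiplier/divider comparison can be sketched as follows, assuming textbook shift-add multiplication and restoring division on unsigned integers; neither routine is taken from a particular circuit, and the function names are hypothetical. In the multiplier, every partial product is fixed by the operand bits before the computation starts. In the divider, each quotient bit is known only after a trial subtraction on the running remainder.

```python
def shift_add_multiply(y, x, m_bits):
    """Unsigned multiply: partial products depend only on the bits of x."""
    product = 0
    for i in range(m_bits):
        if (x >> i) & 1:
            product += y << i  # this term is known up front from bit i of x
    return product


def restoring_divide(dividend, divisor, n_bits):
    """Unsigned restoring division: each quotient bit is found by trial."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    quotient = 0
    remainder = 0
    for i in range(n_bits - 1, -1, -1):
        # Bring down the next dividend bit, most significant first.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        trial = remainder - divisor
        if trial >= 0:
            # The trial subtraction succeeded: keep it, quotient bit is 1.
            remainder = trial
            quotient |= 1 << i
        # Otherwise the remainder is kept ("restored"), quotient bit is 0.
    return quotient, remainder
```

The branch on `trial` is the search step: its outcome depends on the result of the previous iteration, so the sequence of operations cannot be laid out in advance the way the multiplier's partial products can.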
However, a DSP built around such a customized digital state machine or customized data path lacks the flexibility to execute other types of computationally hard mathematical functions, and the associated DSP algorithms, in a rapid and energy-efficient manner. Traditional MAC structures of programmable DSPs can be adapted to execute a wide range of DSP algorithms by suitably configured DSP software. Unfortunately, the traditional MAC structure is slow and power-inefficient when it comes to executing the above-mentioned computationally hard mathematical functions. This is because the traditional MAC structure is optimized to perform multiplication, addition and subtraction operations and is hence, by design, ill-suited to compute other types of mathematical functions. Program routines computing the hard mathematical functions therefore typically consume a large number of MAC cycles. This imposes a high computational load on the traditional MAC structure, harming the computational performance of the programmable DSP by blocking or delaying computations of other mathematical functions forming part of the DSP algorithm(s) in question.
Hence, there is a need for a datapath circuit which comprises a traditional digital multiply-accumulate circuit (MAC), for performing efficient and rapid multiplication, addition and subtraction operations, and a digital hardware calculator or math accelerator for efficient and rapid computation of the above-mentioned computationally hard mathematical functions. It would be even more attractive if the MAC and the digital hardware calculator were able to operate in parallel to improve the computation throughput of the datapath circuit. In this way, the hardware architecture or design of the MAC and that of the digital hardware calculator can each be optimized for the differing needs of the different types of mathematical functions to be executed.
U.S. Pat. No. 7,284,027 relates to methods and circuit cells performing high-speed arithmetic computations on fixed-point or floating-point numbers for real-time DSP applications. The disclosure relates to a customized multiplier architecture/topology for multiplication of fixed-point or floating-point complex numbers in connection with rapid FFT computations. The complex multiplier structure is based on log-domain computations: complex input numbers are multiplied by logarithmic add operations, and exponentiation is subsequently applied to return the multiplication result in the linear domain.
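The general principle of log-domain multiplication can be sketched as below. This is only an illustration of the identity a·b = 2^(log2|a| + log2|b|) with separate sign handling, using floating-point library calls; it does not reproduce the circuit of the cited patent, and the function names are hypothetical.

```python
import math


def log_domain_multiply(a, b):
    """Multiply two reals by adding base-2 logarithms, then exponentiating."""
    if a == 0 or b == 0:
        return 0.0  # log2(0) is undefined, so zero is handled separately
    sign = -1.0 if (a < 0) != (b < 0) else 1.0
    log_sum = math.log2(abs(a)) + math.log2(abs(b))  # the "logarithmic add"
    return sign * 2.0 ** log_sum  # back to the linear domain


def complex_multiply_log(p, q):
    """(a + jb)(c + jd) built from four log-domain real products."""
    a, b = p
    c, d = q
    real = log_domain_multiply(a, c) - log_domain_multiply(b, d)
    imag = log_domain_multiply(a, d) + log_domain_multiply(b, c)
    return real, imag
```

The results are approximate because of floating-point rounding in the logarithm and exponentiation steps; the final additions and subtractions of the complex product are still performed in the linear domain.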
U.S. Pat. No. 7,539,717 discloses methods and hardware circuit blocks using a table-based Taylor series approximation to compute logarithms of floating-point numbers in DSP applications. A number of dedicated hardware blocks, which perform certain preprocessing steps on a floating-point input operand, are coupled to a floating-point FMAD which performs the final computation of the logarithm of the floating-point input operand based on preprocessed compressed values. The dedicated hardware blocks comprise a first lookup table and a second lookup table storing full-precision variable values and compressed variable values, respectively, associated with the logarithm function.
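A generic table-based Taylor series logarithm, of the kind referred to above, might look like the following sketch. It is not the method of the cited patent: the table size `TABLE_BITS`, the choice of storing ln(m0) and 1/m0 in the two tables, and the second-order truncation are all assumptions made for illustration. The input is split as x = m · 2^e, the mantissa m is rounded down to a table point m0, and ln(m) is approximated as ln(m0) + t − t²/2 with t = (m − m0)/m0.

```python
import math

TABLE_BITS = 6                      # assumed table size: 64 entries
TABLE_SIZE = 1 << TABLE_BITS
# Table points m0 evenly spaced over the mantissa range [0.5, 1).
POINTS = [0.5 + i / (2 * TABLE_SIZE) for i in range(TABLE_SIZE)]
LOG_TABLE = [math.log(p) for p in POINTS]     # first table: ln(m0)
RECIP_TABLE = [1.0 / p for p in POINTS]       # second table: 1/m0


def table_log(x):
    """Natural log via table lookup plus a second-order Taylor correction."""
    if x <= 0:
        raise ValueError("logarithm requires a positive input")
    m, e = math.frexp(x)                      # x = m * 2**e, 0.5 <= m < 1
    i = int((m - 0.5) * 2 * TABLE_SIZE)       # index of the table point below m
    t = (m - POINTS[i]) * RECIP_TABLE[i]      # small offset, 0 <= t < 1/64
    # ln(m) = ln(m0) + ln(1 + t), with ln(1 + t) ~ t - t*t/2
    return e * math.log(2.0) + LOG_TABLE[i] + t - t * t / 2.0
```

With these assumed parameters the truncation error is bounded by roughly t³/3, i.e. on the order of 10⁻⁶. The final correction consists only of multiplications and additions, which is why such schemes pair naturally with a multiply-add unit.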